CMSC 12200 - Structs in C and bridging with Python using CTypes

This lab was cancelled due to the blizzard. We may come back to pieces of it later.

Goals

Today has two major lessons.
  1. We will acclimate ourselves with Structs and pointers to structs in C. We'll directly compare this to object oriented programming in Java and Python.
  2. We'll use CTypes to make fast functions written in C accessible within Python.

C Structs

We often want to bundle variables into a single entity. In Java we used classes to accomplish this. In C we'll use Structs.

Structs are exactly like classes except that they do not have methods built in. They are only a collection of variables. Sample syntax to define a rectangle struct is as follows

struct Rect
{
  double width;
  double height;
}; //do not forget the semi-colon!
This syntax is similar to defining classes in Java although much less verbose (yay!!!). Structs can become large and so general practice is not to pass them around as arguments to functions. Instead, we almost exclusively pass around pointers to structs. Because we're acting dynamically we'll need to use malloc.
struct Rect * myrect; //a pointer to a struct Rect 
myrect = malloc(sizeof(struct Rect)); //allocate enough bytes for one "struct Rect". Return a pointer of type "struct Rect *"
myrect->width = 5;  //use -> to access and set fields 
myrect->height = 3;
We defined a pointer to a struct Rect. Allocated enough space to store one. And then set the fields of the structure using the arrow "->" syntax. This process is identical to what happens in Java when we say
//JAVA CODE
Rect myrect; //Define a pointer
myrect = new Rect(); //Allocate space with "new" instead of "malloc"
myrect.width = 5;
myrect.height = 3;
It is often advisable to make a constructor function to do the allocating for us. An example for struct Rects would be
 
struct Rect * mkRect(double width, double height) //"make Rect" constructor
{
  struct Rect *myRect = malloc(sizeof(struct Rect));
  myRect->width = width;
  myRect->height = height;
  return myRect;
}
Finally, typing "struct Rect *" over and over again is becoming annoying. In C we use typedef to define and rename types to make things cleaner. The syntax is typedef <old stuff> <new word>. The constructor mkRect above can be handled more easily as follows.
typedef struct Rect * RectP; //define "RectP" to be "struct Rect *". RectP stands for Rect Pointer
RectP mkRect(double width, double height)
{
  RectP myRect = malloc(sizeof(struct Rect));
  myRect->width = width;
  myRect->height = height;
  return myRect;
}
Now, the type RectP is defined to be a pointer to a struct Rect. We could use typedef for other purposes too. We could define "D" to stand for "double" and I to stand for "int" if we desired.
typedef double D;
typedef int I;
//enables us to write
D d1,d2; //d1 and d2 are doubles 
d1 = 5.3; d2 = 4.4;
By looking at pointers to structs and by making constructors we've almost entirely reinvented Java objects using C structs. The only thing left is to reinvent methods. In Java objects like Rects were imbued with methods like Area(). In C we handle this by passing the object as the first argument
double Area(RectP myrect)
{
  return myrect->width * myrect->height;
}
This function does not "belong" to the rect as methods did in Java. It's just a normal function. With this solution to methods we've completely redefined Java objects for simple cases. This code is assembled in Rect.c.

Recreating Binary Search Trees

Two weeks ago in lab we built Binary Search Trees in Java. This week we're going to quickly rebuild them in C and then test them from Python. In rebuilding them many of the differences between C, Java, and Python will become apparent.

Check out, either through command line ("svn up") or though Eclipse ("File:Import"), the folder lab-cstructs. This folder contains the following:

Define and build a binary search tree struct in C. You should build at least the following:

All of these functions are written and well-commented in Python in tree.py and you wrote many of them in last week's lab in Java.

Connecting C and Python

In rewriting BST code in C hopefully you saw some substantial differences. Pointers and memory allocation are more explicit. Objects/Structs aren't imbued with methods. Some small syntax changes like '.' vs. '->', etc.... Hopefully you also found that all three languages can look and act very similarly. My versions of tree.py and tree.c are very similar to each other.

For me the major differences between C and Python is that C is significantly faster while Python is significantly more convenient to work with, especially with external code or libraries (i.e. plotting libraries like matplotlib/pylab in Python are very accessible while their counterparts in C are more challenging).

You can get the best of both worlds by linking fast C functions into convenient Python Code. There are a few ways to do this - today we'll use a Python module called CTypes.

CTypes allows us to pull compiled C functions from libraries and use them directly in Python. From the terminal inspect the folder ctypesexample in the lab-cstructs folder you downloaded. This contains a file, test.c, which contains the standard fibonacci function written in C. Compile this .c file into an executable and then run using:

 
gcc test.c -o test.exe
./test.exe
You should see that it computes and prints the value of fib(30).

Now compile test.c into a shared object file using

gcc -fPIC -shared test.c -o testLib.so
This creates a file, testLib.so which is not intended to be executed on it's own but instead can be used from other programs.

Look at and run test.py from within ipython. This file contains the syntax to load in the test.so library, extract the fib function. It also contains the same function written in the Python language. Confirm that both versions compute the same result. Run both functions on the value 30. Do you notice a lag in when you run fibPy?

Use the timeit utility in ipython to test the times of each function. timeit runs the function hundreds of times and computes the average time.

>>> timeit fibC(10)
>>> timeit fibPy(10)
Using the fibonacci test in test.py as an example perform this same process on your tree code. You'll need to compile tree.c from the commandline using the -fPIC and -shared flags. You'll need to load in the treeLib.so library and the functions you've created. Create two trees of 50000 nodes, one in Python and one in C. Using timeit compare the speed of your C functions to my Python functions.

At this point in the class it's not required that you know how to use CTypes. If you find this section challenging feel free to look at treeTest.py. It contains the code that I used to compare my Python and C implementations of binary search trees.