And, because stock trades happen extremely quickly these operations will need to be fast.ht = mkHashTable() addToHashTable(ht, 'MSFT', microsoftStockObject) addToHashTable(ht, 'GOOG', googleStockObject) msStock = getFromHashTable(ht, 'MSFT')
Concretely a Hash Table is an array of very short unsorted linked lists (like the image on the right) combined with a hash function.
def hash(string, maxVal): total = 0 prime = 31 # a prime number of my choice for letter in string: #go down the letters in the string total = total * prime + int(letter) #multiply the total by the prime, add the next letter value total = total % maxVal #clip the number at some maximum size return total
To Add an Element: Hash the key to obtain an integer. Use this integer as an index to the array. Add the element to the linked list stored at that index.
To Lookup a key: Hash the key to obtain an integer. Use this integer to find the appropriate linked list in the array. Search down this (short) linked list to find the element which has the correct key.
Sophisticated Hash Tables can resize themselves if they detect that they have too many or two few entries for their current size. We won't deal with this feature for this assignment.
In the ideal situation each "bucket" or entry in the Array would have one or two elements. In the worst situation one bucket has all the elements and the other buckets are null.
Make a new source file, hashTable.c (or whatever name you like). You'll build how hash tables work in this file. You may want to look at stock.c for an idea of how things are usually organized.
For your assignment you need to write a Hash Table to store stocks. You will use the stock's ticker symbol as the key. I.e. we should be able to ask the hash table to return the stock with ticker symbol (key) 'MSFT'.
You will need to write a very simple node struct to string stock_t's together.
You will need to write a simple-ish HashTable struct which is, as discussed above, simply an array of linked lists.
You will need to write a hash function from strings to numbers (an example is above but several others exist.) Make sure your function is relatively balanced between any range of numbers (we'll need to fit this range to the size of the array so that the indices match up). For the purposes of this assignment use a fixed-size hash table. Something in the tens of thousands is good for our purposes.
Then you'll need to write the logic to add and lookup stocks.
Write some very basic testing code in a separate file (see testorder.c or stockTest.c for examples). A basic test might be to make a few stocks by hand, throw them into the hash table and then make sure you can extract them. You can also test your hashTable code with a main() directly in the original .c file. The drawback is that when you try to use hashTable.c in some larger program there will be two main() functions which will confuse the computer.
As projects grow it's a good idea to separate work into smaller tasks. For large projects organization is probably the most important part of getting things done. Personally I (matt) hate organization, and as a result I've wasted a lot of my time tracking down very frustrating problems.
In your hw3 folder you'll notice that there are both .c and .h files. The .c files contain the logic of how things work. The .h files contain the information someone else needs in order to use your code. The definitions and functions contained in the .h file are like the public methods in Java. Other people shouldn't have to look through and understand all of your code, they should only have to read the .h file. It is more important to document the .h file than the .c file (although of course you should document both).
After you construct a working hash table decide what should go into the .h file. As an example the hash function probably should not but the ability to make and add to hash tables probably should.
There is some annoying syntax involved in starting a .h file. I almost always copy from an existing .h file. Be sure to get the #ifndef stuff at the beginning and the #endif stuff at the end.
We can organize all of the dependencies and potential executables in a Makefile and use the unix command make to compile and run them. Look at the Makefile in your hw3 directory. The following is two generic lines and then two example lines from the Makefile:
(Note: you must have a tab before the unix command on the second line. The need for tab was a bad design choice, but that is how it works.) When we type "make master" into the command line it checks that all of the dependencies "order.c, util.c, book.c, stock.c, venue.c and master.c" are available and then it runs the command on the second line,target: dependencyOne dependencyTwo unix command to run master: order.c util.c book.c stock.c venue.c master.c gcc order.c util.c book.c stock.c venue.c master.c -o master
gcc order.c util.c book.c stock.c venue.c master.c -o masterwhich compiles the code into an executable. So far this is just a complicated way of replacing
gcc order.c util.c book.c stock.c venue.c master.c -o masterwith
make master
So far this is convenient but not really worth all the trouble to set up. Make becomes important when the dependencies go beyond just source files. For example the following two lines also appear in the Makefile
t2: master ./master < data/t2.txt
When we type "make t2" it first makes master (because it is listed as a dependency) and then runs the command
./master < data/t2.txtAs projects grow in complexity handling this organization becomes important.
Trivial Task: Use make to perform a trivial unix command. For example change your Makefile so that typing "make list" executes the unix "ls" command. There should be no dependencies.
Substantial Task: Edit the Makefile to compile a hashTest program. This will be similar to testOrder and stockTest. Make this target and ensure that things work.