Networks and Distributed Computing CS-233/333

Homework 4

    Due Thursday, May 15th, at the beginning of class.

Reading Assignment

    Read Chapter 5 of Tanenbaum.

Programming Exercise

    The point of this assignment is to improve the web server you created last week.

    Implement any 3 of the following:

    1) Add to your HTTP server the capacity to browse directories. If the <Document Requested> in the request is a directory, your HTTP server should return an HTML document with hyperlinks to the contents of the directory. You should also be able to browse recursively into any subdirectories it contains. Check the man pages for opendir and readdir.
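
    As a rough illustration of option 1, the sketch below walks a directory with opendir and readdir and writes one hyperlink per entry. The function name write_dir_listing and its parameters are hypothetical, not part of last week's server; the sketch assumes the request path ends in '/' and does not escape special characters in file names.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: write an HTML index of dir_path to out.
 * url_path is the request path and is assumed to end in '/'.
 * Returns 0 on success, -1 if the directory cannot be opened. */
int write_dir_listing(FILE *out, const char *dir_path, const char *url_path)
{
    DIR *dir = opendir(dir_path);
    if (dir == NULL)
        return -1;

    fprintf(out, "<html><body><h1>Index of %s</h1>\n<ul>\n", url_path);

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0)
            continue;                       /* skip the self link */

        /* Mark subdirectories with a trailing '/' so that following the
         * link produces another directory request. */
        char full[4096];
        struct stat st;
        snprintf(full, sizeof full, "%s/%s", dir_path, entry->d_name);
        const char *suffix =
            (stat(full, &st) == 0 && S_ISDIR(st.st_mode)) ? "/" : "";

        fprintf(out, "<li><a href=\"%s%s%s\">%s%s</a></li>\n",
                url_path, entry->d_name, suffix, entry->d_name, suffix);
    }

    fprintf(out, "</ul></body></html>\n");
    closedir(dir);
    return 0;
}
```

    Because each subdirectory link is itself a directory request, recursive browsing falls out of handling every request the same way; no explicit recursion is needed in the server.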

    2) Add concurrency to the server by having the HTTP server fork a child process when a request arrives. The child process handles the request while the parent process waits for the next incoming request. You must also prevent the accumulation of zombie processes. You can base your implementation on the previous server example code.
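
    One common way to approach option 2 is sketched below: a SIGCHLD handler reaps exited children so they do not linger as zombies, and the accept loop forks once per connection. This is a sketch under assumptions, not the required design; handle_request stands in for whatever request-processing function your server already has, and the other names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* SIGCHLD handler: reap every child that has exited, without blocking. */
static void reap_children(int signo)
{
    (void)signo;
    int saved_errno = errno;            /* waitpid may change errno */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
    errno = saved_errno;
}

/* Install the handler. SA_RESTART lets an interrupted accept() resume.
 * Returns 0 on success, -1 on failure. */
int install_sigchld_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Accept loop: one child per connection.
 * handle_request is assumed to be your existing request code. */
void serve_forever(int listen_fd, void (*handle_request)(int))
{
    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);
        if (conn_fd < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal */
            break;                      /* unrecoverable accept error */
        }
        pid_t pid = fork();
        if (pid == 0) {                 /* child: serve, then exit */
            close(listen_fd);
            handle_request(conn_fd);
            close(conn_fd);
            _exit(0);
        }
        close(conn_fd);                 /* parent, or fork failure */
    }
}
```

    The parent closes its copy of the connected socket right after fork; otherwise the connection stays open after the child finishes and the parent slowly runs out of descriptors.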

    3) Implement cgi-bin. When a request like this one arrives:

    GET <sp> /cgi-bin/<script>?{<var>=<val>&}*{<var>=<val>}<sp> HTTP/1.0 <crlf>
    {<Other Header Information> <crlf>}*
    <crlf>

    the child process that is processing the request will call execv on the program in cgi-bin/<script>.

    There are two ways the variable-value pairs in {<var>=<val>&}*{<var>=<val>} are passed to the cgi-bin script: the GET method and the POST method. You will implement the GET method and for extra points you may implement the POST method.

    In the GET method, the string of variables {<var>=<val>&}*{<var>=<val>} is passed to the <script> program in the environment variable QUERY_STRING. It is up to the <script> program to decode this string. If the string of variables is present, you should also set the REQUEST_METHOD environment variable to "GET". The output of <script> is sent back to the client.

    For more information on how cgi-bin works, see The Common Gateway Interface.
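
    The GET case of option 3 might look roughly like the sketch below, run in the child process after fork. The names split_query and run_cgi are hypothetical. The sketch assumes the request path has already been extracted from the request line into a writable buffer, that the server's working directory contains cgi-bin/, and it leaves out validation of the script name (for example, rejecting ".." components).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Split "/cgi-bin/<script>?<query>" in place at the first '?'.
 * Returns a pointer to the query string, or NULL if there is none. */
char *split_query(char *target)
{
    char *q = strchr(target, '?');
    if (q == NULL)
        return NULL;
    *q = '\0';
    return q + 1;
}

/* Hypothetical child-side helper. client_fd is the connected socket and
 * target is the request path (modified in place).
 * Does not return: either the script replaces this process or we exit. */
void run_cgi(int client_fd, char *target)
{
    char *query = split_query(target);
    if (query != NULL) {
        setenv("QUERY_STRING", query, 1);
        setenv("REQUEST_METHOD", "GET", 1);
    }

    /* Anything the script prints goes straight to the client. */
    dup2(client_fd, STDOUT_FILENO);

    /* Skip the leading '/' to get the relative path "cgi-bin/<script>". */
    char *argv[] = { target + 1, NULL };
    execv(target + 1, argv);

    perror("execv");                    /* reached only if execv failed */
    _exit(1);
}
```

    Whether the server writes the status line before calling execv, or leaves all output to the script, is a design choice this sketch does not make.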

    4) Implement a server log. Each server access should be logged to a file. The logged data should include the source IP address, the authenticated user name (if required for that connection by a .access file), the request (path), and the date and time. Use the Common Log Format (CLF). For extra credit, log to syslog instead.
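
    For option 4, a Common Log Format line has the shape: host ident authuser [date] "request" status bytes. The sketch below formats one such line; format_clf and its parameters are hypothetical names, the ident field is always written as "-", and the month name assumes the default C locale.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format one CLF line into buf.
 * user may be NULL or empty when the request was not authenticated.
 * Returns the line length as reported by snprintf, or -1 on failure. */
int format_clf(char *buf, size_t size, const char *ip, const char *user,
               const struct tm *when, const char *request_line,
               int status, long bytes)
{
    char stamp[64];

    /* CLF timestamp, e.g. 15/May/2003:13:55:36 -0700 */
    if (strftime(stamp, sizeof stamp, "%d/%b/%Y:%H:%M:%S %z", when) == 0)
        return -1;

    const char *who = (user != NULL && strlen(user) > 0) ? user : "-";

    return snprintf(buf, size, "%s - %s [%s] \"%s\" %d %ld\n",
                    ip, who, stamp, request_line, status, bytes);
}
```

    A caller would typically pass localtime(&now) for the timestamp and append the resulting line to the log file with fputs followed by fflush, so that entries from concurrent children are less likely to interleave.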

    5) Implement security. If a file named ".access" is contained in a web server directory, then prompt the web client for a password using the standard browser authentication. Verify this password against either a) the UNIX password file or b) a password file of your own. Feel free to use the Apache tools or others, such as passwd and passwdd, to create and maintain this password file. After you have verified that the user's password is correct for the account she claims, verify that this account is listed in the ".access" file in that directory of the web server. For extra credit, implement groups as well. Note that you should use the built-in browser security, which means you don't need to produce your own browser dialog.
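
    The sketch below covers one piece of option 5, assuming that "standard browser authentication" means HTTP Basic authentication. In that scheme the server answers an unauthenticated request with status 401 and a WWW-Authenticate: Basic realm="..." header, and the browser retries with an Authorization header carrying base64("user:password"). The function names here are hypothetical, and checking the password and the ".access" file is left to your own code.

```c
#include <string.h>

/* Map one base64 character to its 6-bit value, or -1 if invalid. */
static int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode base64 text into out (NUL-terminated), stopping at '=' padding.
 * Returns the decoded length, or -1 on bad input or a full buffer. */
static int b64_decode(const char *in, char *out, size_t out_size)
{
    unsigned int acc = 0;
    int bits = 0;
    size_t n = 0;

    if (out_size == 0)
        return -1;
    for (; *in != '\0' && *in != '='; in++) {
        int v = b64_value(*in);
        if (v < 0)
            return -1;
        acc = (acc << 6) | (unsigned int)v;
        bits += 6;
        if (bits >= 8) {                /* a full byte is available */
            bits -= 8;
            if (n + 1 >= out_size)
                return -1;
            out[n++] = (char)((acc >> bits) & 0xFF);
        }
    }
    out[n] = '\0';
    return (int)n;
}

/* header_value is the text after "Authorization: " with the trailing
 * CRLF removed, e.g. "Basic dXNlcjpwYXNz".
 * On success copies user and password and returns 0; otherwise -1. */
int parse_basic_auth(const char *header_value,
                     char *user, size_t user_size,
                     char *pass, size_t pass_size)
{
    char decoded[256];

    if (strncmp(header_value, "Basic ", 6) != 0)
        return -1;
    if (b64_decode(header_value + 6, decoded, sizeof decoded) < 0)
        return -1;

    char *colon = strchr(decoded, ':'); /* user ends at the first ':' */
    if (colon == NULL)
        return -1;
    *colon = '\0';

    if (strlen(decoded) >= user_size || strlen(colon + 1) >= pass_size)
        return -1;
    strcpy(user, decoded);
    strcpy(pass, colon + 1);
    return 0;
}
```

    Base64 is an encoding, not encryption, so the password crosses the network in recoverable form; that is a property of the Basic scheme itself rather than of this sketch.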

    6) Reduce the concurrency latency by either pre-forking a pool of servers and handing requests off to the pool, or using a multi-threading library to create a multi-threaded server that can handle multiple simultaneous requests. The server invocation command line should take a parameter giving the number of parallel threads or processes to support. (Unless you already have experience with the Linux thread support, you probably want to pick the pre-forked pool of servers option. Doing I/O in a thread library can be difficult.)
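
    A minimal sketch of the pre-forked variant of option 6 follows. It assumes every worker simply blocks in accept on the shared listening socket, which is one way (not the only way) to hand requests to the pool. All function names are hypothetical, handle_request again stands in for your existing request code, and the upper bound of 1024 workers is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the worker count from argv[1].
 * Returns fallback if the argument is missing or not in 1..1024. */
int parse_worker_count(int argc, char **argv, int fallback)
{
    if (argc < 2)
        return fallback;
    char *end;
    long n = strtol(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0' || n < 1 || n > 1024)
        return fallback;
    return (int)n;
}

/* Each worker blocks in accept() on the shared listening socket;
 * the kernel hands each new connection to one waiting worker. */
static void worker_loop(int listen_fd, void (*handle_request)(int))
{
    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);
        if (conn_fd < 0) {
            if (errno == EINTR)
                continue;
            return;                     /* give up on other errors */
        }
        handle_request(conn_fd);
        close(conn_fd);
    }
}

/* Fork the workers up front, then wait until all of them have exited.
 * Returns the number of workers that were started. */
int run_prefork_pool(int listen_fd, int workers,
                     void (*handle_request)(int))
{
    int started = 0;
    for (int i = 0; i < workers; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            worker_loop(listen_fd, handle_request);
            _exit(0);
        }
        if (pid > 0)
            started++;
    }
    while (wait(NULL) > 0)
        ;
    return started;
}
```

    Since the workers already exist when a request arrives, the cost of fork is paid once at startup instead of once per connection, which is where the latency reduction comes from.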

Deliverables

    Please turn in your source and makefile to the grader for this week.