Last modified: Mon Feb 26 11:42:39 CST 2001
Unlike the dispatching loop code, the Nachos code relating to exception handling exhibits a rather straightforward flow of control. The difficult part of the exception handling code is the way in which information is passed between user programs and the OS. To understand and program that information passing, you need to correlate declarations and definitions from several different parts of the code.
When user programs are running in their own address spaces on the simulated MIPS machine, passing control to the operating system is a bit tricky. In Nachos, control must pass from the simulated code to native UNIX code for the Nachos OS. With a real OS on a real machine, the same sort of structure is imposed because control must pass from a protected user address space running code in user mode to the global OS address space running code in privileged kernel mode. OS code is generally not even addressable by users, so they cannot execute normal function calls to OS functions.
So, in order to pass control to the OS, we use exceptions, which are essentially interrupts generated by the CPU itself. There are a number of different sorts of exceptions, mostly representing different sorts of execution errors. One special exception represents a system call. When an exception occurs, the instruction-interpretation hardware in the (simulated) CPU passes control to a location that is either predetermined, or specified by the OS, but in any case not accessible to user programs. In principle, there could be a different block of code for each sort of exception, but most systems, including Nachos, have a single block of code, which is responsible for discovering what sort of exception has occurred.
Look first at machine.h. There is a definition of an enumerated type, called ExceptionType, which provides a name for each sort of exception. Later, there is a declaration of a function ExceptionHandler, taking a single argument which is the exception type. The (simulated) hardware is responsible for calling ExceptionHandler whenever an exception occurs, passing the name of the particular sort of exception. The OS is responsible for defining ExceptionHandler to deal with each exception in an appropriate way. It is probably not realistic that the exception is identified by an argument to ExceptionHandler: a real system would probably have to find an identifier in some predetermined register.
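The division of labor described above can be sketched in a few lines. This is a minimal, self-contained illustration, not the Nachos source. The enumerator list is an assumption modeled on machine.h and should be checked against your copy. RaiseExceptionSketch and lastException are hypothetical names standing in for the simulated hardware.

```cpp
#include <cstdio>

// Names for each sort of exception, modeled on ExceptionType in machine.h.
// ASSUMPTION: the exact enumerators and their order may differ in your copy.
enum ExceptionType {
    NoException,
    SyscallException,
    PageFaultException,
    ReadOnlyException,
    BusErrorException,
    AddressErrorException,
    OverflowException,
    IllegalInstrException,
    NumExceptionTypes
};

// Declared on the "hardware" side, defined by the OS.
void ExceptionHandler(ExceptionType which);

// HYPOTHETICAL: records the last exception so the sketch can be checked.
static ExceptionType lastException = NoException;

// HYPOTHETICAL stand-in for the simulated CPU: when an instruction traps,
// control goes to the single handler, with the exception name as argument.
void RaiseExceptionSketch(ExceptionType which) {
    ExceptionHandler(which);
}

// The OS side: one block of code that discovers what sort of exception occurred.
void ExceptionHandler(ExceptionType which) {
    lastException = which;
    if (which == SyscallException) {
        std::printf("system call requested\n");
    } else {
        std::printf("execution error, exception %d\n", static_cast<int>(which));
    }
}
```

The point of the sketch is only the direction of the call: the hardware side knows the name ExceptionHandler but not its body, and the OS side supplies the body.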
Next, look at syscall.h. Eleven constants are defined to give somewhat mnemonic names (SC_Halt, SC_Exit, etc.) to the codes for the 11 system calls recognized by Nachos. As in a real OS, when a system call exception occurs, the user program is responsible for storing the identifying code for the desired system call in a particular register (register 2 in Nachos). Since the set of system calls and their codes is part of the OS, in principle you are allowed to change it, but I don't recommend doing so. The remainder of syscall.h declares C++ functions that may be called by user programs to produce system call exceptions. These functions must be implemented in machine-dependent assembly code, which is already provided in start.s. The user-level system call code works, so fortunately we don't need to study it.
Now, look at exception.cc, to see the initial definition of ExceptionHandler. It implements only the system call Halt, by calling interrupt->Halt(). Notice that the Halt function in the Interrupt class is a completely different thing from the Halt function that is called by a user program, and it is exception.cc that connects them. It seems rather unrealistic to allow user programs to halt the whole OS, but that shouldn't inhibit our other work, so we'll leave it as is. Perhaps a later improvement of Nachos would refine the Halt system call so that it would only halt the OS when called by a superuser. Notice that the fact of being a superuser is not the same as executing in privileged mode, so the OS would be responsible for checking superuser status through some sort of login identification.
The only other work that the initial Nachos definition of ExceptionHandler accomplishes for you is to fetch the identifying number for the system call from register 2, by machine->ReadRegister(2). ReadRegister and WriteRegister are declared for you in machine.h and defined in machine.cc. You will need them to bring other parameters from user code into the OS kernel code.
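As a rough sketch of how parameters might be brought into the kernel, the fragment below reads a system call code and up to four arguments from registers, and writes a result back. MachineSketch, SyscallRequest, FetchSyscall, and ReturnResult are hypothetical names. The use of registers 4 through 7 for arguments and register 2 for the result is an assumption based on the usual MIPS calling convention; confirm it against the comments in exception.cc before relying on it.

```cpp
#include <cassert>

const int NumTotalRegs = 40;  // ASSUMPTION: size of the register file in machine.h

// HYPOTHETICAL stand-in for the register interface of the Machine class.
class MachineSketch {
  public:
    MachineSketch() { for (int i = 0; i < NumTotalRegs; i++) regs[i] = 0; }
    int ReadRegister(int n) const {
        assert(n >= 0 && n < NumTotalRegs);
        return regs[n];
    }
    void WriteRegister(int n, int value) {
        assert(n >= 0 && n < NumTotalRegs);
        regs[n] = value;
    }
  private:
    int regs[NumTotalRegs];
};

// HYPOTHETICAL: everything the kernel needs to know about one system call.
struct SyscallRequest {
    int code;     // which system call (register 2)
    int args[4];  // ASSUMPTION: arguments arrive in registers 4..7
};

// Copy the system call code and its arguments out of the user registers.
SyscallRequest FetchSyscall(const MachineSketch &m) {
    SyscallRequest req;
    req.code = m.ReadRegister(2);
    for (int i = 0; i < 4; i++) {
        req.args[i] = m.ReadRegister(4 + i);
    }
    return req;
}

// ASSUMPTION: the result of a system call goes back to the user in register 2.
void ReturnResult(MachineSketch &m, int result) {
    m.WriteRegister(2, result);
}
```

Note that an argument such as a buffer pointer is a user-space address, so the value read from the register cannot be dereferenced directly by kernel code.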
The first thing for you to do is to replace the initial definition of ExceptionHandler with a pair of nested switch statements, providing a slot for the code for each different sort of exception and system call. At first, merely implement these so that they give a DEBUG output, and then either Halt, Finish, or go on with the user program, as appropriate for each case. This first step is not quite trivial, because before returning control to the user program you must increment the PC; otherwise the user program re-executes the same system call instruction forever. The PC is just a register, accessible to you through ReadRegister and WriteRegister. See the definitions of ``User program CPU state'' in machine.h for the register number of the PC. Each MIPS memory address, including the PC, points to a byte of memory, and each instruction is several bytes long, so to ``increment'' the PC does not mean to add 1. Look in mipssim.cc to find the right number to add to the PC. Also, the use of delayed branching in the MIPS requires a separate NextPC register, and for debugging purposes Nachos maintains a PrevPC, both of which must be updated appropriately. See the Nachos Road Map section on experience with multiprogramming for the correct code to increment the PC.
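The nested-switch skeleton and the PC update can be sketched as follows. This is an illustration under stated assumptions, not the code from the Road Map: the register numbers, the system call codes, and the 4-byte instruction size are assumptions to be checked against machine.h, syscall.h, and mipssim.cc. MachineSketch, Action, AdvancePC, and HandleExceptionSketch are hypothetical names. The real ExceptionHandler returns void and acts on the global machine; here the handler returns its decision so the sketch can be checked in isolation, and printf stands in for DEBUG.

```cpp
#include <cassert>
#include <cstdio>

// ASSUMPTION: a subset of the exception names from machine.h.
enum ExceptionType { NoException, SyscallException, PageFaultException,
                     AddressErrorException, IllegalInstrException };

// ASSUMPTION: system call codes as in syscall.h; verify before relying on them.
const int SC_Halt = 0, SC_Exit = 1, SC_Yield = 10;

// ASSUMPTION: register numbers from "User program CPU state" in machine.h.
const int PCReg = 34, NextPCReg = 35, PrevPCReg = 36, NumTotalRegs = 40;
const int InstructionBytes = 4;  // ASSUMPTION: confirm in mipssim.cc

// HYPOTHETICAL stand-in for the simulated machine's register file.
class MachineSketch {
  public:
    MachineSketch() { for (int i = 0; i < NumTotalRegs; i++) regs[i] = 0; }
    int ReadRegister(int n) const { assert(n >= 0 && n < NumTotalRegs); return regs[n]; }
    void WriteRegister(int n, int v) { assert(n >= 0 && n < NumTotalRegs); regs[n] = v; }
  private:
    int regs[NumTotalRegs];
};

// Step past the trapping instruction, keeping all three PC registers consistent.
void AdvancePC(MachineSketch &m) {
    m.WriteRegister(PrevPCReg, m.ReadRegister(PCReg));
    m.WriteRegister(PCReg, m.ReadRegister(NextPCReg));
    m.WriteRegister(NextPCReg, m.ReadRegister(NextPCReg) + InstructionBytes);
}

// HYPOTHETICAL: what the handler decides to do after logging the event.
enum Action { HaltMachine, FinishThread, ResumeUser };

Action HandleExceptionSketch(MachineSketch &m, ExceptionType which) {
    Action action = FinishThread;
    switch (which) {
      case SyscallException:
        switch (m.ReadRegister(2)) {  // system call code
          case SC_Halt:
            std::printf("Halt requested\n");
            action = HaltMachine;
            break;
          case SC_Exit:
            std::printf("Exit requested\n");
            action = FinishThread;
            break;
          default:                    // not yet implemented: log and go on
            std::printf("system call %d not implemented\n", m.ReadRegister(2));
            AdvancePC(m);             // without this, the call repeats forever
            action = ResumeUser;
            break;
        }
        break;
      default:                        // any execution error
        std::printf("exception %d, finishing thread\n", static_cast<int>(which));
        action = FinishThread;
        break;
    }
    return action;
}
```

The PC is advanced only on the paths that return to the user program; a thread that is about to be finished, or a machine that is about to halt, never fetches another user instruction.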
Now, go through the system calls one at a time, and provide sensible implementations. I chose Exit first, then Write. Also, protect the OS as much as possible from errors in user programs. For the most part, this just means Finishing a user thread when it causes an exception other than a system call. But there are also some loopholes in the way that Nachos starts up user programs: I noticed two of them in addrspace.cc, both of which are marked by ASSERT commands but need conditional code to abort the user program as well. There are almost certainly more such loopholes. Fix as many as you can.
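One way to turn an ASSERT into conditional code is to have the loading step report a status that the caller can act on, for instance by Finishing only the offending thread. The sketch below is hypothetical throughout: the text above does not say which two conditions are involved, so the two checks here (an unrecognized file format and a program too large for physical memory) are assumptions chosen for illustration, as are all names and constant values.

```cpp
// HYPOTHETICAL constants; substitute the values your Nachos build uses.
const unsigned int ExpectedMagic = 0xbadfad;
const unsigned int PageSizeBytes = 128;
const unsigned int PhysPages = 32;

// HYPOTHETICAL: the parts of an executable header that the checks need.
struct ProgramHeader {
    unsigned int magic;      // identifies the file format
    unsigned int sizeBytes;  // total memory the program asks for
};

enum LoadStatus { LoadOk, LoadBadFormat, LoadTooLarge };

// Validate a user program before building its address space. A bad program
// yields an error status instead of an assertion failure in the OS.
LoadStatus CheckProgram(const ProgramHeader &h) {
    if (h.magic != ExpectedMagic) {
        return LoadBadFormat;
    }
    // Round up to whole pages without risking arithmetic overflow.
    unsigned int pages = h.sizeBytes / PageSizeBytes;
    if (h.sizeBytes % PageSizeBytes != 0) {
        pages++;
    }
    if (pages > PhysPages) {
        return LoadTooLarge;
    }
    return LoadOk;
}
```

A caller would test the status and abort just the user program on anything other than LoadOk, leaving the rest of the OS running.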