Com Sci 230
Operating Systems

Department of Computer Science

The University of Chicago

NACHOS Source Code

Last modified: Mon Feb 26 11:42:52 CST 2001

Guide to reading the NACHOS source

Running Programs in User Address Spaces

Overview of StartProcess

Look at StartProcess in progtest.cc. The key sequence of commands in StartProcess is

space = new AddrSpace(executable);
currentThread->space = space;

space->InitRegisters();
space->RestoreState();

machine->Run();

The first two commands create a new address space loaded with the program in the file executable, and associate that address space with currentThread.

The next two commands above initialize the simulated machine registers and the pointer to the page table or TLB. It seems a bit peculiar at first that the registers are initialized directly in the simulated machine by InitRegisters(), whereas the address translation table is initialized in the data structure representing the address space when the AddrSpace is created, and is then loaded into the simulated machine by RestoreState() as if it had been saved from an earlier execution of currentThread. The code would look more symmetric if it called currentThread->RestoreUserState() instead of space->InitRegisters(), and counted on some earlier initialization step to associate the right register values with currentThread.

The asymmetric organization is, in fact, the right one. currentThread existed before the creation of the new address space, so arranging for currentThread to run the program in executable in the new space is really a change in the nature of the already executing currentThread, rather than a part of the initialization of the address space itself. In all other cases in the current Nachos code, we see RestoreUserState and RestoreState side by side, because a change of address space always happens along with a change of thread. The creation of a new address space is the only place where we change address space without changing thread. An OS with more general facilities for controlling the interaction of threads and address spaces, however, might use these functions more independently.

The last command in StartProcess is machine->Run(), which passes control to the MIPS simulation, with the simulated MIPS machine in the state that was created by InitRegisters and RestoreState. Run never returns: it hands control temporarily back to the Nachos kernel when interrupts and exceptions are raised, and it terminates for good when the thread that is executing it Finishes. From its name, you might take Run to be a function that should be called exactly once to execute the MIPS simulation. However, because the simulated MIPS CPU is scheduled by native C++/UNIX code in the Nachos kernel, it is in fact appropriate for the kernel to call Run every time it wants to pass control back to the MIPS simulation. In a real system, the counterpart of the call to Run is just a return from privileged kernel mode to unprivileged user mode, followed by a branch to whatever user code the OS has decided to run.

At first, it seems that StartProcess will continue to be suitable for starting up user programs (e.g., in the implementation of the Exec system call), through all of the innovations in our projects. It appears that all of the changes will fit in the functions called by StartProcess. However, to invoke the appropriate address translation while the user executable is being stored into the user address space, it is probably better to reorganize the code so that some of the work done by AddrSpace::AddrSpace in the initial Nachos code is lifted up to StartProcess.

Initializing the address space in AddrSpace::AddrSpace

Now, look inside the definition of AddrSpace::AddrSpace in addrspace.cc. The first portion of code does some necessary manipulation of the noff headers required on executable files in the MIPS simulation. Fortunately, we don't need to change that rather opaque code. The next portion computes the size of the address space in a rather obvious way. We probably will not need to change the size calculation, but notice that the UserStackSize is a constant defined in addrspace.h. You may need to increase UserStackSize depending on the complexity of your test programs.
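As a rough illustration of the size calculation, the following sketch adds the three segment sizes from the noff header to UserStackSize and rounds up to a whole number of pages. The struct SegmentSizes and the functions DivRoundUp and PagesNeeded are names invented for this sketch, and the constants 128 and 1024 are assumed to match the stock machine.h and addrspace.h, so check them against your own copy.

```cpp
#include <cassert>

// Assumed values; verify against machine.h and addrspace.h in your tree.
const unsigned int PageSize = 128;        // bytes per simulated page
const unsigned int UserStackSize = 1024;  // bytes reserved for the user stack

// Hypothetical, trimmed-down view of the three segment sizes in a noff header.
struct SegmentSizes {
    unsigned int code;
    unsigned int initData;
    unsigned int uninitData;
};

// Number of units of size s needed to hold n bytes (rounds up).
unsigned int DivRoundUp(unsigned int n, unsigned int s) {
    assert(s > 0);
    return (n + s - 1) / s;
}

// Pages needed for code, initialized data, uninitialized data, and stack.
unsigned int PagesNeeded(const SegmentSizes &seg) {
    unsigned int size = seg.code + seg.initData + seg.uninitData + UserStackSize;
    return DivRoundUp(size, PageSize);
}
```

With these assumed constants, a program with 256 bytes of code and 32 bytes of initialized data needs 11 pages, 8 of which are stack. Doubling UserStackSize would add 8 more pages per address space, which matters once several programs share physical memory.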

The next portion of code initializes pageTable. This is what you have to change to implement the Exec syscall, since the initial code only provides one address space, and you must allocate a new address space for each user program. Use the bitmap functions described in bitmap.h: they are nicely designed for the needs of paged memory allocation. In Project #3 you will change this portion again to provide an initial TLB.
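Here is one possible shape for the new page table initialization, written with stand-in types so that it compiles on its own. FrameMap is a hypothetical miniature of the BitMap class (only Find, Clear, and NumClear are modelled), PageEntry is a cut-down TranslationEntry, and BuildPageTable is an invented name. In the kernel you would presumably keep a single bitmap with one bit per physical page frame.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the Nachos BitMap class; not the real implementation.
class FrameMap {
public:
    explicit FrameMap(int n) : used(n, false) {}

    // Return the index of a free frame and mark it used, or -1 if none is free.
    int Find() {
        for (std::size_t i = 0; i < used.size(); i++) {
            if (!used[i]) {
                used[i] = true;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Mark frame i as free again.
    void Clear(int i) { used[i] = false; }

    // Count the frames that are currently free.
    int NumClear() const {
        int n = 0;
        for (bool b : used) {
            if (!b) n++;
        }
        return n;
    }

private:
    std::vector<bool> used;
};

// Cut-down page table entry: just the mapping and a valid bit.
struct PageEntry {
    int virtualPage;
    int physicalPage;
    bool valid;
};

// Fill table with numPages entries backed by free frames.
// Returns false, leaving the frame map untouched, if memory is too full.
bool BuildPageTable(FrameMap &frames, int numPages, std::vector<PageEntry> &table) {
    if (frames.NumClear() < numPages) return false;
    table.clear();
    for (int vpn = 0; vpn < numPages; vpn++) {
        int frame = frames.Find();
        assert(frame >= 0);  // guaranteed by the NumClear check above
        table.push_back({vpn, frame, true});
    }
    return true;
}
```

Checking NumClear before allocating anything means a failed Exec never leaves half-allocated frames behind. Note that once frames have been freed and reused, the frames given to one address space need not be contiguous, which is exactly why the identity map no longer works.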

Finally, AddrSpace::AddrSpace initializes the simulated MIPS memory to contain the contents of the given executable file as its code, and 0s elsewhere. This code uses virtual addresses as if they were physical addresses to index directly into mainMemory. That trick only works because the only address space created by the initial Nachos code translates addresses by the identity map. You must change this code to apply address translation. Unfortunately, at this point in the code, although the page table in the representation of the address space (space->pageTable) has been initialized, it has not been given to the simulated MIPS machine (machine->pageTable has not been set), so ReadMem and WriteMem will not provide address translation for the newly created address space. You could set machine->pageTable in AddrSpace::AddrSpace, probably using a call to RestoreState, and then use WriteMem to store the code read from executable, but I think that it is better not to do anything to the machine simulation context as a part of initializing an address space.
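To see why indexing mainMemory with a virtual address breaks once the map is not the identity, consider this small model of the translation step. It is only a sketch: Translate and LoadSegment are invented names, the page table is reduced to a vector giving the frame for each virtual page, and PageSize is assumed to be 128. In Nachos itself, WriteMem performs the equivalent translation once machine->pageTable has been set.

```cpp
#include <cassert>
#include <vector>

const int PageSize = 128;  // assumed to match machine.h

// Map a virtual address to a physical address.
// frameOf[v] is the physical frame that holds virtual page v.
int Translate(const std::vector<int> &frameOf, int vaddr) {
    int vpn = vaddr / PageSize;
    int offset = vaddr % PageSize;
    assert(vpn >= 0 && vpn < static_cast<int>(frameOf.size()));
    return frameOf[vpn] * PageSize + offset;
}

// Copy len bytes of a program image to virtual address vaddr,
// translating every byte so that page boundaries are handled correctly.
void LoadSegment(std::vector<char> &memory, const std::vector<int> &frameOf,
                 const char *image, int len, int vaddr) {
    for (int i = 0; i < len; i++) {
        memory[Translate(frameOf, vaddr + i)] = image[i];
    }
}
```

If virtual pages 0 and 1 live in frames 3 and 1, three bytes written at virtual address 126 land at physical addresses 510, 511, and 128. A single block copy into mainMemory starting at 126 would have put them in the wrong frame and run across a boundary that is not contiguous in physical memory.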

I suggest that AddrSpace::AddrSpace already contains too much code. The allocation of an address space in the kernel should deal only with the kernel-level description of the address space, and not with its contents. I recommend that you remove all of the file reading and initialization of simulated MIPS memory entirely from this function, and do it instead as part of StartProcess. Instead of executable, let the parameter to AddrSpace::AddrSpace be the size of the address space. Make StartProcess responsible for reading in the noff header from the executable file, calculating the size, and calling AddrSpace::AddrSpace. StartProcess can load the code from the executable file into simulated MIPS memory after switching to the new address space with RestoreState, using WriteMem to write with the appropriate address translation. Perhaps the ideal implementation would start every user program off with the same code: a program to read in more code from the chosen executable and then branch to it, but with our tight time constraints I don't recommend that you go that far.

Wherever your code for storing the user program ends up, that is where you must also add code to store parameters to the Exec system call, unless you choose to implement a separate system call to fetch the parameters from kernel address space.

Initializing simulated MIPS registers

Next, look inside the definition of InitRegisters in addrspace.cc. I am not convinced that InitRegisters should be a member of the AddrSpace class, rather than an independent function at the level of StartProcess. The only reference to the controlling instance of AddrSpace in InitRegisters is the use of numPages in calculating the value to store in StackReg. Since InitRegisters is called only from StartProcess, and I don't anticipate any other need for it, it would also make sense to move its code directly into StartProcess. If you choose to store Exec system call parameters immediately into the user address space, this is where you will set register 4 to point to those parameters. The setting of StackReg here also gives you a hint about how to allocate a new stack for a new thread within an existing address space, if you implement the Fork system call.
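The StackReg hint can be made concrete with a little arithmetic. The stock InitRegisters sets the stack pointer to the end of the address space minus a 16-byte margin. The second function below is purely my assumption about one way to place additional stacks for Fork: it supposes that the address space was sized with room for several stacks at the top, one UserStackSize apart. The names InitialStackTop and StackTopForThread are invented for this sketch.

```cpp
#include <cassert>

const int PageSize = 128;        // assumed to match machine.h
const int UserStackSize = 1024;  // assumed to match addrspace.h

// Stack pointer for the first thread: the end of the address space,
// less a small margin so that accesses just above the stack pointer stay in bounds.
int InitialStackTop(int numPages) {
    return numPages * PageSize - 16;
}

// Hypothetical layout for Fork: thread k (counting from 0) gets the k-th
// stack region below the top of the address space.
int StackTopForThread(int numPages, int k) {
    assert(k >= 0);
    assert((k + 1) * UserStackSize <= numPages * PageSize);  // region must fit
    return InitialStackTop(numPages) - k * UserStackSize;
}
```

Under this layout, a 16-page address space starts its first thread with the stack pointer at 2032 and a second thread at 1008. Nothing here protects one stack from overflowing into another, so you would also need to grow the size calculation to reserve the extra regions.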

Deallocating address spaces

Since the initial Nachos code allocates only one address space, it needn't worry about deallocating. To support the reuse of physical memory by a sequence of user programs, you must provide sensible deallocation code in AddrSpace::~AddrSpace. Deallocation must include C++-level deallocation of the data structures describing the address space, and also deallocation of the simulated MIPS physical memory in the mainMemory array. Implement the MIPS-level deallocation by marking deallocated page frames as free in the page-allocation bitmap. In order to know when to deallocate an address space, you need to keep track of the threads that use a given address space, and notice when they all have Finished.
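The bookkeeping for deallocation might look roughly like the following sketch, which counts the threads attached to an address space and releases its frames when the last one detaches. SpaceRecord, Attach, and Detach are names I have made up, and a std::vector<bool> stands in for the page-allocation bitmap; in your kernel the release step would live in AddrSpace::~AddrSpace.

```cpp
#include <cassert>
#include <vector>

// Hypothetical kernel-side record of one address space and its users.
class SpaceRecord {
public:
    // frameMap is the shared page-allocation map (true means in use);
    // owned lists the frames that belong to this address space.
    SpaceRecord(std::vector<bool> &frameMap, const std::vector<int> &owned)
        : frameInUse(frameMap), frames(owned), users(0) {
        for (int f : frames) frameInUse[f] = true;
    }

    // Call when a thread starts running in this address space.
    void Attach() { users++; }

    // Call when a thread in this address space Finishes.
    // Returns true if that was the last user and the frames were released.
    bool Detach() {
        assert(users > 0);
        users--;
        if (users > 0) return false;
        for (int f : frames) frameInUse[f] = false;  // mark frames free
        frames.clear();
        return true;
    }

private:
    std::vector<bool> &frameInUse;
    std::vector<int> frames;
    int users;
};
```

With two threads attached, the first Detach leaves the frames marked in use and only the second one frees them. Releasing frames does not need to zero them here, provided that whoever allocates them next initializes the memory before use.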

Quick index to relevant code

progtest.cc
- StartProcess
addrspace.cc
- AddrSpace::AddrSpace
- InitRegisters
- AddrSpace::~AddrSpace
addrspace.h
- UserStackSize
bitmap.h
- utilities for managing page-allocation bitmap
