BSD Memories
There was a time when the concept of computer science seemed a contradiction in terms and memory was firmly anchored inside human crania. Today, memory is one of the terms we surrender quite happily to our silicon avatars, even though life, intelligence and creativity are traits we're less happy to sacrifice. Yet, whenever a computer, database or network is asked to recall a particular file and present it to us, we are astonished that recall is less than instant.
The path which bits and bytes travel from memory circuits to the screen is complex. It varies from operating system to operating system, with huge differences between the various UNICes, Linux and the BSDs.
Indeed, among BSD flavours, the third generation of memory management systems was introduced four to five years ago and there is no area where the differences between the personas of the BSD trinity have been so pronounced.
The traces of the first generation of UNIX memory management have been all but wiped out in modern xBSD. Starting in 1985, various versions of Mach memory management were incorporated in BSD UNIX. The process had been completed by 1991.
4.4 BSD virtual memory management replaced the rather ugly VAX-based memory management routines that were becoming ever more irrelevant for BSD UNIX. The issue for BSD UNIX had been the lack of multiple processor support: different data structures were needed to protect critical sections and to make concurrent processing safe.
BSD virtual memory management relies on the vmspace data structure. The vm_map structure deals with the virtual address space, but only as far as it can be operated on in a machine-independent fashion. The vm_pmap structure makes its data available to the machine-dependent memory management routines.
Each vm_map entry covers a contiguous range of virtual memory and points towards another virtual memory data structure, the memory object.
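The relationship between these structures can be sketched in C. This is a minimal illustration under stated assumptions, not the 4.4BSD source: every struct, field and function name below (addr_space, addr_map, hw_map, map_entry, mem_object, map_lookup) is hypothetical and heavily simplified, and the machine-dependent side is reduced to a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the "memory object" that backs a mapping. */
struct mem_object {
    int ref_count;
};

/* One map entry: a contiguous range of virtual addresses. */
struct map_entry {
    uintptr_t start;            /* first address covered (inclusive) */
    uintptr_t end;              /* first address not covered (exclusive) */
    struct mem_object *object;  /* where the data for this range comes from */
    struct map_entry *next;     /* next entry in the list */
};

/* Machine-independent view of the address space. */
struct addr_map {
    struct map_entry *head;
};

/* Machine-dependent side (page tables and so on), reduced to a stub. */
struct hw_map {
    int placeholder;
};

/* Per-process container tying the two views together. */
struct addr_space {
    struct addr_map map;
    struct hw_map pmap;
};

/* Return the entry covering addr, or NULL if the address is unmapped. */
struct map_entry *map_lookup(const struct addr_map *map, uintptr_t addr)
{
    assert(map != NULL);
    for (struct map_entry *e = map->head; e != NULL; e = e->next) {
        if (addr >= e->start && addr < e->end)
            return e;
    }
    return NULL;
}
```

A caller would resolve an address to a map entry first and only then follow the entry's object pointer to find the data, which is the two-step indirection the text describes.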
Every time memory objects are copied, they are linked in a list with unhealthy growth potential. If a routine looks for a particular page in memory using the linked list, search times might be rather long. In addition, a long memory object chain is likely to contain pages that are inaccessible, a state of affairs that cannot be changed easily. BSD VM specialists used a method to collapse object chains, to remedy 'swap memory leakage'.
But searching and finding inaccessible pages was only carried out after swap memory leakage had locked the system: not the most efficient way to improve a situation caused by the limitations of a data structure chosen by the original VM system designers.
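To make the object chain problem concrete, here is a small C sketch of a chain lookup and a collapse step. It is an illustration only, assuming a toy model in which each object covers eight page slots; the names chain_object, chain_find and chain_collapse are hypothetical and do not come from the BSD sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8  /* assumption: every object covers 8 page slots */

struct chain_object {
    bool has_page[NPAGES];         /* which page slots this object holds */
    int ref_count;                 /* how many maps or objects point here */
    struct chain_object *backing;  /* next (older) object in the chain */
};

/* Walk the chain from the newest object. Returns the number of hops
 * needed to reach the page, or -1 if no object in the chain holds it. */
int chain_find(const struct chain_object *obj, size_t page)
{
    assert(page < NPAGES);
    int hops = 0;
    for (; obj != NULL; obj = obj->backing, hops++) {
        if (obj->has_page[page])
            return hops;
    }
    return -1;
}

/* Collapse one link: if the backing object is referenced only by 'front',
 * move up the pages 'front' lacks, drop the ones 'front' already shadows
 * (they were inaccessible anyway) and bypass the backing object.
 * Returns true if the chain became one object shorter. */
bool chain_collapse(struct chain_object *front)
{
    assert(front != NULL);
    struct chain_object *back = front->backing;
    if (back == NULL || back->ref_count != 1)
        return false;
    for (size_t i = 0; i < NPAGES; i++) {
        if (back->has_page[i] && !front->has_page[i])
            front->has_page[i] = true;
        back->has_page[i] = false;
    }
    front->backing = back->backing;  /* front inherits back's reference */
    back->backing = NULL;
    back->ref_count = 0;
    return true;
}
```

The lookup cost grows with the length of the chain, and the pages hidden behind a newer copy are exactly the inaccessible ones that only a collapse can release.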
There were other problems: the VM system suffered from the less than optimal integration carried out by BSD kernel programmers, and it could not easily share data with other kernel subsystems unless a copying operation was carried out to avoid possible conflicts. This problem is not limited to kernel subsystems, though, since kernel processes generally cannot safely share address space.
When they do need to share data, they do so by invoking multiple copying operations at the expense of speed and efficiency.
The FreeBSD project managers decided to adopt a gradual approach to the problem and change the original Mach VM into something more suitable without sacrificing the continuity from 4.4BSD to more modern FreeBSD versions.
Being more experimental, the NetBSD project members decided not just to change, but to replace the pertinent features of the 4.4BSD VM system to avoid the problems of object chaining altogether. New data copying mechanisms were also introduced. Unsurprisingly, the NetBSD documentation writers added a few man pages explaining the basics of NetBSD memory management, which is handled by software known as the UVM (the U stands for 'universal', a meaning the name has since lost).
Object structure
BSD VM memory object structure is very much a stand-alone construct, used and managed by the VM system alone. Page allocation and memory usage are a matter for the VM system to resolve.
The UVM memory object structure is normally embedded in, and controlled by, another kernel subsystem. Pager functions provide the data channel between that kernel subsystem and the UVM. This is a slightly more efficient and simpler approach to memory management, since it does not require separate routines for managing the data sources and the UVM.
The UVM system is more intelligible.
It also means the NetBSD kernel subsystems have far more liberty and fine-grained control over the UVM than the BSD VM subsystem would permit.
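The embedded-object idea can be sketched in C as follows. This is a hedged illustration rather than NetBSD code: mem_obj, pager_ops, file_node, file_node_init and vm_fault_in are hypothetical names, the 16-byte page size is chosen only to keep the example small, and a string in memory stands in for the subsystem's real data source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 16  /* assumption: tiny pages to keep the example short */

struct mem_obj;

/* Table of pager functions: the data channel into the memory manager. */
struct pager_ops {
    /* Fill buf with page 'pageno' of the object; 0 on success, -1 if absent. */
    int (*get_page)(struct mem_obj *obj, size_t pageno, char *buf);
};

/* The memory object itself: small, and meant to be embedded elsewhere. */
struct mem_obj {
    const struct pager_ops *ops;
    int ref_count;
};

/* Hypothetical subsystem structure that embeds its memory object. */
struct file_node {
    struct mem_obj obj;  /* must stay the first member (see cast below) */
    const char *data;
    size_t size;
};

static int file_get_page(struct mem_obj *obj, size_t pageno, char *buf)
{
    /* Valid because obj is the first member of struct file_node. */
    struct file_node *fn = (struct file_node *)obj;
    size_t off = pageno * PAGE_BYTES;
    if (off >= fn->size)
        return -1;
    size_t n = fn->size - off;
    if (n > PAGE_BYTES)
        n = PAGE_BYTES;
    memset(buf, 0, PAGE_BYTES);  /* zero-fill the tail of a short last page */
    memcpy(buf, fn->data + off, n);
    return 0;
}

static const struct pager_ops file_pager_ops = { file_get_page };

void file_node_init(struct file_node *fn, const char *data, size_t size)
{
    fn->obj.ops = &file_pager_ops;
    fn->obj.ref_count = 0;
    fn->data = data;
    fn->size = size;
}

/* The memory manager sees only the object and its pager functions. */
int vm_fault_in(struct mem_obj *obj, size_t pageno, char *buf)
{
    assert(obj != NULL && obj->ops != NULL);
    return obj->ops->get_page(obj, pageno, buf);
}
```

Because the object lives inside the subsystem's own structure, the subsystem decides when it is created and destroyed, and the memory manager needs no separate bookkeeping for the data source.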
The most important kernel subsystem is what is known as the vnode subsystem.
vnodes are well known among UNIX gurus. For everyone else: whenever a file is copied or affected in any way by a user operation, vnodes are likely to be used to determine the location and state of the file concerned. They are essentially very small containers of information telling the filesystem where the file is and which type of vnode has come into play.
vnodes are allocated only to files used by the VM or the buffer cache. They shouldn't be confused with inodes - these were used by traditional UNIX kernels as well, but refer to all files regardless of state of activity and contain a good deal more information than vnodes. Also, inodes are unique to the filesystem within which they are being used while vnodes are associated with files which the memory management system accesses in some way.
inodes can be converted into vnodes on some UNICes, but the reverse would not make sense, since vnode changes would not be written back to the inode.
One problem for pages within memory is that they might not be accessed often or at all. Now the BSD vnode cache subsystem just retains unused vnodes in the hope that they will be used again.
The big problem here is that the BSD VM system does the same with memory objects: if they are not being used, they will still be cached for a long time.
Now we have two systems dealing with vnodes and different memory allocation routines, since vnode structures are allocated whenever the usual file operations occur or the file is mmap-ed, whereas memory objects coming from the BSD VM system are created only when the file is mmap-ed.
If vnodes become scarce, the kernel recycles the least recently used unreferenced vnode. In the same way, the BSD VM system caches unreferenced memory objects.
The BSD VM system is also limited to holding 100 unreferenced objects.
The VM system can hold active references to vnode structures in the external vnode subsystem, but owing to this limitation, unreferenced memory objects are discarded using an LRU algorithm.
The two-layer object caching mechanism is fairly cumbersome, so the UVM system uses the vnode subsystem to manage unreferenced memory object allocation. The LRU algorithm is used to allocate and terminate both vnodes and memory objects. The 100 object limit does not exist for the UVM, which avoids the problem of inactive objects being flushed from memory. The memory object number limitation together with the two-layer object caching mechanism leads to some rather serious performance problems, which the UVM has overcome.
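The effect of a fixed limit on cached, unreferenced objects can be sketched in C. This is a toy model and not the BSD VM code: obj_cache, cache_init, cache_release and cache_reuse are hypothetical names, objects are represented by non-negative integer ids, and only the limit of 100 is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_MAX 100  /* the limit cited above for the BSD VM system */

struct obj_cache {
    int ids[CACHE_MAX];  /* cached object ids, least recently used first */
    size_t count;
    size_t limit;        /* effective limit, at most CACHE_MAX */
};

void cache_init(struct obj_cache *c, size_t limit)
{
    assert(c != NULL && limit > 0 && limit <= CACHE_MAX);
    c->count = 0;
    c->limit = limit;
}

/* An object lost its last reference: keep it cached. If the cache is full,
 * the least recently used object is discarded first.
 * Returns the discarded id, or -1 if nothing had to be discarded. */
int cache_release(struct obj_cache *c, int id)
{
    int evicted = -1;
    if (c->count == c->limit) {
        evicted = c->ids[0];
        for (size_t i = 1; i < c->count; i++)
            c->ids[i - 1] = c->ids[i];
        c->count--;
    }
    c->ids[c->count++] = id;
    return evicted;
}

/* An object is wanted again: take it out of the cache if it is still there.
 * Returns 1 on a hit, 0 if the object had already been discarded. */
int cache_reuse(struct obj_cache *c, int id)
{
    for (size_t i = 0; i < c->count; i++) {
        if (c->ids[i] == id) {
            for (size_t j = i + 1; j < c->count; j++)
                c->ids[j - 1] = c->ids[j];
            c->count--;
            return 1;
        }
    }
    return 0;
}
```

Once more objects have been released than the limit allows, a later cache_reuse on an early object misses, and in a real system its pages would have to be fetched again. Removing the separate limit, as the UVM does by leaving this bookkeeping to the vnode subsystem, avoids that second, independent eviction decision.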
Changes for the better
Of course, these are just pointers to the changed memory management design in NetBSD. It has been around since 1999 and set off corresponding changes in FreeBSD in the following years.
Quite a few more problems were resolved, such as the swap memory leaks that affected both the BSD VM and the FreeBSD code that largely copied the BSD VM design.
These and other changes to the way the BSD VM manages memory led to radical changes in the way the VM had been designed.
But the most important changes occurred with regard to performance. Under NetBSD, process overheads diminished considerably and applications ran considerably faster than FreeBSD-based ones, provided the same architecture was used.
It would be unfair to claim the UVM had no predecessors: the SunOS VM pioneered the use of vnodes, and anonymous memory management techniques, not outlined here, were implemented by Sun as well.
It is more difficult to tell what kind of improvements were inspired by the UVM: it is claimed that John Dyson's and Matt Dillon's commits to the FreeBSD kernel tree were largely inspired by UVM, but they bear little resemblance - at least as far as it is possible to understand the present FreeBSD VM after the hiatus following Dyson's departure. OpenBSD also based its new VM rewrite on NetBSD's UVM.
Now we can see that all the BSD flavours, including Mac OS X, base their memory management systems on a mix of Mach code and home-grown redesigns. But although they all experienced similar problems, their reactions to performance problems were by no means similar: basic parts of the original 4.4BSD kernel have been redesigned in ways that tend to change their performance characteristics as much as their much-vaunted differences in portability and security standards.
However, Mac OS X and OpenBSD could not look more different to each other. This also points to a fundamental difference from the Linux kernel: the Linux memory management subsystem will always remain identical for all implementations, since it is an avowed goal to prevent forks between various kernel versions.
The BSD flavours want to present the image of a functioning UNIX OS, but whenever problems are discovered, the programmers behind the kernels will ruthlessly re-implement parts of the kernel without taking code compatibility between the BSDs into account. These shouldn't be seen as "forks" from 4.4BSD, but since the dream of a re-unified BSD died with the capitulation of Wind River Systems in the face of the dotcom slump, it looks as if the various BSDs are moving further apart rather than coming together.
Yet, surprisingly, the UNIX world still possesses enough cohesion to let the different communities learn from each other. Indeed, the free and the proprietary BSDs still have enough roots in common and conformant standards to adhere to, that even the designs residing on random access memories are cheerfully studied and, if judged to represent an improvement, incorporated.
vnode: www.nux-acid.org/src/kldvnode/vnode.h
The UVM Virtual Memory System: http://ccrc.wustl.edu/pub/chuck/tech/uvm/
Chuck Cranor's home page: www.research.att.com/~chuck/
FreeBSD: www.freebsd.org
OpenBSD: www.openbsd.org
NetBSD: www.netbsd.org

