Structure

Next: Controlled Experiments Up: Implementation Previous: Implementation

Structure

Figure 6 shows the basic structure of the implementation, where each box stands for a C thread and the shaded area constitutes the prefetcher. The prefetcher consists of two pieces of code. The first is the system independent prefetch engine, which handles prefetch analysis, working tree construction and pattern tree maintenance. The same code was used in the simulation. In the implementation, the BSD service threads were modified to provide the prefetch engine information on each fork, execve, open, chdir and exit system call. The prefetch engine processes this information, makes prefetch decisions and enters the files to be prefetched in a queue. The second piece of prefetcher code is an added thread called the prefetch daemon. The prefetch daemon consumes the file queue and produces block read requests in another queue. These requests are then satisfied by the async daemons. Unlike a user-initiated read operation, a prefetch ends when the block is placed in the system buffer cache. No copy to user space is necessary.

Figure 6: Structure of Implementation. A prefetch daemon is added to the collection of C threads in UX42. Each BSD service thread is extended to call the prefetch engine when a relevant system call is serviced.

The prefetch daemon takes advantage of the NFS readahead logic by prefetching only the first block of a file. When a requested block number is one more than the number of the block last read (r_lastr), NFS readahead logic speculates that the file is being accessed sequentially, and initiates an asynchronous read on the next block together with the requested block. At least two requested blocks are needed to establish the sequential access pattern before NFS starts readahead. Accordingly, when it removes an entry from the file queue, the prefetch daemon generates a read request for only the first file block (block 0). It also sets r_lastr to -1. Later when block 0 is actually accessed, the request will hit in the cache, and, since r_lastr is -1, a readahead on block 1 will be issued. This scheme moves on as the file is read block by block, which is the norm. Since the prefetch daemon issues only one block read for each file to be prefetched, the cost of prefetching is minimal.

We have ensured that prefetch I/O yields to regular NFS asynchronous I/O. All the NFS async daemons can be used towards regular I/O, but only up to a certain number of them towards prefetch I/O. Prefetch I/O will be started only if this limit has not been reached and there is no pending regular I/O. Thus, regular asynchronous I/O's can be serviced promptly and when there are too many of these, the prefetcher will refrain from issuing any prefetch I/O's. This ensures that prefetching halts when the system capacity limit is reached, and does not add extra load to an overload.

The implementation consists of approximately 3380 lines of C code. Of these, 90 lines have been added to the existent UX42 source files, 2280 lines are in separate ``.c'' files and 1010 lines in ``.h'' files.

Next: Controlled Experiments Up: Implementation Previous: Implementation

An Analytical Approach to File Prefetching
Hui Lei and Dan Duchamp