VFS

From Second Life Wiki


There is the beginning of a discussion of the VFS in the Second Life forums:

SL Forums > Resident Forums > Feedback > Feature Suggestions > OpenClient: Cache operation discussion

The following is an early attempt at documenting the Virtual File System (VFS). It is by no means complete and should not be considered authoritative at this time.

VFS Functional Overview

VFS Data Structures and Layout

The virtual file system consists of two data structures. The first is the Index file, which maintains a catalog of all files and free space in the VFS. The second is the Data file, which holds the raw data that makes up each file.

Files are referenced in the Index by their GUID and resource type. Each file is stored as one contiguous block of bytes in the Data file, and the Index records its starting position and data length.
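
As a rough illustration of this layout, the sketch below models an Index entry and a lookup in Python. The names FileKey, FileBlock and read_file, and the use of a plain dictionary, are assumptions made for the example and are not taken from the viewer source.

```python
from dataclasses import dataclass
from uuid import UUID


@dataclass(frozen=True)
class FileKey:
    """Hypothetical Index key: a file is identified by GUID plus resource type."""
    file_id: UUID
    resource_type: int  # numeric type code; the values used are made up


@dataclass
class FileBlock:
    """Hypothetical Index value: where the file's bytes live in the Data file."""
    location: int  # byte offset of the first byte
    length: int    # number of bytes in the block


def read_file(index, data, key):
    """Return the bytes of one file, given the Index and the raw Data contents."""
    block = index[key]
    return data[block.location:block.location + block.length]
```

Because every file is a single contiguous block, one offset and one length are enough to locate it.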

The VFS also keeps track of free space in the Data file. Free space can be seen as a resource that is used up and subdivided as file space is allocated. Initially, a new blank VFS consists of a single free block spanning the entire Data file. When the first file is created, this single block is divided into two pieces: the new file allocation block and a smaller free block with a new start position and data length. Later allocations subdivide the remaining free blocks in the same way.

As files are deleted, their space is converted into new free blocks and recorded so that it can be found and reused. If a free block opens up immediately before or after another free block, the two are merged into a single contiguous free block.
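
The splitting and merging described above can be sketched as follows. This is a minimal model, not the viewer's implementation: the FreeSpace class, the dictionary of start offsets to lengths, and the first-fit search are all assumptions chosen to keep the example short.

```python
class FreeSpace:
    """Hypothetical free-block tracker: maps start offset -> length in bytes."""

    def __init__(self, data_file_size):
        # A blank VFS is one free block covering the whole Data file.
        self.blocks = {0: data_file_size}

    def allocate(self, size):
        """Carve `size` bytes off the first free block that is large enough.

        First-fit is an assumption; returns the start offset, or None if
        no single free block can hold the request.
        """
        for start in sorted(self.blocks):
            length = self.blocks[start]
            if length >= size:
                del self.blocks[start]
                if length > size:
                    # The remainder becomes a smaller free block.
                    self.blocks[start + size] = length - size
                return start
        return None

    def release(self, start, length):
        """Turn a deleted file's block into free space, merging neighbours."""
        # Merge with a free block that begins right after the released one.
        following = self.blocks.pop(start + length, None)
        if following is not None:
            length += following
        # Merge with a free block that ends right where the released one begins.
        for prev_start, prev_length in self.blocks.items():
            if prev_start + prev_length == start:
                start, length = prev_start, prev_length + length
                break
        self.blocks[start] = length
```

Allocating three files from a 100-byte Data file and then releasing all three leaves a single 100-byte free block again, regardless of the order in which they are released.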

Creating new files in a full VFS

Data blocks are always contiguous. A file can never be split across several free blocks, because there is no mechanism for chaining scattered free blocks together. If a new file block is to be created and no free block is large enough to contain it, the VFS begins deleting the oldest cached data until a sufficiently large free block appears. This is done by creating a histogram of the files in the cache and removing the least recently used data.

Note that if during this search a newly deleted block is too small to contain the new data and is not adjacent to any other free blocks that can be merged, that newly deleted block cannot be used. The VFS must continue to delete other old data blocks, until a block of equal or larger size becomes available for the new data, or a small block happens to open up next to or between other free blocks and their combined size is large enough to contain the new data block.

Without the ability to span data across separate free blocks, the VFS may need to delete many more old cache files than are really necessary to contain the new data.
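
The following sketch shows how this can play out. It is an illustration under stated assumptions: files carry a hypothetical last-access counter, a simple sort by that counter stands in for the histogram, and the function names are invented for the example.

```python
def merge_free(free, start, length):
    """Add a block to the free map {start: length}, merging adjacent free blocks."""
    following = free.pop(start + length, None)
    if following is not None:
        length += following
    for prev_start, prev_length in free.items():
        if prev_start + prev_length == start:
            start, length = prev_start, prev_length + length
            break
    free[start] = length


def evict_until_fits(files, free, size):
    """Delete least recently used files until one free block holds `size` bytes.

    files: {name: (start, length, last_access)}, free: {start: length}.
    Returns the names of the deleted files, oldest first.
    """
    evicted = []
    oldest_first = sorted(files, key=lambda name: files[name][2])
    for name in oldest_first:
        if any(length >= size for length in free.values()):
            break  # a single contiguous free block is now large enough
        start, length, _ = files.pop(name)
        merge_free(free, start, length)
        evicted.append(name)
    return evicted
```

Take a full 100-byte Data file holding A (offset 0, 20 bytes, oldest), B (offset 20, 40 bytes), C (offset 60, 20 bytes, second oldest) and D (offset 80, 20 bytes, newest). A request for 40 bytes first deletes A and C, which frees 40 bytes in total but in two separate 20-byte blocks. Only after B is also deleted do the pieces merge into one 80-byte block, so 80 bytes of cache are discarded to store 40.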

Pre-allocation of file blocks for new data

If the data size of an incoming asset is known, then a block large enough to contain all the data can be preallocated for it in the VFS. Then as data arrives it is written into this allocated block.
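
A minimal sketch of writing into a preallocated block is shown below. The PreallocatedWriter class and the bytearray standing in for the Data file are assumptions for illustration only.

```python
class PreallocatedWriter:
    """Hypothetical helper: writes arriving chunks into a block reserved up front."""

    def __init__(self, data, start, size):
        self.data = data    # bytearray standing in for the Data file
        self.start = start  # offset of the reserved block
        self.size = size    # total size announced for the incoming asset
        self.written = 0    # bytes received so far

    def write(self, chunk):
        """Append one chunk at the current position inside the reserved block."""
        if self.written + len(chunk) > self.size:
            raise ValueError("chunk exceeds the preallocated block")
        offset = self.start + self.written
        self.data[offset:offset + len(chunk)] = chunk
        self.written += len(chunk)

    @property
    def complete(self):
        return self.written == self.size
```

Since the block already has its final size, no chunk ever forces the block to grow or move.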

Data blocks that have been allocated can also dynamically increase or decrease in size. If the block is to become shorter, the trailing space is turned into a new free block. If the block is to grow in size and there is no immediately available trailing free space, the VFS searches for a new free block somewhere else that is large enough to contain the larger block. The old small data block is then moved to the new larger location and the original space it occupies is converted into a free block. If no larger free block is available, then the cleanup process of deleting the oldest data must be done to create an available large free block.
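
The resize rules can be sketched as a single function. The function name, the free map of start offsets to lengths, and the first-fit choice of a new location are assumptions. Copying the data to the new location and merging the vacated space with a preceding free block are left out for brevity.

```python
def resize_block(free, start, length, new_length):
    """Resize the block at (start, length) and return its possibly new start.

    free is {start: length} of free blocks. Returns None when no free block
    is large enough, in which case old data would have to be deleted first.
    """
    end = start + length
    if new_length <= length:
        if new_length < length:
            # Shrink: the trailing bytes become free, joined with any
            # free block that already follows the data block.
            trailing = free.pop(end, 0)
            free[start + new_length] = (length - new_length) + trailing
        return start

    extra = new_length - length
    trailing = free.get(end, 0)
    if trailing >= extra:
        # Grow in place by consuming the adjacent trailing free space.
        del free[end]
        if trailing > extra:
            free[end + extra] = trailing - extra
        return start

    # Otherwise relocate to the first free block that can hold the new size.
    for new_start in sorted(free):
        if free[new_start] >= new_length:
            remaining = free.pop(new_start) - new_length
            if remaining:
                free[new_start + new_length] = remaining
            # The vacated space becomes free, joined with trailing free space.
            free[start] = length + free.pop(end, 0)
            return new_start
    return None
```

Only the last branch moves data, which is the case that preallocation is meant to avoid.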

Preallocating all the space that incoming data may eventually use is therefore a valuable way to avoid unnecessarily moving an existing data block.

Index file is for crash/backup purposes only

The Index file is not actually used when searching the VFS for free and used data blocks. Instead, when the client starts up, the Index is read only once, converted into C++ maps and multimaps, and held in memory at all times. Whenever the in-memory maps change, they are synced to the real Index file on the OS filesystem, so that if the client crashes unexpectedly the Index on disk will match what is in the VFS Data file.
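
The read-once, write-through pattern can be sketched as follows. A Python dictionary stands in for the C++ maps, and the JSON file format, the IndexCache class, and its method names are assumptions for the example; the real Index file format is not described here.

```python
import json
import os


class IndexCache:
    """Hypothetical in-memory Index that is written back to disk on every change."""

    def __init__(self, path):
        self.path = path
        self.entries = {}  # "guid:type" -> [start, length]
        if os.path.exists(path):
            # The Index file is read exactly once, at startup.
            with open(path) as f:
                self.entries = json.load(f)

    def lookup(self, guid, resource_type):
        # Served from memory; the file on disk is not consulted.
        return self.entries.get(f"{guid}:{resource_type}")

    def store(self, guid, resource_type, start, length):
        self.entries[f"{guid}:{resource_type}"] = [start, length]
        self._sync()

    def remove(self, guid, resource_type):
        self.entries.pop(f"{guid}:{resource_type}", None)
        self._sync()

    def _sync(self):
        # Rewrite the whole file for simplicity, via a temporary file so a
        # crash mid-write does not leave a truncated Index behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.entries, f)
        os.replace(tmp, self.path)
```

A second IndexCache opened on the same path after a simulated crash sees every change made through store and remove, because each change reached the disk before the call returned.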

In this way the Index is always hot-cached in memory and always consumes system memory to record the position of files in the VFS Data file. As the free space in the Data file becomes fragmented into small unused blocks, the Index keeps growing in size.

It is therefore difficult to support a VFS spanning tens of gigabytes of disk space, since the block map grows ever larger and consumes more and more system memory on the user's machine.
