Difference between revisions of "Talk:VFS"

From Second Life Wiki
Jump to navigation Jump to search
(marking unsigned comment)
Line 65: Line 65:
#map.find has a n-based time frame.
#map.find has a n-based time frame.
#*A cache with 20,000 assets has to do 20,000 comparisons when it wants to find out about the most recently added asset.
#*A cache with 20,000 assets has to do 20,000 comparisons when it wants to find out about the most recently added asset.
Solutions:
Solutions:
<ul><ol start=""  style="list-style-type:upper-alpha">
<ul><ol start=""  style="list-style-type:upper-alpha">
Line 80: Line 81:
*Extra resources required for maintaining two maps.
*Extra resources required for maintaining two maps.
</ol></ul>
</ol></ul>
I'm not sure where the confusion was, but this is just not true.
Map implements a binary tree on the key, based on operator< comparisons of the keys.
It's stored in sorted order, and the find algorithm has O(log(N)) complexity.
The free blocks list is also a map (a multimap) stored by length.
This is just not "low-hanging fruit".
  -- [[User:Q Linden|Q Linden]] 13:29, 19 June 2008 (PDT)


== Fragmentation ==
== Fragmentation ==

Revision as of 12:29, 19 June 2008

This is a talk page associated with the Open Source Portal. Changes to all pages like this one can be tracked by watching the "Related Changes" for "Category:Open Source Talk Page"
Please sign comments you leave here by putting four tildes (~~~~) at the end of your comment. For more guidelines, see Talk Page Guidelines

Related discussions:


Alternative implementation

Only desk-checked:


// N_LEVELS * N_CHARS should probably not be greater than 8, since there's a
// separator in the UUID there.
#define N_LEVELS 2
// UUID is hex, so 2 chars means maximum of 256 directories per level
#define N_CHARS 2

#ifdef I_KNOW_THIS_ITS_A_UNIX_SYSTEM
# define DIR_SEP '/'
#else
# define DIR_SEP '\\'
#endif

extern char *cache_dir;
extern int safe_make_directory_path(char *path);

FILE *fopen_texture(char *filename, char *mode)
{
   int l = strlen(cache_dir);
   int m = strlen(filename);
   char *new_filename = calloc(l + 1 + m + N_LEVELS * (N_CHARS + 1) + 1);
   char *s, *t;
   int i, j;
   FILE *fp;

   if (!new_filename) return NULL;

   s = filename;
   t = new_filename;
   strcpy(t, cache_dir);
   t += l;

   for (i = 0; *s && i < N_LEVELS; i++)
   {
     *t++ = DIR_SEP;
     for (j = 0; *s && j < N_CHARS; j++)
       *t++ = *s++;
   }
   *t = '\0';
   if (!safe_make_directory_path(new_filename)) return free(new_filename), NULL;
   *t++ = DIR_SEP;

   strcpy(t, filename);

   fp = fopen(new_filename, mode);
   free(new_filename);
   return fp;
}

VFS index: std::map (low hanging fruit)

I've been reading through the STL source and documentation. std::map by itself is the wrong solution for the index. Strife Onizuka 11:29, 27 March 2007 (PDT)

  1. It is only sorted explicitly (std::sort)
  2. map.find has a n-based time frame.
    • A cache with 20,000 assets has to do 20,000 comparisons when it wants to find out about the most recently added asset.

Solutions:

    1. Sort the map:
      1. Keep the map sorted by doing a binary search to determine where an asset should be inserted.
        • A cache with 20,000 assets would only have to do about 16 comparisons to add an asset
      2. Use std::lower_bound instead of map.find (map.lower_bound suffers from the same problem as map.find)
      Problems:
      • The bulk of the index needs to be written to disk when a single asset is added. mIndexHoles & LLVFS::sync can't be used when adding assets.
    2. Keep a duplicated sorted map (without the holes):
      1. All requests into the cache would go through the new map, the old map is used to handle positions and allow for the use of mIndexHoles.
      Problems:
      • Extra resources required for maintaining two maps.
I'm not sure where the confusion was, but this is just not true. 
Map implements a binary tree on the key, based on operator< comparisons of the keys.
It's stored in sorted order, and the find algorithm has O(log(N)) complexity. 
The free blocks list is also a map (a multimap) stored by length.
This is just not "low-hanging fruit". 
 -- Q Linden 13:29, 19 June 2008 (PDT)

Fragmentation

The free space in the VFS can become fragmented, I estimate about 4% would be the upper limit on the amount of unusable free space due to fragmentation. There are many small assets, and the current alg tries to fit assets into appropriately sized spaces. About 6% of assets are one block in size. Defragmenting the cache doesn't seem practical. Strife Onizuka 11:55, 27 March 2007 (PDT)

Creating new files in a full VFS - Remarks

It seems, that to create a new file in a full VFS, the VFS deletes all LRU files until a big enough chunk "appears" to write the file to. This deletes an unnecessary big number of files. Wouldn't it be better to:

- Use the old algorithm to find a big enough chunk.

- Only delete the files that are located in that chunk.

One remaining question is, whether to write the new file at the beginning or the end of the chunk (assuming that the chunk is larger than the file). As an optimization, one could make this decision based on which file (the one in front of the new one or the one behind) is more recently used. Putting it closer to the more recently used neighbor makes it more likely to later free a larger chunk: The less recently used neighbor is more likely to be deleted first and when deleted, will have the remaining free space from our initial chunk adjacent to it. Those two free chunks can then be merged. -- —The preceding unsigned comment was added by Dreamer Miquel 10:58, 31 August 2007 (PDT)