Talk:Texture Cache

From Second Life Wiki
Revision as of 10:15, 27 March 2007 by Rob Linden (talk | contribs) (Linking to Talk:VFS as related discussion)
This is a talk page associated with the Open Source Portal. Changes to all pages like this one can be tracked by watching the "Related Changes" for "Category:Open Source Talk Page"
Please sign comments you leave here by putting four tildes (~~~~) at the end of your comment. For more guidelines, see Talk Page Guidelines

The discussion below is a continuation of a long conversation on the sldev list in March, 2007, which per the newly created SLDev guidelines was moved to this talk page.

Related discussions: Talk:VFS

GIF

We can safely avoid the GIF format. It supports only 256 colors, and it uses LZW, which Unisys patented (the original patent used in the good old GIF has expired, but Unisys re-patented it on the basis of some "improvements" to the algorithm). Unisys suddenly claimed royalties for encoding GIF files, and for that alone we shouldn't use it. Source: License Information on GIF and Other LZW-based Technologies -- kerunix Flan

The last of the patents expired in October 2006. GIF is no longer patent encumbered. But PNG is still better. Keep in mind that by recompressing the image to cache it, you're trading CPU usage for disk space. Compression is usually a lot slower than decompression. Just storing the image uncompressed might be faster overall. In the end, we need to assume nothing and benchmark, benchmark, benchmark. Seg Baphomet 22:45, 23 March 2007 (PDT)
The GIF format is not suitable. It only supports 256 colors per frame. It does not support semi-transparent pixels. PNG is a superior format. Has anyone considered the time required to re-encode images in the target format? Strife Onizuka 09:42, 24 March 2007 (PDT)

Databases

APR appears to provide a portable interface to common databases, namely Berkeley DB and SQL. Dzonatas Sol 18:23, 23 March 2007 (PDT)

APR = Apache Portable Runtime? If yes, I can't find any interface to common databases in the APR documentation. Kerunix Flan 03:48, 25 March 2007 (PDT)
Yes, I noticed some features added to the 1.3 version, which is in development. Dzonatas Sol 13:00, 25 March 2007 (PDT)

Stats

It would be useful to get stats on current and new hardware profiles from LL to get a sense of what people are using and where the focus should lie. Iron Perth 03:00, 23 March 2007 (PDT)

I plugged a simple CPU clock tick timer class (based on http://www.codeguru.com/forum/showthread.php?s=&threadid=280008) into FL-1.13.3.59558 VC Express, XP Pro build lltexturecache, and recorded some times for index lookup and texture loading in LLTextureCacheWorker::doRead(). Spec: budget Sempron 2800+ on Gigabyte k8vt800, FX5200 128 MB graphics, 1 GB RAM, 160 GB HDD (inexpensive, no RAID, and I'm not about to pull the PC apart to get further HDD details), 2x40 GB partitions, cache on partition #1, 54% full, in need of a defrag (currently 33% file fragmentation!). After running the download FL, the cache got purged (is there any way of re-building the index instead?).

(while re-building a half-decent cache, ReleaseNoOpt build executed from within VC
after several mins running time & teleporting to around 7-8 regions)
 WARNING: TickTimer::StopWithReporting: Lookup ticker.
 WARNING: TickTimer::StopWithReporting: COUNT:       10200
 WARNING: TickTimer::StopWithReporting: TOTAL (sec): 6.06264
 WARNING: TickTimer::StopWithReporting: AVG (sec):   0.000594377
 WARNING: TickTimer::StopWithReporting: MAX (sec):   0.147985
 WARNING: TickTimer::StopWithReporting: File read ticker.
 WARNING: TickTimer::StopWithReporting: COUNT:       3700
 WARNING: TickTimer::StopWithReporting: TOTAL (sec): 12.8276
 WARNING: TickTimer::StopWithReporting: AVG (sec):   0.00346692
 WARNING: TickTimer::StopWithReporting: MAX (sec):   0.126929
 N.B. times may be affected by CPU speed variance (calibrated only once, and only over 1 sec)

The other day, I was getting a better 0.001s average on the cached file read. Paula Innis 10:53, 25 March 2007 (PDT) (feel free to tidy this up or whatever)
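
For anyone wanting to repeat this kind of measurement, here is a minimal sketch of an accumulating timer that reports the same four figures as the log above (count, total, average, max). It is not the class used for those numbers (that one was C++ and based on CPU clock ticks); the names TickTimer, record and report are made up for illustration, and Python's time.perf_counter stands in for the tick counter.

```python
import time


class TickTimer:
    """Accumulates count, total, average and maximum of timed intervals."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.total = 0.0
        self.max = 0.0
        self._start = None

    def start(self):
        self._start = time.perf_counter()

    def stop(self):
        # Close the interval opened by start() and add it to the statistics.
        elapsed = time.perf_counter() - self._start
        self._start = None
        self.record(elapsed)
        return elapsed

    def record(self, elapsed):
        # Fold one measured interval (in seconds) into the running figures.
        self.count += 1
        self.total += elapsed
        self.max = max(self.max, elapsed)

    @property
    def average(self):
        return self.total / self.count if self.count else 0.0

    def report(self):
        return (f"{self.name}. COUNT: {self.count} "
                f"TOTAL (sec): {self.total:.6g} "
                f"AVG (sec): {self.average:.6g} "
                f"MAX (sec): {self.max:.6g}")


timer = TickTimer("Lookup ticker")
for _ in range(3):
    timer.start()
    sum(range(1000))  # stand-in for the index lookup being measured
    timer.stop()
print(timer.report())
```

The average is simply total divided by count, which matches the log above (6.06264 s over 10200 lookups is about 0.000594 s each).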

TGA

I can't find it in the spec, but I think this format guarantees that images of a given size will always have the same file size (without RLE, of course). This may help with some optimizations.

Additionally, the "developer area" of the TGA file format could be used to store some useful information. -- The preceding unsigned comment was added by kerunix Flan

Converting images into the TGA format for cache storage has the advantage that the client already supports rendering TGA images. It would be one of the easiest solutions. Images with the same dimensions would all have the same file size; images with different dimensions would differ in size. Strife Onizuka 02:22, 24 March 2007 (PDT)
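
To illustrate the fixed-size point: a sketch of the 18-byte TGA header and the resulting file size for an uncompressed true-color image. It assumes no image ID field, no color map, and no developer area or footer (any of those would add bytes); the function names are made up for this example.

```python
import struct

TGA_HEADER_SIZE = 18  # the TGA header has a fixed length


def tga_header(width, height, channels):
    """Build the header of an uncompressed true-color TGA (3 or 4 channels)."""
    if channels not in (3, 4):
        raise ValueError("this sketch only handles RGB and RGBA")
    alpha_bits = 8 if channels == 4 else 0
    return struct.pack(
        "<BBBHHBHHHHBB",
        0,              # ID field length: no image ID
        0,              # color map type: none
        2,              # image type 2: uncompressed true-color
        0, 0, 0,        # color map specification (unused)
        0, 0,           # x origin, y origin
        width, height,
        channels * 8,   # bits per pixel
        alpha_bits,     # image descriptor: number of alpha bits
    )


def uncompressed_tga_size(width, height, channels):
    # Header plus raw pixel data; nothing else under the assumptions above.
    return TGA_HEADER_SIZE + width * height * channels


print(len(tga_header(512, 512, 4)), uncompressed_tga_size(512, 512, 4))
```

Under these assumptions the size depends only on width, height and channel count, so a cache could compute the offset of any image in a packed file without reading it.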


Faster Formats

Contribution from Tofu Linden from the email discussion

RLE is cheap but also not very effective at all for many, many textures; wherever I might have used RLE in ye olden days I'd now use trusty old zlib or blingy liblzf, the latter of which is more or less on the same order of speed as RLE but a LOT more effective on data (images) with redundancy not necessarily manifesting as runs of pixels. Iron Perth 02:30, 24 March 2007 (PDT)

And as said on the mailing list, the destructive effect of JPEG2000 probably killed any chance of getting a good compression rate using RLE. RLE is fast and is the only compression supported by the Truevision TGA specification. If we don't use it (probably because it's useless), then we shouldn't use compression at all. One of the advantages of using RLE as described in the specification is that we don't have to decompress anything to read the header, developer area, and footer. Kerunix Flan 08:02, 24 March 2007 (PDT)
I think I missed the bit about "the destructive effect of JPEG2000" on the list. Currently lossy compression is only used when the source is a JPEG, and lossless for BMP and TGA. The point of using a format like JPEG2000 is that it achieves compression ratios superior to other formats for the same image quality. For modern codecs to achieve high compression ratios they sacrifice CPU time. The reason for the push for higher ratios is to reduce the amount of storage and bandwidth required for transmission. If SL moves away from JPEG2000 and towards a less CPU-intensive codec with a lower compression ratio, the asset servers will be serving more data per asset request. This seems like a step backwards, not forwards. As an intermediate format that saves CPU time by avoiding the costly re-decoding of an image, it makes sense to pick a faster format, though it will be costly in disk space; and as long as the disk is not horribly fragmented, it has the potential to be faster than decoding. A format needs to fit these criteria to be considered for this task:
  1. Lossless
  2. Support a variable number of channels.
  3. Fast to encode, fast to decode
  4. Preferably GPL compatible.
This excludes: GIF & JPEG. This may also exclude PNG because most implementations treat anything that is fully transparent as rgba(0,0,0,0); white images develop black outlines at low resolutions (not lossless). Strife Onizuka 21:08, 24 March 2007 (PDT)
We talked about at least two kinds of local cache:
  1. Compressed texture cache: Obviously, it should just be storage for the JPEG2000 data sent by the sim. Why re-encode (and lose quality)?
  2. Uncompressed texture cache: storing some textures already decoded, so the client doesn't have to lose time decoding them when needed. For me, the best format is TGA. Kerunix Flan 03:31, 25 March 2007 (PDT)
zlib on top of TGA works for me. Speed and ease of decoding. Strife Onizuka 18:24, 25 March 2007 (PDT)
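
As a rough sketch of what "zlib on top of an uncompressed image" could look like: the decoded pixel bytes are deflated before being written to the cache and inflated when read back. This only uses Python's standard zlib module; pack_texture and unpack_texture are hypothetical names, and compression level 1 is an assumption chosen to favor speed, not a measured optimum.

```python
import zlib


def pack_texture(pixels: bytes, level: int = 1) -> bytes:
    """Deflate raw pixel bytes; a low level trades ratio for speed."""
    return zlib.compress(pixels, level)


def unpack_texture(blob: bytes) -> bytes:
    """Inflate a blob produced by pack_texture back to the raw pixels."""
    return zlib.decompress(blob)


# A 64x64 RGBA test image filled with one color (highly redundant data).
flat = bytes([200, 180, 160, 255]) * (64 * 64)
packed = pack_texture(flat)
print(len(flat), "bytes raw,", len(packed), "bytes packed")
print("lossless round trip:", unpack_texture(packed) == flat)
```

Real textures decoded from JPEG2000 will compress far less well than this flat test image, so the actual gain would still need to be benchmarked.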

http://en.wikipedia.org/wiki/Huffman_coding may be worth a look. It may work well with 'jpeg artifacts' Paula Innis 13:03, 25 March 2007 (PDT)

Take a look at http://en.wikipedia.org/wiki/Huffman_coding#Applications. See DEFLATE there? That's what .zip uses. You can get it with zlib. Dale Glass 13:19, 25 March 2007 (PDT)
And as http://www.zlib.net says; 'Note that zlib is an integral part of libpng'. There is also .DXT to consider. Paula Innis 14:21, 25 March 2007 (PDT)
Right. I said that because I don't get why you're mentioning Huffman then. It's been implicitly brought up by Tofu ("I'd now use trusty old zlib") and Seg Baphomet and Strife Onizuka (who mentioned PNG) before you added that. DXT, if you refer to S3 Texture Compression, is lossy, and patented, which almost certainly makes it unacceptable.
Excuse me, Strife, but are you putting Paula's name on something I wrote? I see I forgot to sign again though, grr. Dale Glass 18:43, 25 March 2007 (PDT)
Sorry about that I really thought I had gotten the right name on it. Wasn't on purpose x_x. Strife Onizuka 20:14, 25 March 2007 (PDT)

Umm, I guess I did miss Tofu's mention of zlib in amongst all that RLE stuff (which triggered my suggestion of Huffman trees). Not awake enough to register it when I read it, I guess. Paula Innis 07:30, 26 March 2007 (PDT)

Current Caching

Both Tofu and Steve have made the point that they feel cache hit rate issues are a priority.

Contribution from Steve Linden from email discussion:

2. The way we calculate texture pixel coverage (and therefore desired discard level) is as follows:

a. When anything using a texture is rendered, set its pixel coverage to MAX(current pixel coverage, rendered pixel coverage)

b. Once a frame, reduce the current pixel coverage of all textures by 10%.

This way, everything rendered recently has an accurate pixel coverage, and everything not rendered degrades over time. gImageList.mForceResetTextureStats merely tells the code "assume that all textures are no longer visible" and resets the pixel area of all textures to 0. As soon as they are rendered their pixel coverage is updated.

3. Even if a texture's pixel coverage gets set to 0, we do not immediately discard the texture data. We only discard the data if we need room for more textures (in which case we discard the textures with the lowest priority), or if no object in view is using the texture *and* 30 seconds have elapsed.
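
The rules in points 2 and 3 can be sketched as follows. This is an illustration of the description above, not the viewer's code: the class and method names are invented, time is passed in explicitly, and the "need room, discard lowest priority" branch is left out because the priority calculation is not described here.

```python
DECAY_PER_FRAME = 0.10     # point 2b: lose 10% of coverage every frame
DISCARD_AFTER_SEC = 30.0   # point 3: timeout for textures no longer in view


class TextureStats:
    def __init__(self):
        self.pixel_coverage = 0.0
        self.last_rendered = 0.0

    def on_rendered(self, rendered_coverage, now):
        # Point 2a: keep the larger of the current and the rendered coverage.
        self.pixel_coverage = max(self.pixel_coverage, rendered_coverage)
        self.last_rendered = now

    def on_frame(self):
        # Point 2b: coverage of everything degrades a little each frame.
        self.pixel_coverage *= (1.0 - DECAY_PER_FRAME)

    def reset(self):
        # Effect described for mForceResetTextureStats: assume not visible.
        self.pixel_coverage = 0.0

    def may_discard(self, now, in_view):
        # Point 3 (timeout branch only): not in view *and* 30 s have elapsed.
        return (not in_view) and (now - self.last_rendered) >= DISCARD_AFTER_SEC


stats = TextureStats()
stats.on_rendered(1000.0, now=0.0)
for _ in range(10):
    stats.on_frame()
print(round(stats.pixel_coverage, 1))
```

With a 10% decay per frame, a texture that stops being rendered loses most of its coverage within a few dozen frames, while the data itself stays around until the 30 second rule or memory pressure removes it.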

Does that mean that, even if we have a lot of VRAM available, the texture will be removed from the cache if we don't see it for 30 s? And then, moving our avatar in the sim will make the client re-decode the texture from the cache to load it into VRAM again? Why not discard the texture only when we're out of memory? (please delete/replace this comment by the answer) Kerunix Flan 08:18, 24 March 2007 (PDT)

4. Tateru is absolutely correct in that caching textures to disk mostly just saves on network bandwidth. Unless you have a slow network connection, it can be almost as fast to load textures across the network as from a disk, especially if that disk is not especially fast or is fragmented. Decoding the textures takes quite a bit longer. However, if you have multiple CPUs and enable Client > Rendering > Run Multiple Threads, the decoding will be done on the second CPU and go much faster.

I can't believe loading a texture from a remote cache isn't much slower than loading a texture from a local cache. And if you add the usual packet loss and high ping (for non-US users) it's *certainly* much slower. Kerunix Flan 08:18, 24 March 2007 (PDT)

He also makes some interesting comments about teleporting around and filling up the texture cache. Check out his original email here:

https://lists.secondlife.com/pipermail/sldev/2007-March/001133.html

Iron Perth 02:44, 24 March 2007 (PDT)


Home Cache

If a resident has a home, why not have a separate texture cache for it, a Home Cache? This cache would only be cleared explicitly, and would be accessed first by the client when the resident logs on or teleports home.

Once populated with textures, each logon/teleport to home would not involve network traffic, as they would be available locally, on the client.

Benja Kepler 02:52, 24 March 2007 (PDT)

A similar idea would be to have a hit count on textures. The hit count can be kept around much longer than the texture data. The hit count would be used to indicate what textures to keep in the cache longer. Dzonatas Sol 12:53, 24 March 2007 (PDT)
We probably don't want to invent our own caching system, since this type of thing gets studied a lot. Looking around at what other projects are doing (e.g. PostgreSQL, Linux), I found this replacement for the LRU scheme used today: 2Q cache system. This balances hit frequency and hit recency to determine whether to eject something from the cache. -- Rob Linden 23:34, 24 March 2007 (PDT)
I agree. Trying to outsmart all the hundreds of people developing new cache replacement algorithms would be stupid. There's plenty of things out there better than LRU without rolling our own. Gigs Taggart 14:34, 25 March 2007 (PDT)
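
For readers unfamiliar with 2Q, here is a simplified sketch of the idea: new entries go into a small FIFO (A1in); keys pushed out of it are remembered in a ghost queue (A1out) without their data; only an entry that is requested again while its key is still in A1out gets promoted to the main LRU queue (Am). The fixed, separate queue sizes and the class name are assumptions made for brevity; the published algorithm tunes the queues against a shared budget.

```python
from collections import OrderedDict


class TwoQCache:
    def __init__(self, a1in_size, a1out_size, am_size):
        self.a1in = OrderedDict()    # FIFO: entries seen once recently
        self.a1out = OrderedDict()   # ghost FIFO: keys only, no data
        self.am = OrderedDict()      # LRU: entries seen again after eviction
        self.a1in_size = a1in_size
        self.a1out_size = a1out_size
        self.am_size = am_size

    def get(self, key):
        if key in self.am:
            self.am.move_to_end(key)  # refresh recency in the LRU queue
            return self.am[key]
        if key in self.a1in:
            return self.a1in[key]     # a hit in A1in does not promote
        return None                   # miss (a ghost entry holds no data)

    def put(self, key, value):
        if key in self.am:
            self.am[key] = value
            self.am.move_to_end(key)
        elif key in self.a1in:
            self.a1in[key] = value
        elif key in self.a1out:
            # Requested again shortly after leaving A1in: treat as "hot".
            del self.a1out[key]
            self.am[key] = value
            if len(self.am) > self.am_size:
                self.am.popitem(last=False)      # drop least recently used
        else:
            self.a1in[key] = value
            if len(self.a1in) > self.a1in_size:
                old_key, _ = self.a1in.popitem(last=False)
                self.a1out[old_key] = None       # remember the key only
                if len(self.a1out) > self.a1out_size:
                    self.a1out.popitem(last=False)


cache = TwoQCache(a1in_size=2, a1out_size=4, am_size=2)
for name in ("a", "b", "c"):
    cache.put(name, name.upper())
print(cache.get("a"), list(cache.a1out))
```

The effect is that a one-off burst of textures (say, teleporting through a sim once) only churns the small FIFO and cannot flush the textures that keep coming back.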

Textures in Inventory

A search through the avatar's inventory is a good indication of what textures to keep around longer in the cache than others. Dzonatas Sol 12:50, 24 March 2007 (PDT)

I can't agree with this. I have many textures that I rarely see in-world. I don't think it merits extra attention. Strife Onizuka 22:59, 24 March 2007 (PDT)

Plan: Normalization of Texture Usage

A method to keep statistics on pixel coverage and hit rates: we use a collection of hit counters, one per entry for an associated texture. When any hit counter reaches a threshold, the whole collection is normalized and the results are stored in the hit-rate fields of each entry. The hit-rate fields may store percentages for the short term and the long term; these indicators determine how textures are archived or disposed of. Pixel-coverage rates are optional fields in each entry, used to choose the best storage form for the texture data: compressed or uncompressed, and whether the stored texture is downsampled and to what degree. Dzonatas Sol 17:05, 24 March 2007 (PDT)

I think this would result in decreased performance as a lot of time would be spent incrementing hit counters. Strife Onizuka 22:57, 24 March 2007 (PDT)
How hit counters are incremented has not been analyzed for implementation. This is just a general plan independent of any specific implementation. Dzonatas Sol 23:03, 24 March 2007 (PDT)
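
One possible reading of the plan, as a sketch only: each entry carries a raw hit counter plus short-term and long-term hit-rate fields, and when any counter reaches the threshold, every counter is converted to a percentage of the total and then reset. The threshold value, the exponential smoothing used for the long-term field, and all names here are assumptions; the plan above does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    hits: int = 0             # raw counter since the last normalization
    short_term: float = 0.0   # share of hits in the last window, in percent
    long_term: float = 0.0    # smoothed share across windows, in percent


class HitTable:
    def __init__(self, threshold=1000, long_term_weight=0.9):
        self.threshold = threshold
        self.long_term_weight = long_term_weight
        self.entries = {}

    def hit(self, texture_id):
        entry = self.entries.setdefault(texture_id, Entry())
        entry.hits += 1
        if entry.hits >= self.threshold:
            self.normalize()

    def normalize(self):
        # Convert every counter into a percentage of all hits, then reset.
        total = sum(e.hits for e in self.entries.values())
        w = self.long_term_weight
        for e in self.entries.values():
            share = 100.0 * e.hits / total if total else 0.0
            e.short_term = share
            e.long_term = w * e.long_term + (1.0 - w) * share
            e.hits = 0


table = HitTable(threshold=4)
for texture_id in ("a", "a", "a", "b", "a"):
    table.hit(texture_id)
print(table.entries["a"].short_term, table.entries["b"].short_term)
```

Each hit costs one dictionary lookup and one increment; the normalization pass over all entries only runs when a counter reaches the threshold.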

Idea: Keep a cache index

Keep an on-disk index of cache files. This could be made compact and small enough to fit fully into RAM. The index would tell us whether a file is in the cache, thus avoiding an attempt to open a file that isn't there, if the current way of storing each texture in its own file is kept.

Additionally, store some metadata with it: filesize, md5sum, and a very rough texture approximation (perhaps even limited to a solid color) to have something better than the current grey.

Filesize and md5sum would allow checking the integrity of the on-disk files, which should avoid any need to clear the cache, as correctness could be verified automatically.

My suggested format for the index is Berkeley DB, which is simple and compact, and is ACID compliant, which should ensure the integrity of the index. Dale Glass 15:39, 24 March 2007 (PDT)
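
A minimal sketch of the integrity check described above, using an in-memory dict as the index. How the index is persisted (Berkeley DB or otherwise) is deliberately left out; the field names and function names are invented for the example.

```python
import hashlib
import os
import tempfile


def make_index_entry(path):
    """Record the size and MD5 digest of a cached file."""
    with open(path, "rb") as f:
        data = f.read()
    return {"size": len(data), "md5": hashlib.md5(data).hexdigest()}


def verify_cached_file(path, entry):
    """Return True only if the file exists and matches its index entry."""
    if not os.path.isfile(path):
        return False
    if os.path.getsize(path) != entry["size"]:
        return False  # cheap check first, no need to read the file
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest() == entry["md5"]


with tempfile.TemporaryDirectory() as cache_dir:
    path = os.path.join(cache_dir, "texture.cache")
    with open(path, "wb") as f:
        f.write(b"decoded texture bytes")
    index = {"texture.cache": make_index_entry(path)}
    print("intact:", verify_cached_file(path, index["texture.cache"]))
    with open(path, "wb") as f:
        f.write(b"decoded texture bytez")  # same size, different content
    print("after corruption:", verify_cached_file(path, index["texture.cache"]))
```

A file that fails the check can be dropped and re-fetched individually, so a damaged entry does not force a purge of the whole cache.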

The old VFS used an index; while it didn't keep a hash of the asset, it did track the access date. If you are going to be using a DB on top of the cache, you might as well put the entire cache in a single file and save on file system IO. Why? After all, a VFS is just a database of sorts, and regardless you have to search the DB or the FS for the file. There were two problems with the old VFS: it was slow to find data in the index and slow to decode images (the latter isn't the fault of the VFS). A well-designed VFS should be able to outperform any FS (because it doesn't have to perform locks). Strife Onizuka 19:12, 24 March 2007 (PDT)
The filesystem is already an entirely suitable way of keeping data like textures. It does have a few problems, such as that searching for a file in a directory may be relatively expensive. This can easily be compensated for with a small DB used only to cover those shortcomings. A full VFS-like thing is a lot more complex: you basically have to write your own mini-filesystem, and as we've seen, that's hard to do well. Berkeley DB is especially nice in that it's ACID: you have a guarantee that the data in it is consistent, unlike with the VFS (unless you go and code that in, but that takes time). Also, by keeping a collection of small items of data of a fixed size you avoid internal fragmentation, which is something you'd have to deal with in a VFS-like approach. If you store textures as files, just use the system defragmenter. Dale Glass 19:43, 24 March 2007 (PDT)
  • How would having a Berkeley DB solve the problem of searches being slow? Would it be faster than a std::map.find()? How well does its search algorithm scale?
The old VFS did have a bit of a problem with free space fragmentation. Writing a quick and dirty defragmentation utility for it is not a difficult task (at one point I wrote a utility for converting the cache into a tar archive; the output couldn't have fragmented free space). Remember, we already have a VFS; we don't have an integrated Berkeley DB. Strife Onizuka 22:20, 24 March 2007 (PDT)
It would be for on-disk storage, of course. Why invent a new format when a decent one already exists? The contents of it should be small enough to load fully into RAM and map.find() all you want, but at some point it needs to be written to disk. Plus, the VFS was quite awful, which is why it was removed. I don't see any need to reinvent the wheel when you could use an existing library much more easily. Also, assuming you would want to defragment the VFS, when exactly would you do that? On startup, on a 1GB file? Dale Glass 09:54, 25 March 2007 (PDT)
As I understand it, there is a saved copy of the memory-based cache index (generally of std::map type). However, when it is corrupted, it is not re-built, and a full purge is performed (cached file integrity is based on file size on disk == file size in index, I believe). Also, when the cache is full, files are removed in a sequential pass through the index until there is room for a new one (if memory serves me), without regard for frequency of use or time of last use (additions which would slow it down). Paula Innis 11:43, 25 March 2007 (PDT)
I've never seen the old VFS perform a purge. In the past I've seen it accumulate assets for months at a time.
You keep referring to not wanting to invent a new format, the VFS has already been written; the deed has been done; no inventing to be done, just maintenance. Creating a new format is something to always be avoided. The reason the VFS is slow is that map.find doesn't scale. It iterates sequentially through the list. As new items are added to the end of the list the VFS gets slower (because those new items need to be found to be accessed). The methods of searching and indexing the cache are important design features for the cache. Does Berkeley DB do a better job of indexing and searching?
Anyway Berkeley DB is a no-go, LL can't meet the requirements of the license at this time. Strife Onizuka 18:15, 25 March 2007 (PDT)
The maintenance can amount to rewriting it, though, and I consider it much easier to plug in something that's been around since 1979, and is thus incredibly well polished by now, than to maintain a new solution. BDB has very good performance, as it's basically a hash on disk.
But hrm, I wasn't aware that Oracle got their paws on it. The concept has been around in multiple forms though, so I guess I'll have to check whether any variation with acceptable licensing still exists. Dale Glass 18:49, 25 March 2007 (PDT)

Idea: Keep a texture map for every sim

Currently, dual core CPUs have a large amount of spare time on the second core while they're not actively fetching textures. I propose a way of using this spare time.

The viewer would create a database per visited sim with a list of textures found in each location. This database would allow the viewer to know what textures can be found in nearby areas, even before the grid has transmitted the data about them. Then, when idle, the viewer could use the spare time to start loading textures from the surrounding areas, so that they are instantly available when the avatar moves.

My suggested way of doing that is using a Berkeley DB database to store the data. A simple way of storing the data would be making the key the position inside the sim, rounded to for example multiples of 16 meters. The data stored for that key would be a list of textures used in that area.

For example, the position (120,90,56), when rounded down to multiples of 16 m, translates to (112, 80, 48). Under key "112,80,48" the viewer would store a list of textures used inside that 16x16x16 area.
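
The key calculation in the example above can be sketched like this, with an in-memory dict standing in for the per-sim database (the Berkeley DB part is not shown). The function names and the texture IDs are placeholders, not real assets.

```python
from collections import defaultdict

CELL_SIZE = 16  # meters, as in the example above


def cell_key(x, y, z, cell=CELL_SIZE):
    """Round a position down to the cell grid and format it as a key."""
    return ",".join(str(int(c // cell) * cell) for c in (x, y, z))


# One map per sim: cell key -> set of texture IDs seen in that cell.
texture_map = defaultdict(set)


def record_texture(position, texture_id):
    texture_map[cell_key(*position)].add(texture_id)


def textures_near(position):
    return texture_map.get(cell_key(*position), set())


record_texture((120, 90, 56), "texture-id-1")   # placeholder IDs
record_texture((125, 95, 60), "texture-id-2")   # falls in the same cell
print(cell_key(120, 90, 56), sorted(textures_near((113, 81, 49))))
```

Both recorded positions round to the same key, so a lookup anywhere inside that 16 m cell returns both textures.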

This should make several optimizations possible:

  • When teleporting to a new area, decoding of textures can be started before the grid has had time to transmit the required information. It should be possible to start loading textures immediately after the user starts the teleport, while the progress bar is still being shown.
  • When the avatar moves around, the viewer could start decoding textures likely to be present in locations about which there's no data available yet. This might make flying less painful.
  • If a storage method like the VFS, which stores multiple textures in a file is chosen, then having this data available could make it possible to optimize the placement of the texture data on disk, for optimal loading. Dale Glass 15:39, 24 March 2007 (PDT)

Stupid question: Why not just have the sim send the object definitions too? You can glean UUIDs for the images from the objects. Then you don't have to ask the sim for the objects too. Wait, but isn't that just the same as looking in all directions at the same time (but only rendering what is in front of the camera)? The great part about just asking for the object definitions is that there already is code in place for caching those definitions. No need to hack in an entirely new DB. Silly question: Is this an efficient way to spend sim CPU time: serving assets not in view and in directions other than the one the user is pointing in? Downloading textures while the client is in TP is a bad idea; what if the user doesn't have access to the sim (it takes some time for TPs to fail)? It would be a bad thing for, say, SAP to try to TP into Oracle's private sim (knowing full well that the TP would fail) in an attempt to glean insider information via texture pre-caching (linky). Strife Onizuka 19:34, 24 March 2007 (PDT)
Sorry, I don't think I succeeded at getting the idea across. It wouldn't involve requesting anything extra from the sim. The idea is that the viewer would compile a database of texture usage inside sims you visited, so when you started teleporting to a sim, it could look in its database and start preloading images from its cache that are likely to be found in the place you're teleporting to. Basically, instead of waiting for the sim to tell you what's there and then loading the images, it would load from the cache things that are likely to be there (as sims don't change all that often), and when the sim does finally give you the data, you'd already have a part of the images decoded and ready to show. If you visited a place in the past, I think there should be a very good chance of mostly correctly guessing the textures needed to render it.
The SAP problem wouldn't happen because the mechanism would work based on information you previously got, so you'd have to have been to the place before for it to work. Dale Glass 19:56, 24 March 2007 (PDT)
Ahhh, it's a passive solution. What you want could be achieved without a special DB; the client already collects this information in the sim-object cache. For TPing, it could be done quite easily. The viewer just needs to load the data from the object cache and render it. This would cause the appropriate functions to be called to decode and request missing assets. It doesn't need to render it to the screen, just to an extra buffer.
Doing a fake render is probably the easiest, most efficient and most accurate way of doing this. A major issue with building a separate DB is you have to build the dynamic octrees for it. By using the rendering pipeline with the object cache you don't have to write new code to do this. Strife Onizuka 22:51, 24 March 2007 (PDT)
Why go through all the trouble of actually rendering something (you can see the impact of extra rendering by enabling reflections) when textures are the only thing that is wanted? Plus, like I said, I don't want to request anything missing. I want to do a sort of predictive loading of textures from the cache. I took a look at the object cache, but assuming I found the right thing, it doesn't seem to store texture usage.
Here's basically the behavior I want: I visit, say, Lusk pretty often. The usage of textures in a sim can be easily mapped. The viewer knows I'm teleporting to Lusk (200,120,52). It can look in the sim's DB, and start loading the tree, ground, platform, etc. textures (again, from the on-disk cache) before it even manages to connect to the sim itself. Then by the time the area loads, most textures may be already loaded, so I will avoid seeing it all grey. There's no need to render anything (what for?), nor is there any need to request textures from the grid (as they might not be used anymore).
Also, I don't get where you are getting the octree thing from, as my idea doesn't involve any rendering at all, or even storing any geometry data. Dale Glass 10:06, 25 March 2007 (PDT)
pfft, I've been digging through the object cache code. I could have sworn they were keeping face information in there as well. Been a long time since I've played with the object cache (the last time I didn't have source to read and the format has changed a few times since then). *rereads your suggestions* On the one hand using a dedicated DB seems a bit overkill, but on the other I'm having problems finding a simple solution that would do the same thing. I don't see a point in me standing in the way of this anymore.
When would this information be put into the database? Strife Onizuka 17:00, 25 March 2007 (PDT)
The whole thing should be small enough to keep in memory, so it could be kept there. Then write the DB to disk when the avatar moves to another sim. Dale Glass 17:33, 25 March 2007 (PDT)
That makes sense. How do you gather the information, and what information do you put into the database? A simple list of UUIDs doesn't indicate which textures are more important than others. Say there are 100 textures in a 16 m cube: which one do you decode first? If you sort the list by usage count, when do you collect this information and how often do you update it? Strife Onizuka 20:27, 25 March 2007 (PDT)

ARC vs. 2Q

Gigs added to the main page that ARC is "probably better than 2Q in every way". However, the PostgreSQL folks replaced ARC with 2Q over patent concerns, and then, as a bonus, found that 2Q performed as well, and maybe better. See the comment from Neil Conway for more on the subject. -- Rob Linden 12:48, 25 March 2007 (PDT)

If it's patent encumbered, that's a good enough reason to not use it for me. Looks like IBM will grant a free license for open source for the patent, but it wouldn't work for LL because IBM wouldn't let the same license work for proprietary releases. (LL would have to buy a separate license). Gigs Taggart 14:42, 25 March 2007 (PDT)