Culling

From Second Life Wiki
Revision as of 05:38, 23 July 2008 by Tofu Linden (talk | contribs)

This information was provided to the SLDev mailing list by Dave Parks, Jake Simpson, and Douglas Soo of Linden Lab. This information should not be construed as official Linden Lab policy, and is rather documentation derived from informal discussions.

Related Bugs/Improvements

http://jira.secondlife.com/browse/SVC-1926


Overview

The client sends camera frustum information to the server, and the server sends the client information about objects that are "interesting" based on that information. Internally, we refer to this logic as the "interest list," and it's unfortunately implemented entirely server side. The gist of our streaming is "stream objects, ask for textures." Since the network load of a prim is much less than that of a texture, occlusion culling saves bandwidth by never requesting texture data for occluded objects. Unfortunately, since the interest list is based solely on size vs. distance for sending prims, occluders are often not sent first, so every texture is visible for at least one frame. It's usually OK, because the requested amount of a texture to download atrophies quickly when the texture is not being rendered, lowering its priority as soon as it is occluded. There may be optimization opportunities in the texture download prioritizing code.

Priority

Textures are prioritized based on how many pixels they cover on the screen, and the priority is "boosted" based on different flags defined in this enum (from llviewerimage.h):

  enum
  {
      BOOST_NONE              = 0,
      BOOST_AVATAR_BAKED      = 1,
      BOOST_AVATAR            = 2,
      BOOST_CLOUDS            = 3,
      BOOST_SCULPTED          = 4,
      BOOST_HIGH              = 10,
      BOOST_TERRAIN           = 11, // has to be high priority for minimap / low detail
      BOOST_SELECTED          = 12,
      BOOST_HUD               = 13,
      BOOST_AVATAR_BAKED_SELF = 14,
      BOOST_UI                = 15,
      BOOST_PREVIEW           = 16,
      BOOST_MAP               = 17,
      BOOST_MAP_LAYER         = 18,
      BOOST_AVATAR_SELF       = 19, // needed for baking avatar
      BOOST_MAX_LEVEL
  };

So, sculpt textures are prioritized before avatar skins, clouds, and all "normal" textures. If you alt-zoom or left click on something, its texture becomes high priority. Right clicking on it (selection) gives it a higher priority than the terrain. You get the idea.
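As a rough illustration of the ordering described above, here is a minimal sketch. It assumes that the boost level is compared first and that on-screen pixel coverage only breaks ties; the real viewer may well blend the two into a single number. TextureRequest, higherPriority, and sortByPriority are hypothetical names, not viewer code, and only a subset of the boost levels is repeated.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Subset of the boost levels from the enum above.
enum BoostLevel
{
    BOOST_NONE     = 0,
    BOOST_SCULPTED = 4,
    BOOST_HIGH     = 10,
    BOOST_TERRAIN  = 11,
    BOOST_SELECTED = 12
};

// Hypothetical record for one pending texture request.
struct TextureRequest
{
    std::string name;
    float       pixelArea; // on-screen coverage in pixels
    BoostLevel  boost;
};

// Assumption: boost is compared first, pixel coverage only breaks ties.
inline bool higherPriority(const TextureRequest& a, const TextureRequest& b)
{
    if (a.boost != b.boost)
    {
        return a.boost > b.boost;
    }
    return a.pixelArea > b.pixelArea;
}

// Returns the requests ordered from most to least urgent.
inline std::vector<TextureRequest> sortByPriority(std::vector<TextureRequest> requests)
{
    std::stable_sort(requests.begin(), requests.end(), higherPriority);
    return requests;
}
```

Under this assumption a tiny selected prim still beats a wall that fills the screen, which matches the "boost" behavior described above.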

Looking at the code, it looks like sculpt textures might have a delay of up to 5 seconds between the texture being available and it actually being applied to the prim. mSculptChanged is the flag used to determine when a sculpt gets rebuilt, and it only gets set in LLVOVolume::updateTextures, which only gets called every 5 seconds, on LOD switches, or on vertex buffer rebuilds, whichever comes first. Room for improvement there.

As far as front-to-back rendering to reduce overdraw goes, work has been done to this end in windlight, but it has more to do with changing the order of specific types of geometry than with reordering prims. In windlight, terrain is rendered after prims, sky is rendered last, etc. Occlusion culling was also rewritten in windlight to use fewer triangles and fewer "guess" queries, getting rid of the "guess" queue and making sure that if something is not visible, it's never rendered.

The client occlusion culling is ignorant of things like prim types since it happens at the OpenGL level working on meshes using GL_ARB_occlusion_query.

Performance

The biggest issue with occlusion culling performance is that very few builds have any significant occluders due to overuse of transparencies and walls that leak (have cracks between prims). I'd encourage any builders to take a good look at how professional level designers lay out environments in video games. In "World of Warcraft," for example, there are no windows, and you can never see the complex items inside a building from the outside. In "Call of Duty 4," there are plenty of windows, but you can never see into a window on one side of the building and out the window on the other side, effectively making every building an occluder. Compare this to Second Life, where every building has floor to ceiling windows and no inside walls. A simple trick to keep your windows without killing performance is to make them opaque from the outside. This is the perception people have of windows in real life, so doing this will often make a build look "cleaner" or less cluttered.

Some builds that play really well with occlusion culling are the cyberpunk city in Suffugium and the giant cube maze in Q. Let me know if you know of others.

The other major limiting factor for performance in SL is the small batch size. On average, SL draws about 70 triangles per drawing call. In theory, increasing that number to 200 triangles per drawing call could improve rendering performance by 200%. The primary reason the draw size is so small is that we have to break batches to switch textures and to sort transparent objects to be rendered in depth order. This is why people in the graphics world are making such a fuss over "megatexturing" and similar techniques. These techniques combine all visible textures into a single large texture so all your geometry can be rendered with a single draw call. These techniques require artist intervention from the get go, however, so most are not applicable to SL. The technique that is applicable is good ol' fashioned texture atlasing, which is merely combining several textures into a single texture and modifying object texture coordinates to reference the part of that larger texture that contains the texture the object was originally referencing. Since textures in SL are streaming, though, generating atlases for your builds will result in ugly artifacts as textures load, so the correct solution is probably something between megatexturing and atlasing, where atlases are generated on-the-fly by the viewer and tailored to align texture borders within the atlas along mip borders according to the amount of the texture that's been downloaded so far.
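The texture coordinate rewrite that atlasing relies on is small enough to sketch. This assumes a uniform grid atlas and coordinates that stay inside the 0 to 1 range (repeating textures need extra handling that is not shown). UV, AtlasRegion, gridRegion, and remapToAtlas are hypothetical names for illustration only, not viewer code.

```cpp
#include <cassert>

// A 2D texture coordinate in the usual [0, 1] range.
struct UV
{
    float u;
    float v;
};

// Hypothetical description of where one source texture sits inside the
// atlas, in normalized atlas coordinates.
struct AtlasRegion
{
    float offsetU, offsetV; // lower-left corner of the sub-rectangle
    float scaleU, scaleV;   // width and height of the sub-rectangle
};

// Region of cell (column, row) in an atlas laid out as a uniform grid.
inline AtlasRegion gridRegion(int column, int row, int columns, int rows)
{
    assert(columns > 0 && rows > 0);
    assert(column >= 0 && column < columns && row >= 0 && row < rows);
    const float su = 1.0f / columns;
    const float sv = 1.0f / rows;
    return { column * su, row * sv, su, sv };
}

// Rewrites a coordinate that referenced the original texture so that it
// references the same spot inside the atlas.
inline UV remapToAtlas(UV original, const AtlasRegion& region)
{
    assert(original.u >= 0.0f && original.u <= 1.0f);
    assert(original.v >= 0.0f && original.v <= 1.0f);
    return { region.offsetU + original.u * region.scaleU,
             region.offsetV + original.v * region.scaleV };
}
```

For example, the center of a texture placed in the second column of the bottom row of a 2x2 atlas ends up at (0.75, 0.25) in atlas space.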

We've (somewhat intentionally) done a very poor job at Linden Lab of educating builders on how to be performance conscious, which I think plays a large part in the overall sluggishness of the viewer. This is part of a philosophy that says it should be possible to build a content authoring system where artists don't need to care about performance implications, as the software should just deal with whatever comes its way appropriately. What we've seen, however, is that people (even technically minded people) consume as many resources as are available to them until performance reaches a level they deem unacceptable, so the more efficient your renderer becomes, the more ludicrous the demands become. In SL, this usually means the performance becomes that of the lowest common denominator in terms of what is acceptable. That is, if you think performance is important and build a nice house with opaque window exteriors and few textures, you'll still have a bad experience if your neighbor thinks a flexi-prim bamboo forest is the bee's knees.

The free market is winning out here (Otherland sims and similar managed estates do better than private estates with no building standards), but it's slow going, and an initiative to document the best ways to build beautifully and efficiently is probably in order.

Subscribe and Unsubscribe

The ObjectKill message is sent when your agent on the simulator has decided to unsubscribe from an object on the simulator. The primary reason it needs to be sent is that once your agent on the simulator isn't paying attention to an object, it won't send further updates for it to the viewer, which will cause lots of interesting artifacts in the case of objects that change when you aren't subscribed. In particular, moving objects can cause quite a headache (ghosting, etc.) - so the correct behavior is to cull the object from the viewer completely.

The decision to subscribe to or unsubscribe from objects is made on groupings of objects by type, size, and location in the region. Right now subscription and unsubscription are done based on a circular keyhole layout, with a bit of hysteresis to reduce subscription thrashing. If you turn down your bandwidth and teleport to a new region (or clear your caches), you should see this effect pretty well.
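The hysteresis part can be sketched on its own. The exact keyhole geometry and the real thresholds are not given here, so this assumes a plain 2D distance test around the avatar with two hypothetical radii: an object is subscribed once it comes closer than the inner one and is only dropped once it goes beyond the outer one. All names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

struct Vec2
{
    float x;
    float y;
};

inline float planarDistance(Vec2 a, Vec2 b)
{
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Hypothetical thresholds; the gap between them is the hysteresis band.
struct InterestRange
{
    float subscribeRadius;   // subscribe once the object is closer than this
    float unsubscribeRadius; // unsubscribe only once it is farther than this
};

// Returns the new subscription state for one object.
inline bool updateSubscription(bool subscribed, Vec2 avatar, Vec2 object,
                               const InterestRange& range)
{
    assert(range.unsubscribeRadius >= range.subscribeRadius);
    const float d = planarDistance(avatar, object);
    if (!subscribed && d < range.subscribeRadius)
    {
        return true;
    }
    if (subscribed && d > range.unsubscribeRadius)
    {
        return false;
    }
    return subscribed; // inside the band: keep whatever state we had
}
```

An object hovering between the two radii keeps its current state, so small movements near the edge don't cause a stream of subscribe and unsubscribe messages.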

Moving the camera shouldn't generally cause objects to be unsubscribed - camera position and direction are used for determining which objects to subscribe to first, but for the purposes of unsubscription, I believe that only your avatar position is used.

As you move around in a simulator, the sim is constantly sending ObjectUpdate* packets (to subscribe) for objects that come within range or have moved, and it sends ObjectKill packets (to unsubscribe) for things that go outside of your vision. Then when you move back to that area it sends the ObjectUpdate* packets for those objects again. You may wonder why culling is done across the network by sending ObjectKill packets for things that are not actually leaving the simulator.

Suppose the sim didn't do that. In that case, the viewer would never know when the sim decided the object is too far away. So you'd need to have the viewer perform its own culling, or you'd have extra objects hanging around.

If the culling on the viewer and sim didn't match, you'd end up with updates being sent for objects the viewer won't show, or the viewer showing objects for which the sim isn't sending updates.

And since the sim determines which objects are near enough to notify you about their existence, it's only logical that it'd do the kill notification as well. That is less error prone and avoids duplicating code. The sim needs to know which objects are far enough away anyway, as it has to decide when to stop sending updates.

In terms of why objects are subscribed and unsubscribed in the first place, there's also the factor of network bandwidth to consider alongside the issue of CPU usage on the Sim. If you don't remove from the stream those objects that are not considered within the client's purview (and not just frustum purview as most FPS games have it - you need to be able to consider objects behind you to ensure you get sounds, particles that overlap your view and so on) then it doesn't take long before your stream is *full* of objects you can't see and don't care about, and yet are getting updates for that you don't need.

Moving some of this to the client is a definite possibility - however there are implications to that. It enables people to ask the sim for information about objects they can't see in an attempt to cheat at games, and generally see stuff they don't have the permissions to see, plus timing suddenly becomes that much more important.

Realistically the Sim has to be the final arbiter of what it sends for several reasons - network bandwidth (you may have extra bandwidth to receive more data, but the sim doesn't necessarily have the bandwidth to be sending it out, given it's sending out these streams of updates to gobs of clients, not just yours), clients seeing what they need to see and no more, and the sim owning what you can see.

Possible future improvements

Note - this isn't necessarily how SL works right now, just an ideal of what it might be nice to get to.

The subscribe / unsubscribe code in SL needs some attention. What's there works, but not in the most optimal fashion.

Right now unsubscribe / subscribe is UDP and will probably need to remain that way since UDP is asynchronous and TCP is not - if object updates are coming via UDP (and they need to, trust me on this. Go dig out the TCP version of Doom and see what that does :) ) then the "Client, you need to know about this object" needs to be UDP to be sure it gets there first - at least attempting to, since UDP isn't guaranteed delivery.

Generally there are approaches that will work for this kind of thing - if the client receives an update for an object it doesn't know about yet, it can just ignore that update, unless it's enough to actually construct a complete object on the client side, in which case it should do that. Generally though you don't get enough in an update packet since it's mostly deltas from a known state, not a complete state for an object, so construction is less likely an option.
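A minimal sketch of that client-side rule, under the assumption that a packet carries a flag saying whether it holds a complete state. KnownObject, UpdatePacket, and ViewerObjectTable are hypothetical names, and a real delta would only carry the changed fields rather than a whole state.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical minimal object state kept by the viewer.
struct KnownObject
{
    float x, y, z;
};

// Hypothetical update packet. For a delta, a real packet would carry only
// the changed fields; the sketch just overwrites the stored state.
struct UpdatePacket
{
    std::uint32_t objectId;
    bool          isCompleteState; // enough to construct the object from scratch
    KnownObject   state;
};

enum class UpdateResult { Created, Applied, Ignored };

class ViewerObjectTable
{
public:
    UpdateResult handleUpdate(const UpdatePacket& packet)
    {
        auto it = mObjects.find(packet.objectId);
        if (it == mObjects.end())
        {
            // Unknown object: only a complete state lets us build it.
            if (!packet.isCompleteState)
            {
                return UpdateResult::Ignored;
            }
            mObjects.emplace(packet.objectId, packet.state);
            return UpdateResult::Created;
        }
        it->second = packet.state;
        return UpdateResult::Applied;
    }

    // Unsubscribe notice from the sim: drop the object entirely.
    void handleKill(std::uint32_t objectId) { mObjects.erase(objectId); }

    bool knows(std::uint32_t objectId) const
    {
        return mObjects.count(objectId) != 0;
    }

private:
    std::unordered_map<std::uint32_t, KnownObject> mObjects;
};
```

A delta for an unknown object is dropped, a complete state creates the object, and later deltas are applied on top of it.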

There are mechanisms we can put into place whereby the sim sends a 'subscribe' notice to the client, indicating a new object of interest has entered the list of objects for this particular client, and can do that via UDP. And it will continue to do that, sending down a complete state until such time as the client acks this and says "Yes, I know there's a new object here, thanks" which it can do simply by acking the packet it received with that new object's information in it. Then the sim 'knows' the client has that object and it can start sending just updates instead. You do the same thing for unsubscribing - the sim continues to send updates for a given object, as well as a "you are done with this object" message, until the client acknowledges it got it, at which point the object is just removed on the sim side and not considered again until it comes back into view.
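The handshake described above can be written down as a small per-object, per-client state machine on the sim side. This is only a sketch of the idea, not how SL works today; the state names, Outgoing, and ObjectSubscription are all hypothetical, and packet loss is modeled simply by the ack never arriving.

```cpp
#include <cassert>

// Where one object stands for one client, as seen by the sim.
enum class SubState
{
    NotSubscribed,
    AwaitingSubscribeAck,
    Subscribed,
    AwaitingUnsubscribeAck
};

// What the sim should put on the wire for this object next.
enum class Outgoing { Nothing, FullState, Delta, DeltaAndDone };

struct ObjectSubscription
{
    SubState state = SubState::NotSubscribed;

    // The object entered this client's interest list.
    void enterInterest()
    {
        if (state == SubState::NotSubscribed)
        {
            state = SubState::AwaitingSubscribeAck;
        }
    }

    // The object left this client's interest list.
    void leaveInterest()
    {
        if (state != SubState::NotSubscribed)
        {
            state = SubState::AwaitingUnsubscribeAck;
        }
    }

    // Keep resending the same kind of packet until the client acks it.
    Outgoing nextPacket() const
    {
        switch (state)
        {
        case SubState::AwaitingSubscribeAck:   return Outgoing::FullState;
        case SubState::Subscribed:             return Outgoing::Delta;
        case SubState::AwaitingUnsubscribeAck: return Outgoing::DeltaAndDone;
        case SubState::NotSubscribed:          break;
        }
        return Outgoing::Nothing;
    }

    // The client acknowledged a packet of the given kind.
    void onClientAck(Outgoing acked)
    {
        if (state == SubState::AwaitingSubscribeAck && acked == Outgoing::FullState)
        {
            state = SubState::Subscribed;
        }
        else if (state == SubState::AwaitingUnsubscribeAck && acked == Outgoing::DeltaAndDone)
        {
            state = SubState::NotSubscribed;
        }
    }
};
```

Because each state just keeps producing the same kind of packet until the matching ack shows up, a dropped UDP packet only delays the transition instead of leaving the client and sim disagreeing about which objects exist.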

There are some other things we can do as well in terms of the object update packets - Object Update packets should be mostly just deltas from a known state that the sim and client agree on - e.g. the initial state that the sim sent down to the client, which the client acked. The sim now knows what the client has and will only send down information that has deviated from that state. Every now and then, when you get a smaller packet constructed for sending down, you can insert a new 'complete state' for an object to pad it out, which once acked brings the client and sim states up to one that hopefully should have less deviation between them because it's 'newer'.
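To make "deltas from a known state" concrete, here is a minimal sketch that compares the current state against the last state the client acked and flags only the fields that differ. The field set, the bit values, and the names ObjectState, Delta, makeDelta, and applyDelta are assumptions for illustration and are not taken from the SL protocol.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical object state shared by sim and client.
struct ObjectState
{
    float         x, y, z;
    float         scale;
    std::uint32_t textureId;
};

// One bit per field group that may appear in a delta.
const unsigned FIELD_POSITION = 1u << 0;
const unsigned FIELD_SCALE    = 1u << 1;
const unsigned FIELD_TEXTURE  = 1u << 2;

// On the wire only the flagged fields would be serialized; the sketch
// carries the whole struct and uses the mask to say which parts count.
struct Delta
{
    unsigned    changed;
    ObjectState values;
};

// Sim side: compare against the state the client last acked.
inline Delta makeDelta(const ObjectState& baseline, const ObjectState& current)
{
    Delta delta{0u, current};
    if (current.x != baseline.x || current.y != baseline.y || current.z != baseline.z)
    {
        delta.changed |= FIELD_POSITION;
    }
    if (current.scale != baseline.scale)
    {
        delta.changed |= FIELD_SCALE;
    }
    if (current.textureId != baseline.textureId)
    {
        delta.changed |= FIELD_TEXTURE;
    }
    return delta;
}

// Client side: apply only the flagged fields on top of the acked baseline.
inline ObjectState applyDelta(ObjectState baseline, const Delta& delta)
{
    if (delta.changed & FIELD_POSITION)
    {
        baseline.x = delta.values.x;
        baseline.y = delta.values.y;
        baseline.z = delta.values.z;
    }
    if (delta.changed & FIELD_SCALE)
    {
        baseline.scale = delta.values.scale;
    }
    if (delta.changed & FIELD_TEXTURE)
    {
        baseline.textureId = delta.values.textureId;
    }
    return baseline;
}
```

Sending a fresh complete state now and then, as suggested above, amounts to replacing the baseline on both sides once the client acks it, which keeps later deltas small.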

Further to this there are other things to be tried, like not sending positional updates at all, but only sending velocity changes. If you send down a known state the client and sim have agreed on, then you don't need to send positional updates any more because the client and sim _should_ be in sync - all you need to do is send down velocity changes and the client and sim should remain in sync. Of course if packets are dropped, even though the velocity change is still being sent (and will be until the client acks it and the server says "Right, we are in sync right now"), this breaks some stuff: the client got the velocity change later than the sim started it, which means the dead reckoning position of the object is now wrong. That's alleviated by the periodic complete states being sent down. You can also send times on these packets and have the client re-interpret what the position should be given that the velocity change occurred at time X (although this starts to involve some somewhat hairy code on the client side to make things match up). Simpler, indeed, to just include a position when sending down a velocity change. You'd be surprised how effective that actually is - if you get to the point where so many packets are dropped that this becomes very apparent, then you are pretty much teetering on getting dumped anyway.
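The simpler variant (a velocity change that also carries a position) can be sketched as plain dead reckoning on the client. This assumes the client restarts its extrapolation from the packet's position at the moment the packet arrives, using its own clock; Vec3, MotionUpdate, and TrackedObject are hypothetical names, not viewer code.

```cpp
#include <cassert>

struct Vec3
{
    float x, y, z;
};

// Hypothetical velocity-change packet that also carries a position.
struct MotionUpdate
{
    Vec3 position; // where the sim says the object is
    Vec3 velocity; // new velocity in units per second
};

// Client-side dead reckoning for one object.
struct TrackedObject
{
    Vec3   basePosition; // position at baseTime
    Vec3   velocity;
    double baseTime;     // client clock, in seconds

    // Restart extrapolation from the position the sim just told us about.
    void onMotionUpdate(const MotionUpdate& update, double receivedAt)
    {
        basePosition = update.position;
        velocity     = update.velocity;
        baseTime     = receivedAt;
    }

    // Extrapolate along the last known velocity.
    Vec3 positionAt(double now) const
    {
        assert(now >= baseTime);
        const float dt = static_cast<float>(now - baseTime);
        return { basePosition.x + velocity.x * dt,
                 basePosition.y + velocity.y * dt,
                 basePosition.z + velocity.z * dt };
    }
};
```

Because every velocity change resets the base position, an error from a late or dropped packet lasts only until the next update arrives rather than accumulating.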

Most of the time good enough is surprisingly low in terms of real time correction and so on, because nothing particularly heavy is done client side that relies on this - it's all just visualization, rather than collision requirements and so on. The moment you start doing physics on a client, all this suddenly becomes far more important. But for now, it's less so.

Most of this is a known and solved problem; it's just time and resources that prevent us from looking at this immediately - that and the fact that while we see some artifacting now and again, it's not really something that destroys residents' experience.