User:Dzonatas Sol/AWG Virtualization

From Second Life Wiki
< User:Dzonatas Sol
Revision as of 07:46, 23 March 2009 by Dale Innis (talk | contribs) (→‎Benefits: Add "Concerns" section, pointing out that it probably can't be done)
Jump to navigation Jump to search

Note: This page was a part of Brainstorming#Virtualization. It was an attempt to organize sections and areas of interest.

Motivation

The decision to map virtual regions to physical servers statically in the first version of Second Life has hardwired a number of implementation choices that impact on performance, resourcing and scalability. Roughly, and very simplified:
  • Passive land acreage (and cost) became limited by (and linked to) local disk storage.
  • Available prim counts became limited by local CPU power, and hence are non-scalable.
  • Concurrent scripting load became limited by local CPU power, and hence is non-scalable.
  • Server-side scalability for events became limited by local CPU power and bandwidth, and hence is virtually blocked.
  • Hardware utilization and overall scalability plummetted, as unused local resources cannot assist elsewhere.
  • The need for region handover at sim boundaries created numerous complexities and opportunities for problems to arise.
  • Asset serving and many other services could not scale automatically with population growth nor with grid expansion.
  • Local resourcing means that improvements made to a large grid cannot benefit its residents when visiting an attached grid.
The new focus on decentralization and scalability offers an opportunity to overcome all of these issues, and provides the motivation to do so. However, to achieve this does require relinquishing attachment to the conceptually simple but inherently flawed approach that is static mapping of virtual resource usage to physical resource provision.

Evolution

Fortunately, creating a dynamic infrastructure is not inconsistent with the current implementation. The following observations highlight this:
  • A dynamic server farm still requires some static physical topology, and a 2D grid is not horribly suboptimal at current levels of scaling.
  • Additional networking components are easy to attach to an exiting network if reduced latency is desired for dynamic services.
  • Existing grid servers running local sim services could trivially offer their unused resources for non-local use.
  • Very busy sims would degrade an additional dynamic infrastructure gracefully, simply by not contributing to it.
This suggests that an evolutionary transition to a dynamic infrastructure is possible, alongside normal operation of the static grid. It is hard to predict how issues like economics might faire alongside infrastructure that is evolving towards less-constrained dynamic resourcing, but then nothing at all can be predicted anyway about the future in a system that is being scaled to the scary numbers offered in Project_Motivation.

Design principles

In some ways, a dynamic resourcing architecture is simpler than a static one, because resource boundary issues do not need to be considered as in the static design. More importantly however, a bigger simplification occurs a little down the road, because static systems simply do not scale along all dimensions and therefore evolution rapidly hits a brick wall. This means that the simpler static system does not really address the right problem at all, once all-axis scalability is added to the requirements. In contrast, dynamic systems are inherently expandable, even along unforseen dimensions because new tasks are handled as easily as old ones.
The underlying principles for a dynamic design are relatively simple, and not hard to implement: (*)
  1. Make all work requests stateless (a region for example becomes merely a request parameter).
  2. Throw all work requests into a virtual funnel (aka. multi-priority task buffer and serializer).
  3. Place all servers in a worker pool, ready to take tasks out of the funnel as they appear.
  4. Hold all persistent world state in a distributed object store which is cached on all servers.
  5. Workers run a task, update world state if required, and send events which may create more work.
Although it is a departure from the static design, there is nothing really radical in such a scheme, and many standard software components can be used to implement it, including the very efficient and easily managed web-type services.
(*) Note that these are only principles. In an actual system design, this kind of scheme is repeated at many levels, in breadth and depth.

Benefits

A number of important improvements are gained immediately from such a dynamic infrastructure:
  • Regions scale arbitrarily for events, up to the limit of resource exhaustion of the entire grid.
  • The limits to desired scalability in any dimension are determined by policy, not by local limitations.
  • Server death does not bring down a region, and the overall grid suffers only graceful degradation.
  • Adding/upgrading grid server resources benefits the entire grid rathering than favouring one region.
  • Regions have no physical edges, so complex functionality is avoided and no sim handover problems can occur.
  • Regions can have any size or shape of land whatsoever, and land cost (not price) drops to that of disk storage.
  • Available prim counts are no longer tied to land acreage, allowing both highly dense and highly sparse regions.
  • Events can be held in previously unused areas of the world (eg. regattas in mid-ocean) without prior resourcing.
  • Small remote worlds and subgrids can be visited by large-grid residents, because local caching still operates.

Concerns

Dynamic and distributed systems, especially at a large scale, are harder to implement than static centralized ones, and in general pose significant performance and reliability challenges. The one proposed here is no exception.
  • The feasibility of this solution will be gated by the performance and reliability of the distributed object store. Massively distributed object stores are hard.
  • There will be a tradeoff between reliability (with distributed locking and transactional semantics) and performance (because those things are expensive, particularly in terms of network latency). It's not immediately clear that any feasible point on the tradeoff curve will result in acceptable performance at the user-perceived level.
  • If the state data for an entity (object or AV) is no longer stored very close to the state data for nearby entities, the amount of network and disk traffic and computation required to find all of the potentially interested entities for a given state change increases dramatically; caching is only somewhat helpful here, since caching is heuristic (enforce non-heuristic caching is effectively a return to region servers, under another name).
  • To first order, the distributed object store will need to be a distributed pub/sub system with at least thousands of subscribers, at least millions of publishers, very frequent subscription changes, and thousands or millions of publications per second, with significant (although not terribly stringent) consistency properties (and with quite strong eventual-consistentcy properties), and with very demanding latency properties. The difficulty of building such a system cannot be overstressed.
  • This is not trivial stuff.