AWG Scalability through reverse proxies: the paravirtual grid

From Second Life Wiki

Overview

The cached proxy layer has not received enough emphasis. That impression grows out of suggestions in group chat that the reverse proxy is not seen as an essential part of the architecture. That may be true if a proxy is viewed only as something that forwards requests, but that describes a forward proxy, and there is a big difference between forward and reverse proxies. A reverse proxy can itself be an end-point that creates an authoritative response instead of just forwarding the request. Strategically placed reverse proxies create potential end-points ahead of potential traffic bottlenecks. How the reverse proxy is implemented is not an architectural concern; its strategically defined placement, and the potential separation from the grid that the end-point creates, are key to the architecture. These end-points act like mini-grids where, at worst, the main grid must forward state (by a poll) to the reverse proxy to keep the cache updated, which allows the reverse proxy to keep giving authoritative responses to the viewer. With that in mind, reverse proxies could be added now, except that the authentication (WVA) layer is still needed. This strategic placement also supports the view that the other REST/physics-related use cases (and the like) are implementation rather than architecture, while REST/cHTTP becomes more architectural.
When fully effective, the reverse proxy-cache can localize events, which creates the illusion of a mini-grid, or a virtualization of the main grid. Done correctly, this becomes closer to paravirtualization. For example, one reverse proxy may serve an entire corporate organization for local events. If the virtualized sim (virtualized by the reverse proxy) is used only locally by that corporation, with no non-corporate viewers outside, the sim can be handed off to the reverse proxy, and in that state the proxy becomes a paravirtual sim. When a sim becomes paravirtual, the main grid no longer needs to manage it until the reverse proxy hands it back to the main grid. The corporate organization here is only an example of a location for the reverse proxy.
The reverse proxy-cache may host many sims for several corporations or for a real geographical location. In essence, it then becomes a paravirtual grid. Here the reverse proxy needs proper authentication to ensure proper digital rights management of the content it sees.
Since more than one paravirtual grid may exist, letting one paravirtual grid communicate directly with another keeps that traffic away from the main grid. Two paravirtual grids that communicate directly with each other can mean essentially zero cost at the main grid for that communication.
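The end-point behaviour described above can be illustrated with a minimal sketch. It assumes a key/value view of grid state; the class name `CachingReverseProxy`, the `fetch_from_grid` callback, and the `grid_requests` counter are all hypothetical names introduced for illustration and are not part of any existing grid or proxy API.

```python
from typing import Callable, Dict


class CachingReverseProxy:
    """Answers viewer requests from a local cache and contacts the grid only on a miss."""

    def __init__(self, fetch_from_grid: Callable[[str], str]) -> None:
        # fetch_from_grid is a hypothetical stand-in for a request to the main grid.
        self._fetch_from_grid = fetch_from_grid
        self._cache: Dict[str, str] = {}
        self.grid_requests = 0  # how many times the main grid was contacted

    def _fetch(self, key: str) -> str:
        self.grid_requests += 1
        return self._fetch_from_grid(key)

    def get(self, key: str) -> str:
        """Respond from the cache when possible, so the proxy acts as the end-point."""
        if key not in self._cache:
            self._cache[key] = self._fetch(key)
        return self._cache[key]

    def poll(self) -> None:
        """Worst case from the text: refresh every cached entry from the main grid."""
        for key in list(self._cache):
            self._cache[key] = self._fetch(key)
```

In this sketch, two consecutive `get` calls for the same key reach the grid once; a later `poll` is what brings a changed value into the cache. A real deployment would also need the authentication layer mentioned above, which is left out here.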
The paravirtual grid is one measure for scalability.
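The hand-off of a sim between the main grid and a reverse proxy can be sketched as a small ownership rule. This is only an illustration under stated assumptions: the names `Sim`, `hand_off`, `hand_back`, and `join` are hypothetical, and the rule that an outside viewer triggers a hand-back is an assumption drawn from the corporate example, not a defined protocol.

```python
from dataclasses import dataclass, field
from typing import Dict

MAIN_GRID = "main-grid"  # hypothetical identifier for the main grid


@dataclass
class Sim:
    name: str
    manager: str = MAIN_GRID  # who currently manages the sim
    # viewer id -> the proxy the viewer connects through (or MAIN_GRID if direct)
    viewers: Dict[str, str] = field(default_factory=dict)


def can_hand_off(sim: Sim, proxy: str) -> bool:
    """True when every connected viewer reaches the sim through the same proxy."""
    return all(via == proxy for via in sim.viewers.values())


def hand_off(sim: Sim, proxy: str) -> bool:
    """Let the proxy manage the sim (it becomes paravirtual) if the condition holds."""
    if sim.manager == MAIN_GRID and can_hand_off(sim, proxy):
        sim.manager = proxy
        return True
    return False


def hand_back(sim: Sim) -> None:
    """Return management of the sim to the main grid."""
    sim.manager = MAIN_GRID


def join(sim: Sim, viewer: str, via: str) -> None:
    """Register a viewer; an outside viewer ends the paravirtual state (assumption)."""
    sim.viewers[viewer] = via
    if sim.manager != MAIN_GRID and via != sim.manager:
        hand_back(sim)
```

While `sim.manager` names a proxy, the main grid has nothing to do for that sim, which is the scalability gain the overview describes.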

Paravirtual vs. Virtualization

Paravirtualization, as the term is used here, takes virtualization a step further in that an entire virtual machine is created on top of all the virtualized components. In the sense of virtualized hardware, a pure paravirtualization exists only as a software interface layer. The real hardware layer becomes completely transparent to the virtual machine, including any physical attribute associated with the use of that hardware.
This high level of transparency allows the virtual machine to exist in any paravirtual domain that shares the common APIs, so the state of any virtual machine can be moved or cloned to another paravirtual domain.
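Moving or cloning state between domains that share a common API can be sketched as follows. The `ParavirtualDomain` class and its `export_state` / `import_state` methods are hypothetical names standing in for whatever common API the domains would agree on; machine state is modelled as a plain dictionary for illustration only.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ParavirtualDomain:
    """A domain exposing a common (hypothetical) API for exporting and importing state."""

    name: str
    machines: Dict[str, dict] = field(default_factory=dict)

    def export_state(self, vm_id: str) -> dict:
        # Deep copy so the exported snapshot is independent of the running machine.
        return copy.deepcopy(self.machines[vm_id])

    def import_state(self, vm_id: str, state: dict) -> None:
        self.machines[vm_id] = copy.deepcopy(state)


def clone(vm_id: str, src: ParavirtualDomain, dst: ParavirtualDomain) -> None:
    """Copy a machine's state to another domain; the source keeps running it."""
    dst.import_state(vm_id, src.export_state(vm_id))


def move(vm_id: str, src: ParavirtualDomain, dst: ParavirtualDomain) -> None:
    """Clone the state to the destination, then drop it from the source."""
    clone(vm_id, src, dst)
    del src.machines[vm_id]
```

Because neither function refers to anything about the underlying hardware, the same calls work between any two domains that implement the shared interface.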

Immediate Integration

The reverse proxy works in a stateless mode by default; however, modern reverse proxies also have a stateful mode because of the cache. The Agent Domain is an immediate target for integration because that domain is stateless. The cache provides storage for (quasi-)static asset information, which potentially reduces load on the main grid. Web-like interfaces and some Region Domain services also integrate to a degree because of the cache. That degree is comparable to a paravirtualized simulation, except mainly for physics-related events. This design can be implemented now, except that the authentication layer (OpenID) must be completed first. The reverse proxy is significant here as the step toward a future design.
AWGReverseProxy.jpg
The reverse proxies sit outside the main grid and closer to the viewers. Quasi-static data can be served entirely outside the main grid, which leaves the grid to handle mainly non-static data, such as frame states. The purple-ish lines in the drawing represent a step toward decentralizing the current central database, which still exists. The Agent Domains and Region Domains, as in the Proposed Architecture, are blurred virtualizations of the grid because of the nature of the reverse proxy-cache.

Proxy URLs

The viewer needs to determine the appropriate proxy URL. This can easily be done with the current URL scheme for the grid with one extra subdomain level. For example:
A) sim4802.agni.lindenlab.com
B) sim4802.us.agni.lindenlab.com
C) sim4802.eu.agni.lindenlab.com
URL "A" points directly to the simulator, as it does now. URL "B" points to a proxy in the United States. URL "C" points to a proxy in Europe. The simulator or the proxy may redirect the viewer to the best URL. This URL scheme gives a simple format for reshaping or redirecting network traffic on demand. All three URLs above (A, B, and C) access the same simulation (sim4802).
One use case to compare against this proxy URL scheme for redundancy: Connectivity Issues from Europe
To help in such a case, proxies can detect network traffic issues and redirect as needed. The extra subdomain is not limited to these names. For example, there could be us0, us1, us2, eu0, eu1, eu2, proxy0, proxy1, proxy2, etc.
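The URL scheme above can be sketched as a pair of small helpers. The function names `proxy_host` and `best_host`, and the idea of choosing by measured latency, are assumptions made for illustration; the text only says that the simulator or proxy may redirect the viewer to the best URL, not how "best" is decided.

```python
from typing import Mapping


def proxy_host(direct_host: str, label: str) -> str:
    """Insert one extra subdomain level after the simulator name.

    An empty label means "no proxy" and returns the direct host unchanged.
    """
    if not label:
        return direct_host
    sim, rest = direct_host.split(".", 1)
    return f"{sim}.{label}.{rest}"


def best_host(direct_host: str, latencies_ms: Mapping[str, float]) -> str:
    """Pick the route with the lowest measured latency (hypothetical policy).

    Keys are proxy labels such as "us" or "eu"; the key "" stands for the direct route.
    """
    best_label = min(latencies_ms, key=lambda label: latencies_ms[label])
    return proxy_host(direct_host, best_label)
```

With the example host, `proxy_host("sim4802.agni.lindenlab.com", "eu")` yields `sim4802.eu.agni.lindenlab.com`, which matches URL "C". The same helper covers labels such as `us0` or `proxy2` without change.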

Future Integration

For further integration, the conceptual design is based on well-known paravirtualized machines. In this design, we paravirtualize the grid and not just part of the simulation. The immediate implementation lets us step gradually toward this future design. We create the common paravirtual API structure and the virtual machine that computes the simulation. We could potentially use an already well-known paravirtual machine or create one specifically to run the simulation. The advantage of a well-known paravirtual machine is general stability. The hardware layer is completely abstract and no longer a part of the architectural grid design. The state of any simulation is transportable across the paravirtual grid. The reverse proxy, at this point, has evolved into a virtual machine.

The First Decentralized Agent Domain

AWGDecentralizedAgentDomain.jpg
After the reverse proxies are in place, the Agent Domain is separable from the central database. The proxies allow these domains to be placed and activated incrementally. As agent data is moved from the central database to the Agent Domain, the proxies redirect requests to where the data can now be found.
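The incremental move can be sketched as a lookup table kept by the proxy. The class `AgentLocator`, its methods, and the `CENTRAL_DB` identifier are hypothetical names introduced for this illustration; how a real proxy would store or learn these locations is not specified here.

```python
from typing import Dict, Iterable

CENTRAL_DB = "central-database"  # hypothetical identifier for the central database


class AgentLocator:
    """Tracks which agents have been moved, so requests can be redirected."""

    def __init__(self) -> None:
        self._moved: Dict[str, str] = {}  # agent id -> Agent Domain now holding its data

    def migrate(self, agent_id: str, agent_domain: str) -> None:
        """Record that an agent's data now lives in the given Agent Domain."""
        self._moved[agent_id] = agent_domain

    def locate(self, agent_id: str) -> str:
        """Agents not yet migrated are still served by the central database."""
        return self._moved.get(agent_id, CENTRAL_DB)

    def all_migrated(self, agent_ids: Iterable[str]) -> bool:
        """True once every listed agent has left the central database."""
        return all(agent_id in self._moved for agent_id in agent_ids)
```

Because unmigrated agents fall through to the central database, agents can be moved one at a time without viewers needing to know which store currently holds their data.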

The Region Domain Metamorphosis

AWGDecentralizedRegionDomain.jpg
Once all agent data has been moved out of the central database, the basic Region Domain forms from the decomposition. The proxies are now ready to evolve to create the paravirtual grid.