Difference between revisions of "Open Grid Protocol"

From Second Life Wiki


* [[SLGOGP_Draft_1]]
* [[OGP_Draft_2]]

Revision as of 18:04, 30 July 2008

Note: This page references the current draft of the Open Grid Protocol.


Second Life Grid Open Grid Protocol

Draft 1
March 2008
Notice: This draft is for public comment.
Mark Lentczner (Zero Linden) Linden Lab zero.linden@secondlife.com
Copyright 2008, Linden Lab. All rights reserved.
This work is licensed under the Creative Commons Attribution-ShareAlike 2.5 license. See http://creativecommons.org/licenses/by-sa/2.5/ for details.
All contributions to this document must be contributed under the Second Life Project Contribution Agreement. See http://wiki.secondlife.com/wiki/Project:Contribution_Agreement for details.
Abstract
The Second Life Open Grid Protocol documents define the protocols by which a vast, Internet-wide virtual world can operate. This protocol enables different regions of the virtual world to be operated independently, yet interoperate to form a cohesive experience.
Status
As of February 2008, this protocol is a work in progress. While the major structural elements have been designed, the specific elements for many needed resources have not. This work is being undertaken by Linden Lab and its community-based Architecture Working Group.
This protocol is being defined in light of the feature set of Second Life. For now, where this document lacks details, familiarity with Second Life is assumed. It is an express aim of this work to enable the live, gradual migration of Second Life to this new protocol as development proceeds.


Structure


Domains

This protocol is about a three way interaction between viewer, agent and region in order to facilitate a shared experience between people.

The viewer is the element that senses and acts on the state of the virtual world. The viewer does so from the vantage point of an agent. An agent is a persistent identity and persona that interacts in a virtual world. The agent persists and can be interacted with even when the user controlling it (through a viewer) is off-line. Regions are persistent locations in the virtual world. Multiple agents may be present in a region at the same time, and when they are, they have a shared experience.

Groups of regions and agents are managed by domains. A region domain is responsible for a collection of regions. An agent domain manages agent accounts.

This protocol makes few assumptions about how a domain manages its collection of elements. In particular, it does not assume that a region will be entirely managed on a single host, nor that an agent will or won’t be managed by a single process.

It is useful to think of the “stance” that each element takes in the three-way protocol:

The viewer is the direct proxy for a human that wants to control an agent. This control can be direct as in the case of an interactive 3D viewer, or indirect as in the case of a web site that the user directs to display their agent’s status.

The agent domain is responsible for the agent itself. The persistent state of the agent is held within the agent domain, and requests to interact with the agent, even by the viewer, are mediated by the agent domain.

The region domain runs the live simulations of regions in the virtual world. The region domain manages the persistent state of these regions.



Basic Flow

The basic flow of the protocol is:

  1. The viewer authenticates to an agent domain for the authorized control of a particular agent.
  2. The viewer directs the agent domain to place the agent in a region.
  3. The agent domain contacts the region domain for the region, and negotiates placement of the agent.
  4. The region grants access to the agent domain, which in turn passes some of that granted access on to the viewer.

At this stage, each entity will have access to many resources in the other entities. For example:

  • The viewer has access to region resources that let it move the avatar.
  • The region has access to viewer resources that update the state of objects in the region.
  • The viewer and agent have access to resources in each other to facilitate text messaging.



Structure of the Protocol

The protocol is fundamentally composed of individual resources that can be invoked by one entity in the system upon another. Each resource is a member of a resource class that describes the syntax and semantics of invoking the resource. The bulk of this document, when complete, will describe the several hundred resource classes that make up the virtual world.

The resource classes are composed into suites that form logical groupings, though suites do not otherwise play a part in the protocol.

In order to facilitate migration from the currently running version of Second Life, legacy resources return information that allows entities to continue to communicate using the existing protocols and structures. These protocols and structures are not described by this document. It is the intention that when this work is complete, none of these legacy resources will be in use.

Agent and region domains have a few resources that are available at well known URLs. All other resources in the agent and region domains are accessed via capabilities obtained from those few initial resources.

Since viewers are typically behind firewalls that do not allow incoming connections, resources in the viewer are accessed by event queues held in the agent and region for the viewer. The viewer uses the “long poll” technique to efficiently proxy these inward resource invocations.



Base Protocols


Resources, HTTP & REST

All interaction between entities is through a client invoking a resource. Resources are invoked either directly via HTTP, or through an event queue.

For each resource class, this protocol defines how the client obtains the URL, the HTTP verb (or verbs) to be used, the request and response bodies (if any), and significant status codes. Resource classes are designed with REST style semantics.

In general, HTTP & REST are used as follows: The URL will be either well-known in advance or returned in a response from another resource. The latter is called a capability. Except for security reasons, URLs are always treated as opaque. Clients should not modify them. Parameters are never added to them via the query section. Resource handlers must be prepared to ignore query sections.

Resources follow general REST semantics and so respond to one of these verb sets:

GET

for cacheable resources

GET, PUT

for cacheable resources that can be modified

GET, PUT, DELETE

for cacheable resources that can be modified and deleted

POST

for non-cacheable resources

Unless otherwise stated, if a resource accepts PUT, it accepts multiple PUT invocations.

Note: We are considering having all resources support OPTIONS

The request and response bodies are transmitted as serialized LLSD data. If a resource has no response defined, then it can return either an undefined value, an empty map, or have a zero length response body.

HTTP status codes should only be used to indicate the status of the HTTP interaction itself. In general, if the resource is reachable, and the request understood, a 2xx code should be returned.

Note: Something about redirection - is it supported? Probably not…

HTTP headers, both for the request and the response are never part of a resource class definition. Headers are handled as per the HTTP standard.



LLSD

All data in this system is defined by LLSD. LLSD is an abstract way of talking about structured data.

Note: The definition of LLSD should be inlined here. Until then, reference: http://wiki.secondlife.com/wiki/LLSD

Serialization

When used as part of a protocol, LLSD is serialized into a common form. At present, the only serialization is the XML LLSD serialization. In the future, other serializations may be supported, in particular binary LLSD and JSON. The serialization format is negotiated and indicated using the normal facilities of HTTP.

The MIME type for the XML serialization of LLSD is:

application/llsd+xml
Note: JSON only allows maps and arrays as top level elements. Do we need to impose the same restriction on LLSD as used in resource requests and responses?
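
As an illustration of the XML serialization, the following is a minimal sketch of an encoder for a handful of LLSD types. The element names (llsd, map, key, array, string, integer, real, boolean, undef) follow the LLSD reference linked above and are not defined by this document; the function name to_llsd_xml is hypothetical, and the uuid, date, uri and binary types are left out.

```python
import xml.etree.ElementTree as ET

def _encode(value):
    """Map a Python value to an LLSD XML element (subset of types only)."""
    if value is None:
        return ET.Element("undef")
    if isinstance(value, bool):  # must be tested before int
        el = ET.Element("boolean")
        el.text = "true" if value else "false"
        return el
    if isinstance(value, int):
        el = ET.Element("integer")
        el.text = str(value)
        return el
    if isinstance(value, float):
        el = ET.Element("real")
        el.text = repr(value)
        return el
    if isinstance(value, str):
        el = ET.Element("string")
        el.text = value
        return el
    if isinstance(value, (list, tuple)):
        el = ET.Element("array")
        for item in value:
            el.append(_encode(item))
        return el
    if isinstance(value, dict):
        el = ET.Element("map")
        for key, item in value.items():
            ET.SubElement(el, "key").text = str(key)
            el.append(_encode(item))
        return el
    raise TypeError(f"no LLSD mapping in this sketch for {type(value)!r}")

def to_llsd_xml(value):
    """Serialize a Python value as an LLSD XML string."""
    root = ET.Element("llsd")
    root.append(_encode(value))
    return ET.tostring(root, encoding="unicode")

# A seed capability style request body.
body = to_llsd_xml({"capabilities": ["inventory/root"]})
```
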



Capabilities

This protocol makes extensive use of capabilities. A capability is an opaque HTTP (or HTTPS) URL used for accessing a particular resource. The provider of the resource has three logical parts: the grantor, the capability host, and the service.

The grantor uses the capability host to construct a capability that maps to the service that provides the resource, then returns that capability to the client. At some point in the future, the client invokes the capability which makes a connection to the capability host. The capability host then proxies to the service to provide the resource.

The parts that make up the provider may be separate entities or may be the same.

The client can’t invoke the resource without the capability. Typically the capability is a URL with a cryptographically secure path component. Within the capability host, this URL is mapped to the actual internal resource URL.

The client is free to hand the capability to other entities who become clients of the capability as well. Other than for the security considerations below, the client must not rely on any assumed structure of the capability URL.

Obtaining

For each resource a client wants to invoke, the capability must be obtained. In a few cases, the capability will have been expressly returned in the result of some other resource. Usually, the system uses a seed capability (see below) to request the capability for a given resource by name.


Invocation

To invoke a capability, the holder performs an HTTP transaction with the capability as the URL. The resource class the capability represents will dictate which verb (or verbs) can be used, and what the request and response bodies (if any) should be.
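
A minimal sketch of preparing such an invocation with the Python standard library is shown here. The capability URL and the helper name build_invocation are hypothetical, and the request is only constructed, not sent. Sending the Accept header is one plausible reading of "the normal facilities of HTTP" for format negotiation, not something this document prescribes.

```python
import urllib.request

LLSD_XML_MIME = "application/llsd+xml"

def build_invocation(capability_url, llsd_xml_body, verb="POST"):
    """Prepare (but do not send) an HTTP request that invokes a capability.

    The capability URL is used exactly as received: nothing is appended
    to it and no query parameters are added.
    """
    return urllib.request.Request(
        capability_url,
        data=llsd_xml_body.encode("utf-8"),
        method=verb,
        headers={"Content-Type": LLSD_XML_MIME, "Accept": LLSD_XML_MIME},
    )

request = build_invocation(
    "https://caps.example.com/cap/3f9a",  # hypothetical capability URL
    "<llsd><map /></llsd>",
)
```

Passing the prepared request to urllib.request.urlopen would perform the actual HTTP transaction.
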


Lifetime & Revocation

Capabilities can be either unlimited or one-shot. Unlimited capabilities can be used multiple times, whereas one-shot can be used only once and are automatically revoked on invocation.

Note: If we support the OPTIONS method, then invoking a one-shot with OPTIONS does not revoke it.
Note: What about using HEAD on a one-shot?

Any capability can be revoked at will by the provider of the resource. Clients must be prepared to handle revoked capabilities. A revoked capability, when invoked, must return a 4xx HTTP status code. The capability host may return a 404, even if the capability had been previously active.
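
The granting, one-shot and revocation behaviour can be sketched with a small in-memory capability host. This is an illustration under stated assumptions, not a description of any deployed implementation: the class name CapabilityHost, its methods, and the example URLs are all hypothetical, and the unguessable path component is generated with Python's secrets module.

```python
import secrets

class CapabilityHost:
    """Toy capability host: maps opaque capability URLs to internal URLs."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self._table = {}  # capability URL -> (internal URL, one_shot flag)

    def grant(self, internal_url, one_shot=False):
        """Create a capability with a cryptographically random path."""
        cap = f"{self.base_url}/{secrets.token_urlsafe(32)}"
        self._table[cap] = (internal_url, one_shot)
        return cap

    def revoke(self, cap):
        """Revoke a capability; revoking an unknown one is a no-op."""
        self._table.pop(cap, None)

    def resolve(self, cap):
        """Return (status, internal_url); 404 for unknown or revoked caps."""
        entry = self._table.get(cap)
        if entry is None:
            return 404, None
        internal_url, one_shot = entry
        if one_shot:
            del self._table[cap]  # a one-shot is revoked by its first use
        return 200, internal_url

host = CapabilityHost("https://caps.example.com/cap")
one_shot_cap = host.grant("http://internal.example/inventory", one_shot=True)
first = host.resolve(one_shot_cap)
second = host.resolve(one_shot_cap)
```
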


Names

The resource a capability performs is identified by name. When requesting a capability, or when returning a capability, the opaque URL is identified with this name. The names of such resources are intended to be globally unique.

Names are URIs. When a name appears without a scheme component, then it is a relative URL, considered relative to the base:

http://xmlns.secondlife.com/capability/name/

While names do exhibit path-like structure, they are to be considered opaque identifiers. For example, while the following three capability names are indeed from the same sub-system, nothing should be inferred about a capability that starts with their common prefix:

  • inventory/root
  • inventory/folder_contents
  • inventory/move_folder

While not required, this protocol prefers names that are all lower case roman letters, separated by underscores.
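
Resolving a short name against the base given above can be sketched with urllib.parse. The function name absolute_name is hypothetical; the base URL is the one stated in this section.

```python
from urllib.parse import urljoin, urlparse

NAME_BASE = "http://xmlns.secondlife.com/capability/name/"

def absolute_name(name):
    """Return the full URI for a capability name.

    Names that already carry a scheme are returned unchanged; all others
    are treated as relative to NAME_BASE.
    """
    if urlparse(name).scheme:
        return name
    return urljoin(NAME_BASE, name)

full = absolute_name("inventory/root")
```

The result is only an identifier to compare against. Nothing in this section suggests it should be fetched or parsed for meaning.
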


Seed Capability (Resource Class)

In many cases, a sub-system will return a seed capability from which other capabilities can be requested.

Seed Capability

Name

various, this is a generic resource and used in a variety of places

URL

various

Verb

POST

Request

{ capabilities: [ name1, name2, … ] }

an array of the names of the capabilities being requested

Response

{ capabilities: { name1: url1, name2: url2, … } }

a map from the names requested to granted capabilities.

The request contains an array of all resource names for which capabilities are desired. The response contains a map with an entry for each capability granted. Note: a grantor may grant all, some or none of the requested capabilities. The grantor may also grant additional capabilities that were not requested. If the grantor grants none, the capabilities map in the response must be empty and the HTTP status code should still be 200.
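
A minimal sketch of the grantor side of a seed capability follows, working on already-decoded LLSD values. The function name handle_seed_request and the grantable table (a map from resource name to capability URL that this grantor is willing to hand out) are assumptions made for illustration.

```python
def handle_seed_request(request, grantable):
    """Answer a seed capability request.

    request   -- decoded request body, e.g. {"capabilities": [name, ...]}
    grantable -- hypothetical map of resource name -> capability URL
    Returns (http_status, response_body). Names that cannot be granted
    are simply left out; the status stays 200 even when nothing is granted.
    """
    requested = request.get("capabilities", [])
    granted = {name: grantable[name] for name in requested if name in grantable}
    return 200, {"capabilities": granted}

grantable = {"inventory/root": "https://caps.example.com/cap/aaa"}
status, response = handle_seed_request(
    {"capabilities": ["inventory/root", "inventory/move_folder"]}, grantable
)
```
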


Security

If an end-point receives a capability from an untrusted source, it is permissible for security reasons to check the following aspects of the URL before use:

  • The scheme should be http: or https:.
  • The authority (in particular, the resolved host name) should not resolve to ports on the local machine that aren’t publicly accessible.
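
These checks might be sketched as follows. The function name looks_acceptable is hypothetical, and host name resolution is left to the caller so that the sketch stays free of network access. Rejecting every loopback or unspecified address is a deliberately conservative approximation of "ports on the local machine that aren't publicly accessible", not a rule stated by this document.

```python
import ipaddress
from urllib.parse import urlsplit

def looks_acceptable(cap_url, resolved_addresses):
    """Conservative pre-use check of a capability from an untrusted source.

    resolved_addresses -- IP address strings the URL's host resolved to,
                          obtained by the caller.
    """
    parts = urlsplit(cap_url)
    if parts.scheme not in ("http", "https"):
        return False
    if not parts.hostname or not resolved_addresses:
        return False
    for address in resolved_addresses:
        ip = ipaddress.ip_address(address)
        # Approximation: treat anything pointing back at this machine as unsafe.
        if ip.is_loopback or ip.is_unspecified:
            return False
    return True
```
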



Event Queues

An event queue enables an entity to invoke resources in the viewer, which cannot be contacted directly via HTTP. This is usually the case because the viewer is behind a firewall that doesn’t allow incoming TCP (and hence HTTP) connections from the region or agent domains.

In such a situation, the client establishes a queue of invocation requests for resources in the viewer. At the same time, the viewer uses an event_queue/get capability to effectively tunnel the requests from the client to itself.

Note: The event queue protocol described here matches what is deployed today, but is limited in functionality. It is expected to be superseded by a more general facility soon.

Restrictions

Resources accessed this way have the following restrictions:

  • Resources are identified by their resource class name. With capabilities, there can be several resources in an entity that all conform to the same resource class. With event queues only one resource can exist for each resource class within the viewer. This is not usually a severe restriction.
Note: The next three are temporary limitations in the current protocol and are expected to be removed. Resource classes that conform to these restrictions are equivalent to messages in the current Second Life protocol.
  • The only verb allowed is POST.
  • No response body is allowed.
  • The only status codes are 200, or 500 if the queue shuts down before all events are ack’d.


Requests

The event queue is an unordered list of requests. Each request is formatted as follows:

{ message: name , body: arbitrary-data }

Basic Flow

When the viewer invokes event_queue/get, the entity replies with the list of messages that have been queued up. The viewer breaks the response apart into a series of requests and processes them on itself as the resource invocations that the entity wanted to perform. When those invocations have been processed, the viewer indicates in its next invocation of event_queue/get that the previous set was completed. While it takes two invocations of event_queue/get to tunnel a set of invocations in the other direction, subsequent transactions are chained, since the acknowledgement of a previous set of requests is performed in the same invocation that gets the next set.


Acknowledgement

The response to event_queue/get includes a sequence number for the batch of requests. When the viewer has processed a batch of requests, it acknowledges them in the next invocation of event_queue/get. If a batch goes unacknowledged, the event queue in the entity may either resend the same requests batched with any new requests, or may treat them as lost.
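
The acknowledgement rule can be sketched from the entity's side with a small in-memory queue. The class and method names are hypothetical. This sketch takes the "resend" option for unacknowledged batches, uses a simple incrementing integer as the sequence number, and ignores both the done flag and any long-poll delay.

```python
class EventQueue:
    """Toy entity-side event queue with batch acknowledgement."""

    def __init__(self):
        self._pending = []    # requests queued but never sent
        self._in_flight = []  # requests sent but not yet acknowledged
        self._last_id = 0     # sequence number of the last batch sent

    def enqueue(self, name, body):
        """Queue a resource invocation for the viewer."""
        self._pending.append({"message": name, "body": body})

    def get(self, ack=None):
        """Handle one event_queue/get invocation and return its response."""
        if ack is not None and ack == self._last_id:
            self._in_flight = []  # previous batch was processed
        # Anything still unacknowledged is resent together with new requests.
        batch = self._in_flight + self._pending
        self._pending = []
        self._in_flight = batch
        self._last_id += 1
        return {"id": self._last_id, "events": batch}

queue = EventQueue()
queue.enqueue("teleport_finish", {"ok": True})
first = queue.get()                 # first invocation carries no ack
queue.enqueue("chat", {"text": "hi"})
resent = queue.get()                # no ack: first batch is resent with the new one
drained = queue.get(ack=resent["id"])
```
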


Long Poll

Both viewers and entities must be prepared to use the “long poll” technique to keep the flow of requests timely. Viewers must be prepared for an invocation of event_queue/get to take a relatively long time to return, as the entity may choose to delay responding if there are no requests pending, or if it believes it would be better to wait for more requests to queue. Entities must be prepared to handle viewers that request as soon as they are ready for events with no delay. Both sides must be prepared to handle time outs and retries.


Closing the Queue

When the viewer is ready to terminate the queue, meaning that it wishes to be done accepting requests, it may signal such by including the done flag in the next invocation of event_queue/get . This value is purely advisory, but enables entities to cleanly flush remaining events, and release resources.


Event Queue Get (Resource Class)

This resource is a capability both in the agent and in the region, for implementing a tunneled series of resource invocations from the entity back to the client:

Event Queue Get

Name

event_queue/get

URL

capability

Verb

POST

Request

{ ack: sequence-number , done: done-flag }

On the first invocation for a given resource, sequence-number must be undef. If done-flag is true, then the client is signaling its intention to stop polling if there are no more events.

Response

{ id: sequence-number , events: [ requests , … ] }

See above for format of requests. Events may be empty.
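
From the viewer's side, a single poll of this resource might look like the sketch below. The transport is injected as a callable so the example does not depend on any HTTP library; the names poll_once, invoke and handlers are hypothetical. Time outs, retries and the long wait inside invoke are left to the caller.

```python
def poll_once(invoke, handlers, ack=None, done=False):
    """Perform one event_queue/get invocation and dispatch its events.

    invoke   -- callable taking the request map and returning the response map
    handlers -- map of resource class name -> callable taking the request body
    ack      -- sequence number of the previously processed batch, or None
    Returns the sequence number to acknowledge on the next poll.
    """
    response = invoke({"ack": ack, "done": done})
    for request in response.get("events", []):
        handler = handlers.get(request["message"])
        if handler is not None:
            handler(request["body"])
    return response["id"]
```

A viewer would call poll_once in a loop, feeding each returned sequence number into the next call, and set done to true on the poll after which it intends to stop.
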




Login


Login

Login is the process of establishing the three way communication between viewer, agent and region. There are two phases: viewer to agent login, and then placement of the agent in a particular region. Over the course of a viewer session, the agent may move from region to region, and each such move constitutes a repetition of the agent to region login phase.

Note: At present, the second phase is primarily a proxied form of the current Second Life login, and so is implemented with a legacy resource.

Credentials

When the viewer logs into the agent, it must authenticate itself by presenting credentials that prove the viewer has the right to control the agent. The login system supports an extensible structure for providing credentials. Credentials are presented as an LLSD map, with the type key indicating the type of credential. There are two types of credentials defined.


Agent Credential

These credentials provide the password that is associated with a particular agent.

{ type: “agent”, first_name: first , last_name: last , password: password }

password is the string “$1$” prepended to the MD5 hash of the agent’s password
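
Building this credential can be sketched with hashlib. The function name agent_credential is hypothetical, and two details are assumptions because this section does not state them: the password is encoded as UTF-8 before hashing, and the MD5 digest is written as lowercase hexadecimal.

```python
import hashlib

def agent_credential(first, last, password):
    """Build an agent credential map.

    Assumes UTF-8 encoding of the password and a lowercase hex digest.
    """
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    return {
        "type": "agent",
        "first_name": first,
        "last_name": last,
        "password": "$1$" + digest,
    }

credential = agent_credential("Example", "Resident", "not-a-real-password")
```
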

Account Credential

These credentials provide the password that is associated with a user account on the agent domain. The account may be associated with one or more agents.

{ type: “account”, account_name: account , password: password }

password is the string “$1$” prepended to the MD5 hash of the account’s password

Note: This is not yet implemented.


Challenge-Response

Credential systems that are based on a password can be made to use a dynamically generated challenge value from the agent domain. In these cases, the viewer attempts login, omitting the password value, expecting it to fail. The failure contains the challenge to use in reattempting login. Then the viewer reattempts login, supplying the response computed from the challenge as follows:

{ type: “agent”, first_name: first , last_name: last , response: response }

{ type: “account”, account_name: account , response: response }

response is the SHA1 hash of the concatenation of the challenge, the password and the challenge again.
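
The response computation can be sketched as below. The function name challenge_response is hypothetical. This section does not say whether "the password" here means the plain password or the “$1$”-prefixed MD5 form, nor which text encoding applies, so the sketch simply hashes whatever password string it is given, as UTF-8, and returns lowercase hexadecimal.

```python
import hashlib

def challenge_response(challenge, password):
    """SHA1 over challenge + password + challenge.

    Assumes UTF-8 encoding and a hex digest; the exact form of `password`
    is left to the caller because the draft does not pin it down.
    """
    material = (challenge + password + challenge).encode("utf-8")
    return hashlib.sha1(material).hexdigest()
```
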

Choosing an Agent

The credential presented by the viewer may be valid for more than one agent. If so, then the viewer must specify the agent it wishes to control. If none is specified, and there are multiple possible agents, then login will fail, and the failure response will contain a list of possible agents. The viewer can then choose and reattempt login.


URL

Each agent domain must have a well known and published agent login URL.

The Second Life agent domain’s Agent Login URL is:

https://login.agni.secondlife.com/app/login/


Agent Login (Resource Class)

Agent Login

Name

agent_login

URL

well known URL per agent domain

Verb

POST

Request

{ credential: credential , first_name: first , last_name: last }

first_name and last_name can be omitted if either there is only one agent valid for the credential, or a list of possible agents is desired.

Response

{ agent_seed_capability: cap }

login was successful

{ reason: “select_agent”, agents: [ { first_name: first , last_name: last }, … ] }

credential was successful, but an agent must be selected, since the credential was valid for more than one agent.

{ reason: “notice”, redirect: url }

login was not successful; user should be shown the url
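
A viewer might tell the three response shapes apart as in the following sketch, which works on an already-decoded LLSD map. The function name interpret_login_response and the tuple-based return convention are hypothetical choices made for illustration.

```python
def interpret_login_response(response):
    """Classify an agent_login response map.

    Returns one of:
      ("ok", seed_capability_url)
      ("select_agent", [(first_name, last_name), ...])
      ("notice", url_to_show_user)
      ("unknown", response)   # shape not described in this section
    """
    if "agent_seed_capability" in response:
        return ("ok", response["agent_seed_capability"])
    reason = response.get("reason")
    if reason == "select_agent":
        agents = [(a["first_name"], a["last_name"]) for a in response["agents"]]
        return ("select_agent", agents)
    if reason == "notice":
        return ("notice", response["redirect"])
    return ("unknown", response)
```
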



Legacy Login

Once logged into the agent, a viewer can then request that the agent be placed in a region via the legacy login resource. The legacy login resource performs three functions that in the future will be broken out into separate resources:

  1. Supply additional information about the viewer and the connection.
  2. Provide various information that the viewer needs on startup about the agent and the agent domain.
  3. Place the agent in the initial region.

Additional Login Information

The viewer supplies the following information:

channel
Each viewer belongs to a particular channel of viewers, such as “release”, “release candidate”, “first look”, etc…
version
The version string of the viewer.
platform
Identifies the operating system platform: “Lin”, “Mac”, “Win”
viewer_digest
The MD5 hash of the viewer executable. Only relevant when the channel is set to an official Second Life viewer.
id0
The hardware hash (based on the serial number of the first hard drive in Windows) used for uniquely identifying computers.
mac
The MAC address of a network interface of the computer running the viewer.
agree_to_tos
Whether the user has read and agreed to the TOS.
read_critical
Whether the user has read and dismissed a critical notification.


Agent Domain Information

The viewer can request any or all of this information from the resource. Each section of information is requested by putting the information name in the options array of the request. The corresponding data is returned in the options map of the response.

inventory-root
UUID of the agent’s root inventory folder.

{ folder_id: folder-id }

inventory-skeleton
Initial list of folders in agent’s inventory. Returned as an array of three element arrays. Each three element array describes a folder with its name, its UUID, and the UUID of the containing folder.

[ [ name, folder-id, parent-id ], … ]

inventory-lib-root
UUID of the agent domain’s common inventory library folder. Returned in a different format:

folder_id

inventory-lib-owner
Agent ID of the owner of the agent domain’s common inventory.

{ agent_id: agent-id }

inventory_skel_lib
Initial list of folders in the agent domain’s common inventory library. Returned as an array of five element arrays. Each five element array describes a folder with its name, its UUID, the UUID of the containing folder, its type, and its version.

[ [ name, folder-id, parent-id, type, version ], … ]

gestures
List of active gestures. Each gesture is represented as a two element array with the inventory item id and the asset id.

[ [ item-id, asset-id ], … ]

event_categories
List of different event categories, mapping category id (an integer) to a category name. Returned as an array of two element arrays. Each two element array describes a category’s id and its name.

[ [ category-id, category-name ], … ]

event_notifications
List of events for which the agent has pending notifications. Returned as an array of eight element arrays. Each eight element array describes an event notification with its id, name, description, date, x & y of the region, and x & y within the region.

[ [ event-id, name, description, date, grid-x, grid-y, x, y ], … ]

classified_categories
List of different classified categories, mapping category id (an integer) to a category name. Returned as an array of two element arrays. Each two element array describes a category’s id and its name.

[ [ category-id, category-name ], … ]

buddy-list
List of friends with granted and given rights masks. Returned as an array of three element arrays. Each three element list has a friend’s agent id, granted rights mask, given rights mask.

[ [ buddy-id, granted-rights, given-rights ], … ]

ui-config
Setting of some user interface states for the agent. Only one setting is defined: whether the first life tab should be shown.

{ allow_first_life: enabled }

login-flags
Several flags about the state of the agent.

{ stipend_since_login: received-stipend, ever_logged_in: ever-logged-in, gendered: chosen-a-gender, daylight_savings: in-daylight-savings-time }

global-textures
The asset ids of several global textures:

{ sun_texture_id: asset-id, moon_texture_id: asset-id, cloud_texture_id: asset-id }
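
As an example of consuming one of these sections, the inventory-skeleton array of [ name, folder-id, parent-id ] entries can be turned into a parent-to-children index. The function name children_by_parent and the sample identifiers are hypothetical; real entries would carry UUIDs.

```python
from collections import defaultdict

def children_by_parent(skeleton):
    """Index an inventory-skeleton array by containing folder.

    skeleton -- iterable of [name, folder_id, parent_id] entries
    Returns a dict mapping parent_id -> list of (folder_id, name),
    in the order the folders appeared.
    """
    index = defaultdict(list)
    for name, folder_id, parent_id in skeleton:
        index[parent_id].append((folder_id, name))
    return dict(index)

sample = [
    ["My Inventory", "root-id", None],
    ["Clothing", "clothing-id", "root-id"],
    ["Objects", "objects-id", "root-id"],
]
index = children_by_parent(sample)
```
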


Initial Region Information

The initial region is specified as either the agent’s home region, the agent’s last region, or a specific region. The specification is a string with the values “home”, “last”, or the SLURL form of the region.


Result

If the login is successful, the result contains several collections of information:

login
set to true if successful
message
Message of the day string
seconds_since_epoch
“Second Life” time
first_name, last_name, agent_access, agent_id, inventory_host
Information about the agent
circuit_code, session_id, secure_session_id
Identifiers used in the legacy message system
start_location, look_at, region_x, region_y
Initial placement information
seed_capability, sim_ip, sim_port
Connection information to initial region


Legacy Login (Resource Class)

Legacy Login

Name

legacy/login

URL

capability from agent domain

Verb

POST

Request

{ start: start-loc, channel: channel, version: version, platform: platform, mac: mac-address, id0: system-hash, agree_to_tos: flag, read_critical: flag, options: [ section-name, … ] }

See above for descriptions of these values.

Response

{ login: true, message: motd, seconds_since_epoch: time, first_name: first-name, last_name: last-name, agent_access: agent-access, agent_id: agent-id, inventory_host: host-name, circuit_code: circuit, session_id: session-uuid, secure_session_id: secure_session_id, start_location: start-loc, look_at: rotation, region_x: x, region_y: y, seed_capability: region-seed-capability, sim_ip: ip, sim_port: port }

See above for descriptions of these values.

All Drafts