Difference between revisions of "LLSD"
Seebs Toll (talk | contribs) m (Correct a couple of typos.) |
Revision as of 18:01, 26 March 2007
The LLSD flexible data system
The following text is from the comments in the source of the file: linden\indra\common\llsd.cpp
Summary
LLSD provides a flexible data system similar to the data facilities of dynamic languages like Perl and Python. It is created to support exchange of structured data between loosely coupled systems. (Here, "loosely coupled" means not compiled together into the same module.)
Data in such exchanges must be highly tolerant of changes on either side such as:
- recompilation
- implementation in a different language
- addition of extra parameters
- execution of older versions (with fewer parameters)
To this aim, the C++ API of LLSD strives to be very easy to use, and to default to "the right thing" wherever possible. It is extremely tolerant of errors and unexpected situations.
The fundamental class is LLSD. LLSD is a value-holding object. It holds one value that is either undefined, one of the scalar types, or a map or an array. LLSD objects have value semantics (copying them copies the value, though this can be considered efficient due to sharing) and are mutable.
Undefined is the singular value given to LLSD objects that are not initialized with any data. It is also used as the return value for operations that return an LLSD.
The scalar data types are:
- Boolean - true or false
- Integer - a 32-bit signed integer
- Real - a 64-bit IEEE 754 floating point value
- UUID - a 128-bit unique value
- String - a sequence of zero or more Unicode characters
- Date - an absolute point in time, UTC, with resolution to the second
- URI - a String that is a URI
- Binary - a sequence of zero or more octets (unsigned bytes)
A map is a dictionary mapping String keys to LLSD values. The keys are unique within a map, and have only one value (though that value could be an LLSD array).
An array is a sequence of zero or more LLSD values.
Scalar Accessors
Function: Fetch a scalar value, converting if needed and possible.
Conversion among the basic types, Boolean, Integer, Real and String, is fully defined. Each type can be converted to another with a reasonable interpretation. These conversions can be used as a convenience even when you know the data is in one format, but you want it in another. Of course, many of these conversions lose information.
Note: These conversions are not the same as Perl's. In particular, when converting a String to a Boolean, only the empty string converts to false. Converting the String "0" to Boolean results in true.
Conversion to and from UUID, Date, and URI is only defined to and from String. Conversion is defined to be information preserving for valid values of those types. These conversions can be used when one needs to convert data to or from another system that cannot handle these types natively, but can handle strings.
Conversion to and from Binary isn't defined.
Conversion of the Undefined value to any scalar type results in a reasonable null or zero value for the type.
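The conversion to Boolean described above can be illustrated with a minimal Python sketch. It is not the C++ accessor itself: the function name as_boolean is hypothetical, Undefined is modelled as None, and the rule that zero is the only false number is taken from the conversion tables later on this page.

```python
def as_boolean(value):
    """Convert an Undefined (None), Boolean, Integer, Real, or String value to bool."""
    if value is None:
        # Undefined converts to the null value of the target type.
        return False
    if isinstance(value, str):
        # Unlike Perl, only the empty string is false; "0" is true.
        return value != ""
    # Booleans pass through; numbers are false only when exactly zero.
    return bool(value)
```

For example, as_boolean("0") is true while as_boolean("") and as_boolean(None) are false.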
Automatic Cast Protection
These are not implemented on purpose. Without them, C++ can perform some conversions that are clearly not what the programmer intended.
If you get a linker error about these being missing, you have made a mistake in your code. DO NOT IMPLEMENT THESE FUNCTIONS as a fix.
All of these problems stem from trying to support char* in LLSD or in std::string. There are too many automatic casts that will lead to using an arbitrary pointer or scalar type to std::string.
Attributes and Data
Attributes are only used for encoding parser and formatting instructions. The data in the elements is always data.
Root Element
The root element is llsd. The root must have only one child element which can be any container or atomic type.
Atomic Types
Each atomic type represents one value with type information. An atomic does not have a name, but may have attributes to specify format or processing considerations for the parser. Consumers of atomics are encouraged to massage the data into the preferred native representation, but further serialization should honor the original type information if possible.
undefined
The undefined type is a placeholder to indicate something is there, but it has no value, and cannot be converted to any other atomic type. Though limited in this way, an undefined is still considered a first-class atomic, and is expected to behave like any other atomic structured data type at runtime.
Serialization example
<undef />
boolean
A true or false value.
Conversion
type | rules |
---|---|
boolean | unity |
integer | true => 1, false => 0 |
real | true => 1.0, false => 0.0 |
uuid | n/a |
string | 'true', 'false' |
binary | one byte us-ascii where true => 1, false => 0 |
date | n/a |
uri | n/a |
Serialization examples
<!-- true -->
<boolean>1</boolean>
<boolean>true</boolean>
<!-- false -->
<boolean>0</boolean>
<boolean>false</boolean>
<boolean />
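A short Python sketch of how a parser might read the boolean examples above. The function name parse_boolean is hypothetical, and treating any text other than "1" or "true" as false is an assumption; the page only lists the five forms shown.

```python
import xml.etree.ElementTree as ET

def parse_boolean(xml_text):
    """Parse a single <boolean> element into a Python bool."""
    element = ET.fromstring(xml_text)
    if element.tag != "boolean":
        raise ValueError("expected a <boolean> element, got <%s>" % element.tag)
    # An empty element has no text at all, which counts as false.
    text = (element.text or "").strip()
    # Assumption: anything other than "1" or "true" is read as false.
    return text in ("1", "true")
```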
integer
A signed integer value with a representation of 64 bits.
Conversion
type | rules |
---|---|
boolean | 0 => false, all other values => true |
integer | unity |
real | closest representable number |
uuid | n/a |
string | human readable string |
binary | 8 byte network byte order representation |
date | seconds since epoch |
uri | n/a |
Serialization examples
<integer>289343</integer>
<integer>-3</integer>
<integer /> <!-- zero -->
real
A 64-bit double as defined by IEEE 754.
Conversion
type | rules |
---|---|
boolean | exactly 0 => false, all other values => true |
integer | rounded to closest representable number |
real | unity |
uuid | n/a |
string | human readable string |
binary | 8 byte network byte order representation |
date | seconds since epoch |
uri | n/a |
Serialization examples
<real>-0.28334</real>
<real>2983287453.3848387</real>
<real /> <!-- exactly zero -->
uuid
A 128-bit unsigned integer.
Conversion
type | rules |
---|---|
boolean | null uuid => false, all other values => true |
integer | n/a |
real | n/a |
uuid | unity |
string | standard 8-4-4-4-12 serialization format |
binary | 16 byte raw representation |
date | n/a |
uri | n/a |
Serialization examples
<uuid>d7f4aeca-88f1-42a1-b385-b9db18abb255</uuid>
<uuid /> <!-- null uuid '00000000-0000-0000-0000-000000000000' -->
string
A simple string of any character data which is intended to be human comprehensible.
Conversion
type | rules |
---|---|
boolean | empty => false, all other values => true |
integer | A simple conversion of the initial characters to an integer |
real | A simple conversion of the initial characters to a real number |
uuid | A valid 8-4-4-4-12 is converted to a uuid, all other values => null uuid |
string | unity |
binary | raw representation of the characters |
date | An interpretation of the string as a date |
uri | An interpretation of the string as a link |
Serialization examples
<string>The quick brown fox jumped over the lazy dog.</string>
<string>540943c1-7142-4fdd-996f-fc90ed5dd3fa</string>
<string /> <!-- empty string -->
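Two of the string conversions above, sketched in Python. The helper names are hypothetical. "A simple conversion of the initial characters" is read here as atoi-like behaviour (optional leading whitespace and sign, then digits, with 0 when no digits are found); that reading is an assumption, since the table does not spell it out.

```python
import re
import uuid

# Strict 8-4-4-4-12 hexadecimal layout, as named in the table above.
_UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
# Assumption: atoi-like prefix match of optional whitespace, sign, and digits.
_INT_RE = re.compile(r"\s*([+-]?[0-9]+)")

NULL_UUID = uuid.UUID(int=0)

def string_to_uuid(text):
    """Return the uuid for a valid 8-4-4-4-12 string, otherwise the null uuid."""
    if _UUID_RE.fullmatch(text):
        return uuid.UUID(text)
    return NULL_UUID

def string_to_integer(text):
    """Convert the leading characters of text to an integer, or 0 if none fit."""
    match = _INT_RE.match(text)
    return int(match.group(1)) if match else 0
```

The regular expression is checked before calling uuid.UUID because that constructor also accepts other spellings, such as hex digits without hyphens, which the table does not list as valid.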
binary data
A chunk of binary data. The serialization format is allowed to specify an encoding. Parsers must support base64 encoding. Parsers may support base16 and base85.
Conversion
type | rules |
---|---|
boolean | empty => false, all other values => true |
integer | len < 8 => 0, otherwise first eight bytes are interpreted as a network byte order integer |
real | len < 8 => 0, otherwise first eight bytes are interpreted as a network byte order double |
uuid | len < 16 => null uuid, otherwise first sixteen bytes are interpreted as the raw binary uuid |
string | the raw binary data interpreted as utf-8 character data |
binary | unity |
date | n/a |
uri | the raw binary data interpreted as a utf-8 serialized link |
Serialization examples
<binary encoding="base64">cmFuZG9t</binary> <!-- base 64 encoded binary data -->
<binary>dGhlIHF1aWNrIGJyb3duIGZveA==</binary> <!-- base 64 encoded binary data is default -->
<binary /> <!-- empty binary blob -->
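The binary examples above can be decoded with the Python standard library. This is a sketch only: parse_binary is a hypothetical name, base16 is handled with the standard hex alphabet, and base85 is left out because the page does not say which base85 variant is meant.

```python
import base64
import xml.etree.ElementTree as ET

def parse_binary(xml_text):
    """Decode a single <binary> element into bytes."""
    element = ET.fromstring(xml_text)
    # base64 is the default when no encoding attribute is given.
    encoding = element.get("encoding", "base64")
    text = (element.text or "").strip()
    if encoding == "base64":
        return base64.b64decode(text)
    if encoding == "base16":
        return base64.b16decode(text, casefold=True)
    raise ValueError("unsupported binary encoding: %s" % encoding)
```

Applied to the first two examples, this yields the bytes for "random" and "the quick brown fox"; the empty element yields an empty byte string.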
date
A specific point in time. Intervals or relative dates are not supported. The serialization and parser only understand ISO-8601 numeric encoding in UTC. The time may be omitted which will be interpreted as midnight at the start of the day.
Conversion
type | rules |
---|---|
boolean | n/a |
integer | seconds since epoch |
real | seconds since epoch |
uuid | n/a |
string | standard serialization format |
binary | n/a |
date | unity |
uri | n/a |
Serialization examples
<date>2006-02-01T14:29:53Z</date>
<date /> <!-- epoch -->
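A Python sketch of the date rules above: UTC only, an omitted time means midnight, and an empty value means the epoch. The function names are hypothetical, and the epoch is assumed to be the Unix epoch (1970-01-01T00:00:00Z), which the page does not state explicitly.

```python
from datetime import datetime, timezone

# Assumption: "epoch" means the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def parse_date(text):
    """Parse the text of a <date> element into an aware UTC datetime."""
    text = text.strip()
    if not text:
        return EPOCH  # <date /> stands for the epoch
    # Full timestamp first, then the date-only form (midnight, start of day).
    for fmt in ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError("unrecognised date: %r" % text)

def seconds_since_epoch(moment):
    """Date to integer conversion: whole seconds since the epoch."""
    return int((moment - EPOCH).total_seconds())
```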
uri
A link to an external resource. The data is expected to conform to RFC 2396 for interpretation, meaning, serialization, and deserialization.
Conversion
type | rules |
---|---|
boolean | n/a |
integer | n/a |
real | n/a |
uuid | n/a |
string | standard serialization format |
binary | n/a |
date | n/a |
uri | unity |
Serialization examples
<uri>http://sim956.agni.lindenlab.com:12035/runtime/agents</uri>
<uri /> <!-- an empty link -->
Containers
Containers are special data types which can contain any other data type, including other containers.
map
A map of key and value pairs where key ordering is unspecified and keys are unique. The key is always interpreted as a character string and any character string is acceptable. If there are any elements in the map, each is serialized as a key followed by an atomic or container value. For every key, there must be one value. Well-formed and valid serialized maps may nevertheless contain non-unique keys. When such a map is deserialized, the implementation should choose one of the value objects, but that choice is not specified.
Serialization example
<map>
  <key>foo</key>
  <string>bar</string>
  <key>agent info</key>
  <map>
    <key>agent_id</key>
    <uuid>93c73b16-cd86-434d-8b4a-76e12eee950a</uuid>
    <key>name</key>
    <string>testtest tester</string>
  </map>
</map>
array
An ordered collection of data members. Any member can be any atomic or container type.
Serialization example
<array>
  <real>7343.0194</real>
  <array>
    <map>
      <key>offset</key>
      <integer>9847</integer>
    </map>
    <string>da boom</string>
  </array>
</array>
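The container rules can be sketched as a small recursive parser in Python. This is an illustration under stated assumptions, not the Linden Lab implementation: parse_llsd and parse_element are hypothetical names, only a few atomic types are handled, maps become dicts, arrays become lists, and when a map repeats a key the last value wins (the page leaves that choice unspecified).

```python
import uuid
import xml.etree.ElementTree as ET

def parse_element(element):
    """Convert one LLSD XML element into a Python value."""
    tag = element.tag
    text = (element.text or "").strip()
    if tag == "map":
        children = list(element)
        if len(children) % 2:
            raise ValueError("every key in a map needs exactly one value")
        result = {}
        for key, value in zip(children[0::2], children[1::2]):
            if key.tag != "key":
                raise ValueError("expected <key>, got <%s>" % key.tag)
            # Assumption: a repeated key keeps the last value seen.
            result[key.text or ""] = parse_element(value)
        return result
    if tag == "array":
        return [parse_element(child) for child in element]
    if tag == "undef":
        return None
    if tag == "string":
        return element.text or ""
    if tag == "integer":
        return int(text) if text else 0
    if tag == "real":
        return float(text) if text else 0.0
    if tag == "uuid":
        return uuid.UUID(text) if text else uuid.UUID(int=0)
    raise ValueError("element <%s> is not handled in this sketch" % tag)

def parse_llsd(xml_text):
    """Parse a document whose root is <llsd> with exactly one child."""
    root = ET.fromstring(xml_text)
    if root.tag != "llsd" or len(root) != 1:
        raise ValueError("root must be <llsd> with exactly one child element")
    return parse_element(root[0])
```

Wrapping the array example above in an llsd root element and passing it to parse_llsd gives a nested Python list containing a float, then a list holding a one-entry dict and a string.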
xml-llsd DTD
<!DOCTYPE llsd [
<!ELEMENT llsd (DATA)>
<!ELEMENT DATA (ATOMIC|map|array)>
<!ELEMENT ATOMIC (undef|boolean|integer|real|uuid|string|date|uri|binary)>
<!ELEMENT KEYDATA (key,DATA)>
<!ELEMENT key (#PCDATA)>
<!ELEMENT map (KEYDATA*)>
<!ELEMENT array (DATA*)>
<!ELEMENT undef (EMPTY)>
<!ELEMENT boolean (#PCDATA)>
<!ELEMENT integer (#PCDATA)>
<!ELEMENT real (#PCDATA)>
<!ELEMENT uuid (#PCDATA)>
<!ELEMENT string (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT uri (#PCDATA)>
<!ELEMENT binary (#PCDATA)>
<!ATTLIST string xml:space (default|preserve) 'preserve'>
<!ATTLIST binary encoding CDATA "base64">
]>
Example XML Output
This is a sample from a recently running sim:
$ curl http://localhost:12035/runtime/statistics
<?xml version="1.0" encoding="UTF-8"?>
<llsd>
<map>
  <key>region_id</key>
  <uuid>67153d5b-3659-afb4-8510-adda2c034649</uuid>
  <key>scale</key>
  <string>one minute</string>
  <key>simulator statistics</key>
  <map>
    <key>time dilation</key><real>0.9878624</real>
    <key>sim fps</key><real>44.38898</real>
    <key>pysics fps</key><real>44.38906</real>
    <key>agent updates per second</key><real>nan</real>
    <key>lsl instructions per second</key><real>0</real>
    <key>total task count</key><real>4</real>
    <key>active task count</key><real>0</real>
    <key>active script count</key><real>4</real>
    <key>main agent count</key><real>0</real>
    <key>child agent count</key><real>0</real>
    <key>inbound packets per second</key><real>1.228283</real>
    <key>outbound packets per second</key><real>1.277508</real>
    <key>pending downloads</key><real>0</real>
    <key>pending uploads</key><real>0.0001096525</real>
    <key>frame ms</key><real>0.7757886</real>
    <key>net ms</key><real>0.3152919</real>
    <key>sim other ms</key><real>0.1826937</real>
    <key>sim physics ms</key><real>0.04323055</real>
    <key>agent ms</key><real>0.01599029</real>
    <key>image ms</key><real>0.01865955</real>
    <key>script ms</key><real>0.1338836</real>
  </map>
</map>
</llsd>
Binary Serialization
We also support binary serialization and deserialization in C++ and Python. The binary format is useful where optimal parse time is necessary. Binary LLSD is the binary LLSD prefix followed by a single LLSD element of any type.
<?llsd/binary?>\n
type | serialization | notes |
---|---|---|
undef | '!' | |
true | '1' | |
false | '0' | |
integer | 'i' + htonl(value) | |
real | 'r' + htond(value) | |
uuid | 'u' + uuid | uuid is 16 bytes |
binary | 'b' + htonl(binary.size()) + binary | |
string | 's' + htonl(string.size()) + string | notation serialization is considered valid |
uri | 'l' + htonl(uri.size()) + uri | |
date | 'd' + htond(seconds_since_epoch) | |
array | '[' + htonl(array.length()) + (child0, child1, ...) + ']' | order is always preserved |
map | '{' + htonl(map.length()) + ((key0,value0), (key1, value1), ...)+ '}' | order is not always preserved. |
size() is a byte count.
length() is a child count.
htonl() is a function to generate a 4 byte network byte order integer.
htond() is a function to generate an 8 byte network byte order double.
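The table above can be sketched with Python's struct module, where the "!" prefix selects network byte order. The names to_binary and serialize are hypothetical. Strings are assumed to be UTF-8 encoded before their byte count is taken, and integers are packed as signed 4-byte values. Maps, uuids, uris, and dates are left out to keep the sketch short; for maps in particular, the table does not say how keys are encoded.

```python
import struct

BINARY_PREFIX = b"<?llsd/binary?>\n"

def to_binary(value):
    """Encode one value using the type markers from the table above."""
    if value is None:
        return b"!"                      # undef
    if value is True:                    # check bools before int: bool is an int
        return b"1"
    if value is False:
        return b"0"
    if isinstance(value, int):
        return b"i" + struct.pack("!i", value)   # 4-byte network order integer
    if isinstance(value, float):
        return b"r" + struct.pack("!d", value)   # 8-byte network order double
    if isinstance(value, bytes):
        return b"b" + struct.pack("!I", len(value)) + value
    if isinstance(value, str):
        data = value.encode("utf-8")     # assumption: size() counts UTF-8 bytes
        return b"s" + struct.pack("!I", len(data)) + data
    if isinstance(value, list):
        children = b"".join(to_binary(child) for child in value)
        return b"[" + struct.pack("!I", len(value)) + children + b"]"
    raise TypeError("type not covered by this sketch: %s" % type(value).__name__)

def serialize(value):
    """Binary LLSD: the prefix followed by a single encoded element."""
    return BINARY_PREFIX + to_binary(value)
```

Note that an array's count is a child count while string and binary sizes are byte counts, matching the size() and length() notes above.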
Guidelines
XML Encoding
When possible, prefer using us-ascii or UTF-8 XML encoding.
Questions & Things To Do
Would Binary be more convenient as unsigned char* buffer semantics?
Should Binary be convertible to/from String, and if so how?
- as UTF8 encoded strings (making it unlike UUID<->String)
- as Base64 or Base96 encoded (making it like UUID<->String)
Conversions to std::string and LLUUID do not result in easy assignment to std::string, LLString or LLUUID due to non-unique conversion paths.