Talk:Json usage in LSL

From Second Life Wiki

I am concerned because the JSON format specifies that you can use escape codes like \u0000 to represent Unicode byte values in strings. But LSL has the ugly habit of censoring and altering strings so that a character with byte value of 0x0000 is removed from the string, and some functions like llSHA1String are essentially broken since they also convert UTF-16 integers between \u0128–\u0255 into UTF-8 byte values starting with %c2 (which therefore have a different integer value due to the addition of the extraneous byte). Is LSL also going to mangle JSON strings' byte values when it renders them into LSL strings? Won't this corrupt attempts at efficiently verifying the signatures of any incoming messages, and thwart attempts to generate proper signatures for some outgoing JSON-formatted requests? Or do the Lindens have plans to finally give us a proper suite of escape codes in LSL (or some other solution)? --Gistya Eusebio 09:41, 30 May 2013 (PDT)

Add to that the complete crash-and-burn if your string starts with a quote, generating invalid JSON. I really don't get why LL finds it advantageous to include magic switches which force everybody to use workarounds for normal use, and ensure that all future JSON implementations in SL must be hand-coded to keep backwards compatibility with the spec breaks currently implemented. Tali Rosca 15:50, 19 June 2013 (PDT)

This sounds like a bug, you should report it. -- Strife (talk|contribs) 21:49, 20 June 2013 (PDT)
nm I see that you did. BUG-2594 -- Strife (talk|contribs) 21:50, 20 June 2013 (PDT)
I've been kicking and screaming about getting a robust JSON handling :-) 2594 got us some way, but we still have BUG-2736, which I consider so exceedingly ill-advised as to be a bug, despite the insistence that it's a really awesome feature. Tali Rosca 16:39, 21 June 2013 (PDT)
     The trailing quotation mark is the real nasty problem, not the leading one :D But yeah.. if they do add the capability to handle enquoted* text, how would that break anyone's scripts? Is anyone really relying on enquotation to invalidate their JSON strings on purpose? Besides when has breaking people's scripts stopped them from doing anything? I have tens of thousands of L$ worth of vehicles that are now worthless due to the Havok 4 update, but hey, life goes on. My copy of Microsoft Word 1.0 for Mac won't run on Mountain Lion either. I like backwards compatibility but we developers would be out of a job if software never had to be rewritten to work with the ever-evolving platforms that are out there :D
     --Gistya Eusebio 07:58, 23 June 2013 (PDT)
* I know "enquoted" is not in the dictionary. However it is in the lexicon. :D
A (somewhat kludgy) workaround for the mishandling of escaped characters can now be found here. Hope someone finds this useful. LepreKhaun Resident 23:41, 29 August 2013 (PDT)

I feel the paragraph within "Specifying Json Elements" that begins with "When JSON is presented in human-readable form,..." should be rewritten to show that the JSON string in its entirety is actually the root node of the structure, signified by an empty list used as "specifiers". LepreKhaun Resident 15:23, 6 July 2013 (PDT)

Unsure of major edit

After making the edit on 16:04, 15 August 2013 to correct code examples that wouldn't compile, I've come to realize that the given examples, as well as the surrounding text, are in error, since JSON disallows empty values such as one might find in a "sparse" array.

The author is pointing out "a rare exception" but then uses "{\"parent\":,}" and "{\"parent\":[ , , , , ]}" to illustrate. Neither of those can be arrived at with llList2Json(), since LSL doesn't allow empty list elements, but can only be obtained by hand coding to arrive at the non-compliant JSON strings.

I feel that whole section should be rewritten to simply point out to the reader that they are advised to use llList2Json() in the formation of JSON strings and avoid hand coding, which may well result in mal-formed constructions that could lead to confusing results in later operations.

However, doing so would excise a number of paragraphs and I'm unsure how that would set with the original author and others. Guidance on this would be appreciated. LepreKhaun Resident 05:25, 18 August 2013 (PDT)
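As a rough illustration of the advice above (build JSON with llList2Json() rather than by hand), here is a minimal sketch. The key name "parent" is borrowed from the article's examples, the values are made up, and the JSON_INVALID check reflects my understanding of the documented failure behaviour rather than something verified in-world.

```lsl
default
{
    state_entry()
    {
        // Build the inner array first, then embed it in the object.
        // Every element must be an actual LSL value, so a "sparse"
        // array with empty slots cannot be produced this way.
        string values = llList2Json(JSON_ARRAY, [1, 2, 3]);
        string json = llList2Json(JSON_OBJECT, ["parent", values]);

        // llList2Json is documented to return JSON_INVALID when it
        // cannot build the requested structure, so check before use.
        if (json == JSON_INVALID)
        {
            llOwnerSay("Could not build the JSON string");
            return;
        }
        llOwnerSay(json);
    }
}
```

Hand-coded strings skip this check entirely, which is how malformed input such as "{\"parent\":,}" ends up being fed to the getters.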

JSON example

I want to submit a simple, working example on LSL JSON, but I don't know the right place for it

I hope Strife will and can place it, here it is:
// JSON array forum example by Dora Gustafson, Studio Dora 2013
// Building a 3 by 5 array in a JSON Object
// Rows are indexed by name and columns are indexed by number = 0,1,2,3,4

string JSONVotes;

// Say all votes stored for one voter as a comma-separated line
tellVotes(string voter)
{
    string Js = llJsonGetValue(JSONVotes, [voter]);
    list Jl = llParseString2List(Js, [",", "[", "]", "\""], []);
    string output = llDumpList2String(Jl, ", ");
    llOwnerSay("Votes from " + voter + " are: "+output);
}

default
{
    state_entry()
    {   // Building the JSON object
        string votes = llList2Json(JSON_ARRAY, [0, 0, 0, 0, 0]); // one row
        JSONVotes = llList2Json(JSON_OBJECT, ["Betty", votes, "Jerry", votes, "Pierre", votes]); // complete object
    }
    touch_end( integer num)
    {   // Testing the JSON object
        // saving some random votes
        JSONVotes = llJsonSetValue( JSONVotes, ["Betty", 1], (string)llFrand(100.0));
        JSONVotes = llJsonSetValue( JSONVotes, ["Jerry", 4], (string)llFrand(100.0));
        // testing
        // getting one vote, example
        string s = llJsonGetValue(JSONVotes, ["Betty", 1]);
        llOwnerSay("Betty votes " + s + " in second column");
    }
}

Dora Gustafson 06:23, 25 August 2013 (PDT)

Inconsistent behavior on types

The documentation says: if the specifier list element is an integer, then the corresponding element in the JSON value must be an array and the list element is used as a zero-based index into the JSON array.

However, placing integers in a specifier used with a JSON_OBJECT and llJsonGetValue, they are automatically converted to strings. llList2Json and llJsonSetValue do not automatically convert integers to strings. Both will result in a JSON_INVALID if you try to put integers on the key side of a JSON_OBJECT.

This example code, according to the documentation shouldn't even work (I'd presume it's supposed to return a JSON_INVALID):

string test = llList2Json(JSON_OBJECT, ["1", "one", "2", "two", "3", "three"]);
llOwnerSay(llJsonGetValue(test, [3]));

This little snippet produces "three". Additionally, using llJsonValueType in place of llJsonGetValue in the above example will result in JSON_STRING. Attempting something like 'llJsonSetValue(test, [3], "three");' results in JSON_INVALID.

Chetar Ruby 09:15, 14 October 2013 (PDT)

Yes, that's because what you meant to say was llJsonSetValue(test, ["3"], "three"): objects in LSL JSON land are always keyed on STRINGS, and you can't pass numbers even when the string looks like a number. -- Winter Seale 12:35, 23 January 2014 (PST)
Right, but you're overlooking what was being pointed out, a string isn't being required using llJsonGetValue() on a JSON object. It's an anomaly, though I can't see how this might adversely affect anything. LepreKhaun Resident 14:29, 24 January 2014 (PST)
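For anyone landing here later, a minimal sketch of the consistent approach the replies above point toward: use string specifiers for object members with both the getter and the setter. The replacement value "drei" is invented for illustration, and no output is shown because I have not run this in-world.

```lsl
default
{
    state_entry()
    {
        string test = llList2Json(JSON_OBJECT, ["1", "one", "2", "two", "3", "three"]);

        // String specifier: the documented way to address an object member.
        llOwnerSay(llJsonGetValue(test, ["3"]));

        // The setter takes the same string specifier. An integer
        // specifier here is what was reported above to give JSON_INVALID.
        string updated = llJsonSetValue(test, ["3"], "drei");
        if (updated == JSON_INVALID)
            llOwnerSay("Set failed");
        else
            llOwnerSay(updated);
    }
}
```

Sticking to string specifiers for objects avoids relying on the getter's lenient integer-to-string conversion described in this thread.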

Major change to the last part

I've made a major change per BUG-6280; the change is - can someone please review it for accuracy? --Sei Lisa 18:09, 6 June 2014 (PDT)

Review is not the same as undo. I expected some commentary on my edit, not a full reversal. Why is there an example using undocumented features, and why has the explanatory text that goes with the example been removed? --Sei Lisa 11:21, 9 June 2014 (PDT)
I'm reverting Miranda's changes for several reasons.
1) The justification for the removal of the paragraph is redundancy. Nowhere in the article explains how the strings are interpreted when they aren't quoted inside the string itself (like the LSL string "\"true\""). The text merely says: "LSL strings which both begin and end with "\"" are interpreted literally as JSON strings, while those without are parsed when converted into JSON." But it does not explain what it means by parsed. It's very ambiguous, and it's not clear at all from that how it acts. The addition of a text that explains how the strings "true", "false" and "null" behave is therefore NOT redundant.
2) There's no example of how to properly encode strings. Having an example that omits the quotes around the string values may lead readers to think that that's the norm, when that is undocumented behavior (nowhere does the article say what llList2Json or llJsonSetValue actually do with strings that don't decode to any valid JSON type - they happen to be translated to JSON strings, but some of these strings like "true", "false", "[]", etc. will not, and that's not explained).
3) The paragraph was added in this edit: - note how the example, initially written by Maestro, was buggy to the point that it used TRUE instead of JSON_TRUE, therefore the explanatory text was not redundant. It was merely not totally accurate, and I believe my edits made it clearer.
--Sei Lisa 11:58, 9 June 2014 (PDT)
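To make the distinction in point 2 concrete, here is a small sketch that stores a JSON boolean (via the JSON_TRUE constant) next to an explicitly quoted string, then asks llJsonValueType what each element is. It assumes the quoting rule cited from the article (LSL strings that begin and end with "\"" are taken literally as JSON strings); the variable names are mine, and I deliberately show no expected output.

```lsl
default
{
    state_entry()
    {
        // JSON_TRUE is the LSL constant for a JSON boolean. The second
        // element carries its own quotes, so it is meant to be stored
        // as the JSON string "true" rather than as a boolean.
        string arr = llList2Json(JSON_ARRAY, [JSON_TRUE, "\"true\""]);
        llOwnerSay(arr);

        integer i;
        for (i = 0; i < 2; ++i)
        {
            string kind = llJsonValueType(arr, [i]);
            if (kind == JSON_TRUE)
                llOwnerSay((string)i + ": boolean true");
            else if (kind == JSON_STRING)
                llOwnerSay((string)i + ": string");
            else
                llOwnerSay((string)i + ": something else");
        }
    }
}
```

Quoting every string value explicitly costs a few characters, but it removes any dependence on how unquoted text happens to be parsed.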

Text incorrect

This section looks factually incorrect:

If, anywhere along the line, any of the specified nodes do not exist, llJsonGetValue and llJsonValueType will almost always return JSON_INVALID. However there is a rare exception: if any specifier in the list leads to a node that has no characters in its JSON string value, JSON_NULL will be returned instead of JSON_INVALID — no matter what might come after it in the specifiers list. For example in:

string example = "{\"parent\":,}"; 
string test1 = llJsonGetValue(example, ["parent"]);
string test2 = llJsonGetValue(example, ["parent that doesn't exist", "etc."]);

test1 and test2 will both be JSON_NULL.

The same is true for arrays: a specifier pointing to any element of a valid array will return JSON_NULL anywhere there is no JSON value specified at given position in the array:

string example = "{\"parent\":[ , ,  , , ]}";
string test1 = llJsonGetValue(example, ["parent", 2]);
string test2 = llJsonGetValue(example, ["parent", 1, "child that does not exist", "etc."]);

test1 and test2 will also both be JSON_NULL.

These types of scenarios may be useful in determining valid JSON paths for the setter: in all the scenarios where JSON_NULL is returned, the same set of specifiers is able to be successfully used for llJsonSetValue. However if JSON_INVALID is returned, there is no guarantee the setter won't return an error.

When I run this script:

        string example = "{\"parent\":[ , ,  , , ]}";
        string test1 = llJsonGetValue(example, ["parent", 2]);
        string test2 = llJsonGetValue(example, ["parent", 1, "child that does not exist", "etc."]);
        example = "{\"parent\":,}"; 
        test1 = llJsonGetValue(example, ["parent"]);
        test2 = llJsonGetValue(example, ["parente that doesn't exist", "etc."]);

I get:

[07:58:42] Object: %EF%B7%90
[07:58:42] Object: %EF%B7%90
[07:58:42] Object: %EF%B7%90
[07:58:42] Object: %EF%B7%90

that is, JSON_INVALID four times, I wonder if it's a change of mind or a bug. However, given the attention (or lack thereof) I've received in my bug report in BUG-6495 I'm unsure if reporting it as a bug is a good idea. Maybe that section should just be removed? --Sei Lisa 08:07, 2 July 2014 (PDT)
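Whatever the intended behaviour is for malformed input, scripts can guard against both sentinels. Below is a minimal sketch of a getter wrapper; getOrDefault is a hypothetical helper name of my own, and the test data is built with llList2Json so that the input is well-formed.

```lsl
// Hypothetical helper: returns fallback when the path is missing
// (JSON_INVALID) or resolves to a JSON null (JSON_NULL).
string getOrDefault(string json, list path, string fallback)
{
    string value = llJsonGetValue(json, path);
    if (value == JSON_INVALID || value == JSON_NULL)
        return fallback;
    return value;
}

default
{
    state_entry()
    {
        string example = llList2Json(JSON_OBJECT,
            ["parent", llList2Json(JSON_ARRAY, [10, 20, 30])]);

        llOwnerSay(getOrDefault(example, ["parent", 2], "(none)"));
        llOwnerSay(getOrDefault(example, ["parent that doesn't exist", "etc."], "(none)"));
    }
}
```

Treating both return values the same way means the script does not depend on which of the two a given server version produces for a bad path.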

Performance issues

I did some performance testing, comparing the use of string, list, and JSON functions. Detailed results are on my page (click my name below), but the summary is:

  • List searching and extraction is MUCH faster than llJsonGetValue (well over an order of magnitude)
  • Creating JSON strings by hand or through string concatenation is FAR more efficient than using llList2Json.
  • For extracting multiple values from a JSON object, it's substantially faster (but probably not an order of magnitude faster) to use llJson2List once, then extract data from the list.
  • The LSL string manipulation functions (llSubStringIndex, llGetSubString, llDeleteSubString, llInsertString) are individually slower than JSON functions, but for performing any full operation on a JSON string (i.e. search for an old value, replace it with a new one), the JSON functions almost certainly are more efficient.
  • Compared with using lists, using JSON strings is a lot slower. However, it may be more efficient in memory usage, and certainly can make for much more readable code and data, and this may be sufficient to offset the performance penalty.
  • When using list functions, let LSL convert a string to a one-element list automatically (list2 += "a" beats list2 += ["a"])
  • llList2Integer(...) is faster than (integer)llList2String(...)

I hope the new virtual world LL recently announced it's working on will have a scripting language that supports real object datatypes... hope hope... Brattle Resident 06:09, 11 July 2014 (PDT)
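A minimal sketch of the third point in the list above (convert once with llJson2List, then read from the list). The object contents are invented, and the sketch assumes llJson2List returns a JSON object as a strided [key, value, ...] list, as documented.

```lsl
default
{
    state_entry()
    {
        string json = llList2Json(JSON_OBJECT, ["name", "Betty", "score", 42, "rank", 3]);

        // One conversion, then ordinary list lookups.
        list kv = llJson2List(json);

        // Keys sit at even indices, values at the following odd index.
        integer i = llListFindList(kv, ["score"]);
        if (i != -1 && i % 2 == 0)
            llOwnerSay("score = " + llList2String(kv, i + 1));

        i = llListFindList(kv, ["rank"]);
        if (i != -1 && i % 2 == 0)
            llOwnerSay("rank = " + llList2String(kv, i + 1));
    }
}
```

The even-index check is there because llListFindList returns only the first match, and that match could be a string value that happens to equal the key being searched for.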

Why, oh why, have these things always such a messy implementation? *sigh*

I'm aware that nobody dared to make a comment against the current JSON implementation in LSL in the past decade or so, so let me begin, in the very vague hope that, one day, one Linden reads this and thinks twice.

Let's start with basics. When LL started doing their first implementations to allow communications between SL and the external world, they used... email. The point being that email is "fast enough" (to send back statistics and such), but, more importantly, you can just do all the messaging with plain ASCII on a single line, with a few separators. That's great, since that can easily be fed into a list, the only 'compound type' they deemed to support (nothing against lists, mind you — so long as some of the functions retrieving information from them are as fast as, say, retrieving a memory position from an address [there is some sarcasm in that sentence, if you failed to notice it]).

Then came XML-RPC, in 2004 if my memory doesn't fail me. XML-RPC is tricky, because XML is very chatty, comparatively speaking, and notoriously hard to parse (except for the most basic of commands). Still, it had the advantage of being reasonably standardised, and there existed lots of popular libraries to deal with it, which would ease both LL's own development and external development by scripters. Thanks to XML-RPC, we got what is today the Marketplace, which is an astonishing accomplishment, considering how it was first set up.

Both email and XML-RPC had a major drawback, however: they went through a single gateway, which became the bottleneck for messaging, and, once these servers went down, there would be no communications any longer. Clearly, something new had to be implemented.

LL also recommended that XML-RPC messages were not fully parsed as XML, since, well, XML will take way too much memory and require a lot of processing time to decode. So, instead of providing ready-to-use XML-parsing functions (which would be the reasonable route to go, even before the service went public), LL suggested that external servers dealt with all complex communications, "translating" them to plain, field-separated, compact ASCII one-liners. Those, again, would easily be parsed by the existing list functions, and, therefore, LL didn't need to worry about introducing more complex types. It also meant that all communications were in plain text, simply because it was impractical to implement even the simplest encryption algorithms in LSL (you can do that as an exercise, but don't expect it to be efficient enough to handle many requests...); you got MD5 after a time, but, by then, MD5 was already being phased out for anything relying on the least amount of security.

With bare-bones HTTP, both outbound and inbound, LL made this even easier: since now you had full control over the format of your home-grown communications protocol, well, then it could all be simple one-liners, fitting in (at most) 2 KBytes (or thereabouts), because that was enough to do some simple messaging (in near-to-real-time, as opposed to the email-out/XML-RPC-in alternative). If you wanted something more complex, field-separated structured (ASCII) data could bridge the gap between 'serious' Web services and Second Life.

The problem is that this wasn't (even by the standards back then!) the way most communications worked. These relied on well-known, structured data (such as XML, and then JSON, and so forth), which could then be directly converted to objects in the programming language of choice, thus making the whole communication process very abstract: you just sent an object over the 'net and expected the other end to receive it and decode it to allow communication. That works on almost all contemporary programming languages (in one form or the other), but, of course, when you just have one compound/complex type, that's asking too much from LSL.

LL just told scripters to simply do the kind of translations between communication protocols on the server side, not in LSL. As a bonus, we started getting HTTPS communications, and could even produce SHA256 hashes without crashing the LSL virtual memory, so a semblance of secure communications could be implemented.

While there are many alternatives these days, JSON became a de facto standard, mostly because JavaScript is the programming language running on more than 3 billion devices, and, at the very least, communicating with JSONified objects should be an acceptable alternative which 'works everywhere'. Well-intentioned scripters therefore demanded that LL supported 'at least' JSON, never mind things such as BSON, MessagePack or Protocol Buffers, or even YAML, or, well, any one of the hundreds of serialisation formats out there.

Anyway. Enough ranting. My point here is just to acknowledge JSON's prevalence over "anything else" used for communications (at least at the time I'm writing this), with XML (probably) on the second place. We most definitely need the ability to read & write JSON and do it efficiently. That's the main point here: efficiently. What we have right now is a mess which works (sure) but it requires so many workarounds to deal with the many possible issues, that it's (currently) easiest to simply parse the (relevant) data manually...

While I'm quite aware that it's hard to implement JSON marshaling/unmarshaling in a programming language that has just one non-basic type (namely, lists...), I believe that the LL-proposed suggestion of assigning JSON objects to LSL lists is not a good idea: it adds several layers of complexity in order to deal with the lack of recursive structures in LSL.

Instead, it might be far easier to support and maintain a new kind of object instead — call it (why not?) the json type. It would be completely opaque to programmers (like lists are), and just support a X-Path-like convention to address items inside it (get/set/replace/delete); now that LL brought some regular expressions into LSL, it could easily be adapted for X-Path access to JSON objects. And these would also be easily marshalled/unmarshalled (or encoded/decoded) into "normal" strings, so that no extra functions/variables would be required to call those.

Something like this (no error checking whatsoever, but you get the idea):

string callURL = "https://my.api.URL/that/returns/json";
key handle;    // handle of the pending HTTP request

default
{
    state_entry()
    {
        // We make the request using a string with JSON
        string request = "{ \"apiCall\": \"fake\", \"ownerUUID\": \"" + (string)llGetOwner() + "\" }";
        handle = llHTTPRequest(callURL, [
                HTTP_METHOD, "POST",
                HTTP_MIMETYPE, "application/json",
                HTTP_ACCEPT, "application/json"
            ], request);
    }

    http_response(key request_id, integer status, list metadata, string body)
    {
        if (request_id == NULL_KEY || request_id != handle)
            return; // exit if unknown
        if (status == 200)
        {
            // We got the expected response! The body ought to have a JSON string.
            // Imagine we receive something like:
            // {
            //     "status": 200,
            //     "message": "Owner found!",
            //     "data": [{
            //             "item": "I have this item in inventory",
            //             "itemdate": "2015-01-12",
            //             "uuid": "a1f3f9f1-49da-4a5c-a719-f66a498245c2"
            //         },
            //         {
            //             "item": "This is a different item altogether",
            //             "itemdate": "2020-03-12",
            //             "uuid": "3c27cfdb-f5e9-4e9d-99f8-25f3ffd38fe2"
            //         },
            //         {
            //             "item": "I also have this one in inventory",
            //             "itemdate": "2020-02-20",
            //             "uuid": "3060fa30-e59e-45ee-8746-455e4fde7d68"
            //         }
            //     ],
            //     "timestamp": "2016-08-14T02:34:56-06:00"
            // }

            // create new JSON object, using the body as a string to pass to the proposed function call.
            json myJSON = llJSONObject(body);   // technically: myJSON = (json)body;

            // get the message:
            string message = llJSONFindOne(myJSON, "message");
            llSay(PUBLIC_CHANNEL, message);

            // get the status from the JSON body (casting to integer before assigning);
            // named jsonStatus so that it does not shadow the HTTP status parameter:
            integer jsonStatus = (integer) llJSONFindOne(myJSON, "status");
            llSay(PUBLIC_CHANNEL, "Status code was: " + (string) jsonStatus);

            // get all selected fields as a list:
            list allUUIDs = llJSONFind(myJSON, "//data/*/uuid");    // note the XPath!
            // returns a list with [ a1f3f9f1-49da-4a5c-a719-f66a498245c2, 3c27cfdb-f5e9-4e9d-99f8-25f3ffd38fe2, 3060fa30-e59e-45ee-8746-455e4fde7d68 ]

            llSay(PUBLIC_CHANNEL, llList2CSV(allUUIDs));
            // Outputs (as expected):
            // a1f3f9f1-49da-4a5c-a719-f66a498245c2,3c27cfdb-f5e9-4e9d-99f8-25f3ffd38fe2,3060fa30-e59e-45ee-8746-455e4fde7d68

            // return only item names that are more recent than the timestamp provided:
            string timeStamp = llJSONFindOne(myJSON, "timestamp");
            // items only have YYYY-MM-DD (10 characters), so we cut the timeStamp short:
            string justDate = llGetSubString(timeStamp, 0, 9);
            list recentItems = llJSONFind(myJSON, "//data/*/item,itemdate>" + justDate);
            // returns a strided list with:
            // [ "This is a different item altogether", "2020-03-12",  "I also have this one in inventory", "2020-02-20" ]
            // which could then be said in public chat, etc.

            // And clean up with:
            // note that this would release myJSON from being accessed via the JSON functions, but `body` would
            // still have the original data in string form, and could be re-accessed again
            // (by calling llJSONObject() again).
        }
        else
        {
            llSay(PUBLIC_CHANNEL, "Error " + (string) status + " - please try later?");
            // add timer or something to try again later...
        }
    }
}

Note that I haven't really worried much about actually constructing a JSON object (for a reply). In my experience, it's almost always trivial to assemble such objects from scratch, again simply because LSL does not support creating new types. As such, whatever data we have to send offworld will be kept in "primitive" data types (and possibly in lists), all of which are quite easily cast into strings that can be inserted into a JSON scaffolding. There is really no point in doing anything more complex than that. Not before LSL supports user-defined types, that is (which will probably never happen).

Ironically, LL's approach seems to be more worried with 'strict compliance' to the JSON standards in terms of creating corresponding structures/types in LSL that are equivalent to JSON's own object types. But isn't all that simply unnecessary overhead? I mean, since, as said, we cannot really create new types in LSL, so what would be the point of marshalling/encoding JSON data into an "LSL pseudo-type" — implemented using lists — if you cannot access it anyway as a fully-implemented JSON struct/type?

I'd say all of that extra functionality is overkill, which just makes things slow (as reported here in the comments), slower than manually parsing JSON, and unnecessarily complicates things. In fact, over the past two decades, what I have always done was to parse JSON server-side, and just feed a much simpler format to in-world scripts (possibly in CSV, or at least something which can be parsed with llParseString2List, or even with simple string manipulation...). I do welcome the concept of having JSON be directly parseable inside LSL, but... not at the cost of making the whole script much more complicated to write, maintain, debug, and, especially, slower to execute.
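For readers unfamiliar with that server-side approach, here is a minimal sketch of what the in-world half might look like. The pipe-separated "status|message|uuid,uuid" layout is purely an assumption made up for this example, and the body is hard-coded to stand in for what http_response would deliver.

```lsl
default
{
    state_entry()
    {
        // Stand-in for an http_response body; the server is assumed to
        // have already flattened its JSON into "status|message|uuid,uuid".
        string body = "200|Owner found!|a1f3f9f1-49da-4a5c-a719-f66a498245c2,3c27cfdb-f5e9-4e9d-99f8-25f3ffd38fe2";

        list fields = llParseString2List(body, ["|"], []);
        integer status = (integer)llList2String(fields, 0);
        string message = llList2String(fields, 1);
        list uuids = llCSV2List(llList2String(fields, 2));

        llOwnerSay(message + " (status " + (string)status + "), "
            + (string)llGetListLength(uuids) + " item(s)");
    }
}
```

One caveat with this format: llParseString2List drops empty fields, so the server either has to guarantee non-empty values or the script should use llParseStringKeepNulls instead.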

My recommendation, therefore, would be to push all this effort back into the drawing board and simply start from scratch. This clearly isn't the correct way to implement things — not for LSL, at least.

Last but not least... 'my' approach even has the advantage that it could be easily extended to other popular formats, such as XML, YAML, or even possibly TOML. In my non-LSL programming activities (these days, mostly done in Go), I use a 'generic XPath' library, which is a top-level abstract representation of the object that has been transmitted over the wire; at a lower level, this might be any one of those protocols mentioned before. At the upper level, though, I don't really need to worry about how the data is actually represented, I just worry how it can be accessed. In the SL ecosystem, for instance, LL not only uses LLSD (a well-defined, structured format formally implemented over XML), but also JSON here and there (allegedly, some of which being even BJSON), plain text (for the REST API used to provide the few statistics they still allow us to access), among others. What this means is that LSL will allow us to access one of those, but we'll still have to rely on our own functions for all the others (assuming we really, really wish to do it in LSL...). Unless, that is, LL is planning to retire LLSD and move everything to JSON (mostly to save bytes during communication). I don't see that happening yet, though; XML die-hard fans argue (very correctly) that XML can be formally validated (using DTD, XSD, or other similar ways of describing XML...), even without writing any special tools (so long as you strictly stick to the rules, valid XML is valid XML, no matter what tool is validating it), while JSON cannot (which is true only to a degree: there are some attempts to do formal JSON validation using "JSON Schema"), much less any of the other possibilities.

Therefore, I would actually have expected LL to implement XML support first (especially full LLSD support!), and JSON only later. But I guess that, these days, most APIs out there are quite happy using JSON (especially when they're intended to be consumed by JavaScript), and JSON has wider support overall (I'd claim that's true as of 2023).

Anyway, just my L$0.02. Or perhaps slightly more than 0.02. :) And sorry for the long ranting, and thanks for reading everything until the very end.

Gwyneth Llewelyn (talk) 08:34, 7 November 2023 (PST)