User:LepreKhaun Resident/Workaround4Escaped Chars within JsonText 0ld
Workaround for Escaped Characters within Json Text ***Old***
NOTE: This is an archived page showing earlier thoughts on this problem. It is kept only for reference, to show how one can initially choose the wrong approach to a problem and worry it to death before the light bulb goes off. Please refer to this page for the final, much more elegant workaround.
[ETA: The reasoning here was correct; it was just the approach that was faulty. I kept thinking it had to be kludged. :=)]
Because of the way LSL handles strings (we have no "raw strings", which are taken as written and not messed with), escape sequences such as \t are interpreted for us (as 4 spaces, in this case) as soon as they are encountered. Trying to encode \t by escaping the escape character (using \\t) results in \\t being (incorrectly) placed within your Json text. The same goes for newlines, \n.
And it's worse when you try to encode something like \"Stop!\" he shouted. or She said \"No\". And a Unicode escape such as \u7650 is perfectly valid within a Json text but is elusive to obtain using LSL strings.
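To make the string handling described above concrete, here is a minimal sketch that reports what LSL actually stores for a couple of literals. It is an illustration only, not part of the original workaround. The expected values in the comments follow from the behaviour described on this page (\t stored as four spaces, \\t stored as a backslash followed by t) and should be checked in-world.

<lsl>
default
{
    state_entry()
    {
        // "\t" is translated at compile time; per the text above
        // it is stored as four spaces, so the length should be 4.
        llOwnerSay((string)llStringLength("\t"));

        // "\\t" is stored as two characters: a backslash and a 't'.
        llOwnerSay((string)llStringLength("\\t"));

        // llEscapeURL() makes the stored characters visible.
        llOwnerSay(llEscapeURL("\t"));   // expected: %20%20%20%20
        llOwnerSay(llEscapeURL("\\t"));  // expected: %5Ct
    }
}
</lsl>

The second form is the one that ends up doubled once it passes through llList2Json(), which is the problem the rest of this page tries to work around.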
Here's the only workaround I've been able to come up with. A kludge, granted, but at least it allows one to encode something like this:
<lsl>
// Target Json text:
// {
//     "A": "\b\f\t\r \n aba \u0000", "B": "\"he\"", "C": "\t"
// }

// URL-escape each value first so the backslashes survive Json encoding
string i = llEscapeURL("\\b\\f\\t\\r \\n aba \\u0000");
string j = llEscapeURL("\\\"he\\\"");

string jText;

jText = llList2Json(JSON_OBJECT, ["A", i, "B", j]);
jText = llJsonSetValue(jText, ["C"], llEscapeURL("\\t"));

// After unescaping the whole Json text:
// {"A":"\b\f\t\r \n aba \u0000","B":"\"he\"","C":"\t"}
jText = llUnescapeURL(jText);
</lsl>
[ETA: 9/13/2013]
So, I was working on the LSL string mishandling of the escape character (\) and, looking deeper into it, have not only found another workaround but made an exciting discovery: LSL does have "raw strings" of a sort, and they are the JSON_STRING! First, the alternate workaround:
<lsl>
// NOTE: Deprecated 9/19/2013 and replaced by
// uList2Json() and uJsonSetValue()

// Global constants
integer QUOTE = 0; // '\"' (Double Quote)
integer SLOSH = 1; // '\\' (Reverse Solidus)
integer SLASH = 2; // '\/' (Solidus)
integer BP = 3;    // '\b' (Backspace)
integer FF = 4;    // '\f' (Form Feed)
integer NL = 5;    // '\n' (New Line)
integer CR = 6;    // '\r' (Carriage Return)
integer TAB = 7;   // '\t' (Tab)
integer U_ = 8;    /* '\u' (Unicode Prefix- MUST immediately precede
                      a string of 4 Hex digits, 0-F, sans '0x') */

// Optional, included for completeness only
integer CRLF = 9;  // '\r\n' (Windows end-of-line)

//////////////////////////////
// function string uList2JsonStringSafe (list jasonStringParts)
// This function takes a list, jasonStringParts,
// of the parts of the Json string one wishes and
// returns a LSL string within double quotes ("")
// with embedded escape characters within it that
// correctly encodes as a Json string using either
// llList2Json() or llJsonSetValue().
//
// NOTE: Deprecated 9/19/2013 and replaced by
// uList2Json() and uJsonSetValue()
// Version 1.0 by LepreKhaun 9/9/2013
// May be freely used, modified and distributed with this header intact.
// Compiled Size = 2,088 bytes
///////////////////////////////
string uList2JsonStringSafe (list jasonStringParts) {
    list escapeCodes = ["%5C%22", "%5C%5C", "%5C/", "%5Cb", "%5Cf", "%5Cn", "%5Cr", "%5Ct", "%5Cu", "%5Cr%5Cn"];
    integer iter = llGetListLength(jasonStringParts);

    // rString must be enclosed with escaped double quotes
    // to keep the LSL String "enhanced features" out of play
    string rString = "\"";

    // build return string 'backwards'
    while (~--iter) {
        if (llGetListEntryType(jasonStringParts, iter) == TYPE_INTEGER) {
            // substitute encoding for integer constants
            rString = llList2String(escapeCodes, llList2Integer(jasonStringParts, iter)) + rString;
        } else {
            // escape String chunks to preserve them properly
            rString = llEscapeURL(llList2String(jasonStringParts, iter)) + rString;
        }
    }
    return llUnescapeURL("\"" + rString);
}

///////////
// Example encodings showing usage
///////////
default {
    touch_end(integer i) {
        string jsonString;
        string jsonText;

        // To encode '{"A":"\"Go!\" he yelled.\nShe replied \"No!\"","Z":"\\escaped \\ slosh\\"}'
        jsonString = uList2JsonStringSafe([QUOTE, "Go!", QUOTE, " he yelled.", NL, "She replied ", QUOTE, "No!", QUOTE]);
        jsonText = llList2Json(JSON_OBJECT, ["A", jsonString]);
        jsonString = uList2JsonStringSafe([SLOSH, "escaped ", SLOSH, " slosh", SLOSH]);
        jsonText = llJsonSetValue(jsonText, ["Z"], jsonString);
        llOwnerSay(jsonText);

        // To encode '{"Control Chars":"\b\r\f\n\t and Windows uses \r\n for EOL","©":"\u00A9"}'
        jsonString = uList2JsonStringSafe([BP, CR, FF, NL, TAB, " and Windows uses ", CRLF, " for EOL"]);
        jsonText = llList2Json(JSON_OBJECT, ["Control Chars", jsonString]);
        jsonString = uList2JsonStringSafe([U_, "00A9"]);
        jsonText = llJsonSetValue(jsonText, ["©"], jsonString);
        llOwnerSay(jsonText);

        // To encode '["WebSite","http:\/\/my.com\/ask.php?what%20is%20it","\t"]'
        jsonString = uList2JsonStringSafe(["http:", SLASH, SLASH, "my.com", SLASH, "ask.php?what%20is%20it"]);
        jsonText = llList2Json(JSON_ARRAY, ["WebSite", jsonString]);
        jsonText = llJsonSetValue(jsonText, [JSON_APPEND], uList2JsonStringSafe([TAB]));
        llOwnerSay(jsonText);
    }
}
</lsl>
How and why this approach works is based on an earlier observation of mine: Json texts (LSL strings enclosed within '{}' or '[]') were being handled differently from other LSL strings, in that their enclosed escape codes (such as \t) were not being translated (to %09 or %20%20%20%20), a "feature" LSL strings have.
I then noticed a difference in definitions between RFC 4627 and JSON.org. The RFC defines a Json text to be either an array or an object but at json.org it's defined as any Json Value, including the JSON_STRING. And a JSON_STRING is defined, of course, as being enclosed within double quotes (""). So I began experimenting with that type of LSL string and found the same exception to "enhanced features" was afforded!
But then another problem surfaced: the LSL functions llJsonGetValue() and llJson2List() extract a JSON_STRING as a regular LSL string, resulting in these escaped character sequences being "enhanced" by translation (in other words, \t becomes %09, which is further "enhanced" to %20%20%20%20 when chatted, and \u23B5 becomes u23B5). Grrrrr.... This wasn't good for further processing; we needed a string that preserved these sequences after the extraction.
And that led to the development of uJsonGetValueSafe(), which returns the requested Value explicitly enclosed within double quotes (""), just as it appears within the Json text...
And, of course, this was complicated by the RFC stating:
Insignificant whitespace is allowed before or after any of the six structural characters.
ws = *(
          %x20 /              ; Space
          %x09 /              ; Horizontal tab
          %x0A /              ; Line feed or New line
          %x0D                ; Carriage return
      )
Hooboy!