Difference between revisions of "User:LepreKhaun Resident/Workaround4Escaped Chars within JsonText"

From Second Life Wiki
(Added User Function uList2JsonStringSafe and observations)
((Made while condition more readable))
 
(10 intermediate revisions by 3 users not shown)
Line 1: Line 1:
===Workaround for Escaped Characters within Json Text===
+
['''NOTE:''' Pages within my Name Space are a WIP and constantly changing. As my understanding of the problems I attempt to address and the grasp of the subject matter itself deepens, I regularly review what I have written and update the content as better algorithms occur to me.
  
Because of the way LSL handles strings (we have no "raw strings", which are taken as written and not messed with), escape sequences such as "\t" are interpreted as 4 spaces for us as soon as they are encountered. Trying to encode "\t" by escaping the escape character (using "\\t") results in (incorrectly) placing '\\t' within your Json text. Same for newlines, "\n".
+
However, for this process of refinement, improvement and tweaking to result in something that might (hopefully!) benefit the community at large, I ask that comments, suggested improvements, corrections of fact or your own personal style preferences be made ONLY on the Discussion Pages within my Name Space. Thank you!]
  
And it's worse when you try to encode something like "\"Stop!\" he shouted." or "She said \"No\"". And UTF encoding such as "\u7650" is perfectly valid within a Json text but is elusive to obtain using LSL strings.
+
=== uList2Json() and uJsonSetValue()===
  
Here's the only workaround I've been able to find. A kludge, granted, but at least it allows one to encode something like this:
+
As many of you may be aware, LSL has a habit of "enhancing" strings. This is regarded as a "feature" of the language and usually works out for the best, since it lets one format chatted text with "\t" and "\n". Unfortunately, there was no way to opt out of this behavior. Put in computerese, LSL simply lacked "raw strings".
  
<pre>{
+
This has bedeviled those working with Json text, either for web communications or developing other uses for it, because some strings just wouldn't encode properly. That is to say, these are all perfectly valid Json strings that simply couldn't be directly formed with llList2Json() and llJsonSetValue():
"A": "\b\f\t\r \n aba \u0000",
+
*"\"Go!\" he yelled.\n"
"B": "\"he\"",
+
*"She replied \"No!\""
"C": "\t"
+
*"Copyright symbol is \u00A9"
}
+
*"oops]"
 +
*"Control characters are \t\n\r\f\b"
  
        string jText;
+
I spent a few weeks studying the problem, [[User:LepreKhaun_Resident/Workaround4Escaped_Chars_within_JsonText_0ld|most of it going about it the wrong way]], before having an epiphany. A one-line addition Maestro Linden made to [[Json_usage_in_LSL|Json Usage in LSL]] on the 10th ("LSL strings which both begin and end with "\"" are interpreted literally as JSON strings, while those without are parsed when converted into JSON.") confirmed what I had begun to surmise: a Json string (that is, an LSL string further enclosed within double quotes) is a "raw string"! Once I had that in hand, the following two functions practically wrote themselves.
        string i = llEscapeURL("\\b\\f\\t\\r \\n aba \\u0000");
+
        string j = llEscapeURL("\\\"he\\\"");
+
       
+
        jText = llList2Json(JSON_OBJECT, ["A", i, "B", j]);
+
        jText = llJsonSetValue(jText, ["C"], llEscapeURL("\\t"));
+
        jText = llUnescapeURL(jText);</pre>
+
  
'''jText == {"A":"\b\f\t\r \n aba \u0000","B":"\"he\"","C":"\t"}'''
 
  
----
+
<div><lsl>//////////////////////////////
[ETA: 9/13/2013]
+
// function string uList2Json (string type, list values)
 +
// This function takes the exact same parameters as
 +
// llList2Json() but correctly encodes all possible strings
 +
// including those with escape characters within them.
 +
//
 +
// Initial strings must escape all instances of the
 +
// desired escape character itself
 +
// (ie "\\t" => '\t', "\\\\" => '\\', "\\/" => '\/')
 +
// as well as any double quotes ("\\\"" => '\"')
 +
//
 +
// Version 1.0 by LepreKhaun 9/19/2013
 +
// May be freely used, modified and distributed with this header intact.
 +
///////////////////////////////
 +
string uList2Json (string type, list values)
 +
{
  
So, I was working on LSL's mishandling of the escape character ("\") in strings and, looking deeper into it, not only found another workaround but made an exciting discovery: LSL does have "raw strings" of a sort, and they are the JSON_STRING! First, the alternate workaround:
+
integer iter = -1;
 +
integer listLength = llGetListLength(values);
 +
 +
// Step through list, hitting every other item if JSON_OBJECT
 +
integer step = 1 + (type == JSON_OBJECT);
 +
while ((iter += step) < listLength)
 +
// necessary so we don't choke on next if test
 +
if (llGetListEntryType(values, iter) == TYPE_STRING)
 +
// make sure it is not a JSON_* Value or a Number
 +
if (llJsonValueType(llList2String(values, iter), []) == JSON_INVALID)
 +
values = llListReplaceList(values, ["\"" + llList2String(values, iter) + "\""], iter, iter);
  
<lsl>// Global constants
+
return llList2Json(type, values);
integer QUOTE = 0; // '\"' (Double Quote)
+
}
integer SLOSH = 1; // '\\' (Reverse Solidus)
+
integer SLASH = 2; // '\/' (Solidus)
+
integer BP = 3; // '\b' (Backspace)
+
integer FF = 4; // '\f' (Form Feed)
+
integer NL = 5; // '\n' (New Line)
+
integer CR = 6; // '\r' (Carriage Return)
+
integer TAB = 7; // '\t' (Tab)
+
integer U_ = 8; /* '\u' (Unicode Prefix- MUST immediately precede
+
  a string of 4 Hex digits, 0-F, sans '0x') */
+
// Optional, included for completeness only
+
integer CRLF = 9; // '\r\n' (Windows end-of-line)
+
  
 
//////////////////////////////
 
//////////////////////////////
// function string uList2JsonStringSafe (list jasonStringParts)
+
// function string uJsonSetValue ( string json, list specifiers, string value )
// This function takes a list, jasonStringParts,
+
// This function takes the exact same parameters as
// of the parts of the Json string one wishes and
+
// llJsonSetValue() but correctly encodes all possible strings
// returns a LSL string within double quotes ("")
+
// including those with escape characters within them.
// with embedded escape characters within it that
+
// correctly encodes as a Json string using either
+
// llList2Json() or llJsonSetValue().
+
 
//
 
//
// Version 1.0 by LepreKhaun 9/9/2013
+
// Initial strings must escape all instances of the
 +
// desired escape character itself
 +
// (ie "\\t" => '\t', "\\\\" => '\\', "\\/" => '\/')
 +
// as well as any double quotes ("\\\"" => '\"')
 +
//
 +
// NOTE: To encode a Float or Integer as a String
 +
// within the Json text, enclose it with escaped quotes
 +
// (ie '"3"' => '3' BUT '"\"3\""' => '"3"')
 +
//
 +
// Version 1.0 by LepreKhaun 9/19/2013
 
// May be freely used, modified and distributed with this header intact.
 
// May be freely used, modified and distributed with this header intact.
// Compiled Size = 2,088 bytes
 
 
///////////////////////////////
 
///////////////////////////////
string uList2JsonStringSafe (list jasonStringParts)
+
string uJsonSetValue(string json, list specifiers, string value)
 
{
 
{
list escapeCodes = ["%5C%22", "%5C%5C", "%5C/", "%5Cb", "%5Cf", "%5Cn", "%5Cr", "%5Ct", "%5Cu", "%5Cr%5Cn"];
+
// We don't want to change the string representation of  
 
+
// an integer, a float or any Json Value Type
integer iter = llGetListLength(jasonStringParts);
+
if (llJsonValueType(value, []) == JSON_INVALID)
+
value = "\"" + value + "\"";
// rString must be enclosed with escaped double quotes
+
return llJsonSetValue(json, specifiers, value);
// to keep the LSL String "enhanced features" out of play
+
string rString = "\"";
+
+
// build return string 'backwards'
+
while (~--iter)
+
{
+
if(llGetListEntryType(jasonStringParts, iter) == TYPE_INTEGER)
+
{
+
// substitute encoding for integer constants
+
rString = llList2String(escapeCodes, llList2Integer(jasonStringParts, iter)) + rString;
+
}
+
else
+
{
+
// escape String chunks to preserve them properly
+
rString = llEscapeURL(llList2String(jasonStringParts, iter)) + rString;
+
}
+
}
+
return llUnescapeURL("\"" + rString);
+
 
}
 
}
 +
  
 
///////////
 
///////////
// Example encodings showing usage
+
// Examples showing usage
 
///////////
 
///////////
 
+
 
default
 
default
 
{
 
{
 
touch_end(integer i)
 
touch_end(integer i)
 
{
 
{
string jsonString;
+
string temp;
 
string jsonText;
 
string jsonText;
 
 
 
// To encode '{"A":"\"Go!\" he yelled.\nShe replied \"No!\"","Z":"\\escaped \\ slosh\\"}'
 
// To encode '{"A":"\"Go!\" he yelled.\nShe replied \"No!\"","Z":"\\escaped \\ slosh\\"}'
jsonString = uList2JsonStringSafe([QUOTE, "Go!", QUOTE, " he yelled.", NL, "She replied ", QUOTE, "No!", QUOTE]);
+
jsonText = uList2Json (JSON_OBJECT, [
jsonText = llList2Json(JSON_OBJECT, ["A", jsonString]);
+
"A", "\\\"Go!\\\" he yelled.\\nShe replied \\\"No!\\\"",  
jsonString = uList2JsonStringSafe([SLOSH, "escaped ", SLOSH, " slosh", SLOSH]);
+
"Z", "\\\\escaped \\\\ slosh\\\\"         //"//wiki syntax highlighter kludge
jsonText = llJsonSetValue(jsonText, ["Z"], jsonString);
+
]);
 
llOwnerSay(jsonText);
 
llOwnerSay(jsonText);
 
 
 
// To encode '{"Control Chars":"\b\r\f\n\t and Windows uses \r\n for EOL","©":"\u00A9"}'
 
// To encode '{"Control Chars":"\b\r\f\n\t and Windows uses \r\n for EOL","©":"\u00A9"}'
jsonString = uList2JsonStringSafe([BP, CR, FF, NL, TAB, " and Windows uses ", CRLF, " for EOL"]);
+
jsonText = uList2Json(JSON_OBJECT, [
jsonText = llList2Json(JSON_OBJECT, ["Control Chars", jsonString]);
+
"Control Chars", "\\b\\r\\f\\n\\t and Windows uses \\r\\n for EOL",
jsonString = uList2JsonStringSafe([U_, "00A9"]);
+
"©", "\\u00A9"
jsonText = llJsonSetValue(jsonText, ["©"], jsonString);
+
]);
 
llOwnerSay(jsonText);
 
llOwnerSay(jsonText);
 
 
 
// To encode '["WebSite","http:\/\/my.com\/ask.php?what%20is%20it","\t"]'
 
// To encode '["WebSite","http:\/\/my.com\/ask.php?what%20is%20it","\t"]'
jsonString = uList2JsonStringSafe(["http:", SLASH, SLASH, "my.com", SLASH, "ask.php?what%20is%20it"]);
+
jsonText = uList2Json(JSON_ARRAY, [
jsonText = llList2Json(JSON_ARRAY, ["WebSite", jsonString]);
+
"WebSite",
jsonText = llJsonSetValue(jsonText, [JSON_APPEND], uList2JsonStringSafe([TAB]));
+
"http:\\/\\/my.com\\/ask.php?what%20is%20it",
 +
"\\t"
 +
]);
 
llOwnerSay(jsonText);
 
llOwnerSay(jsonText);
 +
 +
// Make a Json object...
 +
temp = uList2Json(JSON_OBJECT, [
 +
"A", 99,
 +
"Z", "88]",
 +
"C", JSON_TRUE
 +
]);
 +
// ... add it to end of the array ...
 +
jsonText = uJsonSetValue(jsonText, [JSON_APPEND], temp);
 +
// ... change our web address ...
 +
jsonText = uJsonSetValue(jsonText, [1], "http:\\/\\/www.google.com");
 +
// ... change that TAB in the third spot to PI
 +
jsonText = uJsonSetValue(jsonText, [2], (string)PI);
 +
// ... and add a new "Key":Value pair to our object
 +
jsonText = uJsonSetValue(jsonText, [3, "New"], ((string)PI + "\\n"));
 +
 +
//  ["WebSite","http:\/\/www.google.com",3.141593,{"A":99,"C":true,"New":"3.141593\n","Z":"88]"}]
 +
llOwnerSay(jsonText);
 +
 
 
}
 
}
}</lsl>
+
}</lsl></div>
 
+
How and why this approach works rests on an earlier observation of mine: Json text (LSL strings enclosed within '{}' or '[]') was being handled differently from other LSL strings, in that its enclosed escape codes (such as '\t') were not being translated (to '%09' or '%20%20%20%20'), as the LSL string "feature" would normally do.
+
 
+
I then noticed a difference in definitions between [http://tools.ietf.org/html/rfc4627 RFC 4627] and [http://www.json.org/ JSON.org]. The RFC defines a Json text to be either an array or an object but at json.org it's defined as any Json Value, including the JSON_STRING. And a JSON_STRING is defined, of course, as being enclosed within double quotes (""). So I began experimenting with that type of LSL string and found the same exception to "enhanced features" was afforded!
+
 
+
But then another problem surfaced: the LSL functions llJsonGetValue() and llJson2List() extract a JSON_STRING as a regular LSL string, so these escaped character sequences get "enhanced" by translation (in other words, '\t' becomes '%09', which is further "enhanced" to '%20%20%20%20' when chatted, and '\u23B5' becomes 'u23B5'). Grrrrr.... This wasn't good for further processing; we needed a string that preserves these sequences after extraction.
+
 
+
And that led to the development of [[LepreKhaun_Resident/Json_Get_Value_Safe|uJsonGetValueSafe()]], which returns the requested Value explicitly enclosed within double quotes (""), just as it appears within the Json text...
+
 
+
And, of course, this was complicated by the RFC stating:
+
<pre> Insignificant whitespace is allowed before or after any of the six
+
  structural characters.
+
 
+
  ws = *(
+
%x20 /   ; Space
+
%x09 /   ; Horizontal tab
+
%x0A /   ; Line feed or New line
+
%x0D ; Carriage return
+
  )</pre>
+
Hooboy!
+
 
+
  
 +
Now, if I can just get the retrieval worked out as simply... ;=)
 
----
 
----
  
 
<center>== [[User:LepreKhaun_Resident|'''More Json Tips, Tricks and Coding Examples''']] ==</center>
 
<center>== [[User:LepreKhaun_Resident|'''More Json Tips, Tricks and Coding Examples''']] ==</center>

Latest revision as of 19:44, 15 October 2013

[NOTE: Pages within my Name Space are a WIP and constantly changing. As my understanding of the problems I attempt to address and the grasp of the subject matter itself deepens, I regularly review what I have written and update the content as better algorithms occur to me.

However, for this process of refinement, improvement and tweaking to result in something that might (hopefully!) benefit the community at large, I ask that comments, suggested improvements, corrections of fact or your own personal style preferences be made ONLY on the Discussion Pages within my Name Space. Thank you!]

uList2Json() and uJsonSetValue()

As many of you may be aware, LSL has a habit of "enhancing" strings. This is regarded as a "feature" of the language and usually works out for the best, since it lets one format chatted text with "\t" and "\n". Unfortunately, there was no way to opt out of this behavior. Put in computerese, LSL simply lacked "raw strings".

This has bedeviled those working with Json text, either for web communications or developing other uses for it, because some strings just wouldn't encode properly. That is to say, these are all perfectly valid Json strings that simply couldn't be directly formed with llList2Json() and llJsonSetValue():

  • "\"Go!\" he yelled.\n"
  • "She replied \"No!\""
  • "Copyright symbol is \u00A9"
  • "oops]"
  • "Control characters are \t\n\r\f\b"

I spent a few weeks studying the problem, most of it going about it the wrong way, before having an epiphany. A one-line addition Maestro Linden made to Json Usage in LSL on the 10th ("LSL strings which both begin and end with "\"" are interpreted literally as JSON strings, while those without are parsed when converted into JSON.") confirmed what I had begun to surmise: a Json string (that is, an LSL string further enclosed within double quotes) is a "raw string"! Once I had that in hand, the following two functions practically wrote themselves.
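
Before the functions, the observation itself can be checked with a minimal sketch. It is my illustration rather than part of the original page: the variable names plain and quoted are made up, and no output is claimed here because the exact result is best confirmed in-world. It simply feeds the same two characters (a backslash followed by t) to llList2Json() twice, once bare and once wrapped in literal double quotes, so the two results can be compared in chat.

<lsl>
// Sketch only: compare how llList2Json() treats an LSL string
// with and without enclosing double quotes.
default
{
    state_entry()
    {
        // In LSL source, "\\t" is the two characters backslash + t.
        string plain = "\\t";

        // The same two characters, wrapped in literal double quotes.
        // Per the note quoted above, this form is taken literally as a Json string.
        string quoted = "\"" + plain + "\"";

        llOwnerSay(llList2Json(JSON_ARRAY, [plain]));
        llOwnerSay(llList2Json(JSON_ARRAY, [quoted]));
    }
}
</lsl>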


<lsl>//////////////////////////////
// function string uList2Json (string type, list values)
// This function takes the exact same parameters as
// llList2Json() but correctly encodes all possible strings
// including those with escape characters within them.
//
// Initial strings must escape all instances of the
// desired escape character itself
// (ie "\\t" => '\t', "\\\\" => '\\', "\\/" => '\/')
// as well as any double quotes ("\\\"" => '\"')
//
// Version 1.0 by LepreKhaun 9/19/2013
// May be freely used, modified and distributed with this header intact.
///////////////////////////////
string uList2Json (string type, list values)
{
    integer iter = -1;
    integer listLength = llGetListLength(values);

    // Step through list, hitting every other item if JSON_OBJECT
    integer step = 1 + (type == JSON_OBJECT);
    while ((iter += step) < listLength)
        // necessary so we don't choke on next if test
        if (llGetListEntryType(values, iter) == TYPE_STRING)
            // make sure it is not a JSON_* Value or a Number
            if (llJsonValueType(llList2String(values, iter), []) == JSON_INVALID)
                values = llListReplaceList(values, ["\"" + llList2String(values, iter) + "\""], iter, iter);

    return llList2Json(type, values);
}

//////////////////////////////
// function string uJsonSetValue ( string json, list specifiers, string value )
// This function takes the exact same parameters as
// llJsonSetValue() but correctly encodes all possible strings
// including those with escape characters within them.
//
// Initial strings must escape all instances of the
// desired escape character itself
// (ie "\\t" => '\t', "\\\\" => '\\', "\\/" => '\/')
// as well as any double quotes ("\\\"" => '\"')
//
// NOTE: To encode a Float or Integer as a String
// within the Json text, enclose it with escaped quotes
// (ie '"3"' => '3' BUT '"\"3\""' => '"3"')
//
// Version 1.0 by LepreKhaun 9/19/2013
// May be freely used, modified and distributed with this header intact.
///////////////////////////////
string uJsonSetValue(string json, list specifiers, string value)
{
    // We don't want to change the string representation of
    // an integer, a float or any Json Value Type
    if (llJsonValueType(value, []) == JSON_INVALID)
        value = "\"" + value + "\"";
    return llJsonSetValue(json, specifiers, value);
}

///////////
// Examples showing usage
///////////

default
{
    touch_end(integer i)
    {
        string temp;
        string jsonText;

        // To encode '{"A":"\"Go!\" he yelled.\nShe replied \"No!\"","Z":"\\escaped \\ slosh\\"}'
        jsonText = uList2Json (JSON_OBJECT, [
            "A", "\\\"Go!\\\" he yelled.\\nShe replied \\\"No!\\\"",
            "Z", "\\\\escaped \\\\ slosh\\\\"         //"//wiki syntax highlighter kludge
        ]);
        llOwnerSay(jsonText);

        // To encode '{"Control Chars":"\b\r\f\n\t and Windows uses \r\n for EOL","©":"\u00A9"}'
        jsonText = uList2Json(JSON_OBJECT, [
            "Control Chars", "\\b\\r\\f\\n\\t and Windows uses \\r\\n for EOL",
            "©", "\\u00A9"
        ]);
        llOwnerSay(jsonText);

        // To encode '["WebSite","http:\/\/my.com\/ask.php?what%20is%20it","\t"]'
        jsonText = uList2Json(JSON_ARRAY, [
            "WebSite",
            "http:\\/\\/my.com\\/ask.php?what%20is%20it",
            "\\t"
        ]);
        llOwnerSay(jsonText);

        // Make a Json object...
        temp = uList2Json(JSON_OBJECT, [
            "A", 99,
            "Z", "88]",
            "C", JSON_TRUE
        ]);
        // ... add it to end of the array ...
        jsonText = uJsonSetValue(jsonText, [JSON_APPEND], temp);
        // ... change our web address ...
        jsonText = uJsonSetValue(jsonText, [1], "http:\\/\\/www.google.com");
        // ... change that TAB in the third spot to PI
        jsonText = uJsonSetValue(jsonText, [2], (string)PI);
        // ... and add a new "Key":Value pair to our object
        jsonText = uJsonSetValue(jsonText, [3, "New"], ((string)PI + "\\n"));

        //  ["WebSite","http:\/\/www.google.com",3.141593,{"A":99,"C":true,"New":"3.141593\n","Z":"88]"}]
        llOwnerSay(jsonText);
    }
}</lsl>
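
The NOTE in the uJsonSetValue() header about Floats and Integers is easy to trip over, so here is a small sketch of it. This is my illustration, not part of the original listing: the key names "asNumber" and "asString" are hypothetical, and uJsonSetValue() is repeated (without its header) only so the snippet compiles on its own. The last two lines report 1 if the stored Value has the type the header note leads one to expect.

<lsl>
// Copy of uJsonSetValue() from the listing above, repeated so this sketch stands alone.
string uJsonSetValue(string json, list specifiers, string value)
{
    if (llJsonValueType(value, []) == JSON_INVALID)
        value = "\"" + value + "\"";
    return llJsonSetValue(json, specifiers, value);
}

default
{
    state_entry()
    {
        // Start from an empty Json object.
        string json = llList2Json(JSON_OBJECT, []);

        // "3" is already a valid Json number, so it is passed through unchanged.
        json = uJsonSetValue(json, ["asNumber"], "3");

        // "\"3\"" carries its own double quotes, making it a valid Json string,
        // so it is also passed through unchanged and stays a string.
        json = uJsonSetValue(json, ["asString"], "\"3\"");

        llOwnerSay(json);
        llOwnerSay((string)(llJsonValueType(json, ["asNumber"]) == JSON_NUMBER));
        llOwnerSay((string)(llJsonValueType(json, ["asString"]) == JSON_STRING));
    }
}
</lsl>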

Now, if I can just get the retrieval worked out as simply... ;=)


== More Json Tips, Tricks and Coding Examples ==