UTF-8 Test

From Second Life Wiki

Tests taken from http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt. Special thanks to Strife Onizuka and Kelly Linden for wrangling them into LSL.

1) Create a script from the script below and a notecard from the notecard below, and put both into a box. Name the box "utf-8 test".

If you have a Unicode font installed, you should see Unicode characters, e.g.

  • utf-8 test: Greek kosme
  • utf-8 test: κόσμε

If you do not have a Unicode font installed, you should see something like boxes, e.g.

  • utf-8 test: Greek kosme
  • utf-8 test: ■■■μ■

2) Let the script finish. It should report that all tests passed.

The Script follows

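string card;
integer lines = -1;
integer line = 0;
list results;
integer pass;
integer fail;

default
{
    state_entry()
    {
        if(llGetInventoryNumber(INVENTORY_NOTECARD)) // assumed guard: start only if a notecard is present
        {
            // The enclosing call is assumed; dataserver expects the line count as its first reply.
            llGetNumberOfNotecardLines(card = llGetInventoryName(INVENTORY_NOTECARD,0));
            llOwnerSay("Test-Starting: "+card);
        }
    }
    on_rez(integer a)
    {
        llResetScript(); // assumed: restart the test
    }
    touch_start(integer a)
    {
        llResetScript(); // assumed: restart the test
    }
    changed(integer a)
    {
        if(a & CHANGED_INVENTORY)
            llResetScript(); // assumed: restart when the inventory changes
    }
    dataserver(key a, string b)
    {
        if(lines == -1)
        {
            lines = (integer)b; // first reply: number of lines in the notecard
            llGetNotecardLine(card, line); // assumed: request the first line
        }
        else
        {
            // Strip spaces, then split "escaped text|expected length"; EOF is kept as a token.
            list c = llParseString2List(a=(string)llParseString2List(b,[" "],[]),["|"],[EOF]);
            integer d; // actual length after unescaping
            integer e; // TRUE if this test passed
            integer f; // expected length from the notecard
            if(llGetSubString(a,0,0) == "#")
            {
                llOwnerSay(llDeleteSubString(b,0,0)); // assumed: echo the comment line
            }
            else if(llGetListLength(c) >= 2)
            {
                d = llStringLength(b = llUnescapeURL(llList2String(c,0)));
                pass += e = (d == (f = llList2Integer(c,1)));
                fail += !e;
                string out = (string)line +": ";
                out += llList2String(["Fail","Pass"],e) + " ";
                out += "(" + (string)d + " - " + (string)f + ") ";
//Enable this section to test llUnescapeURL
//                       out += "(" + (string)((
//                                llStringLength(
//                                    (string)llParseString2List( //strips off the evil pad
//                                        llStringToBase64(b),["="],[]
//                                    )
//                                ) * 3 ) / 4); //that's how many bytes should be in it (assuming all escaped)
//                       out += " - ";
//                       out += (string)(llStringLength(llList2String(c,0))/3) + ")";
                // This will IM all the tests to the owner!  This is slow because IM sleeps the script for 2 seconds.
                //llInstantMessage(llGetOwner(), out + b); // assumed form of the missing call
            }
            if(llListFindList(c,[EOF]) == -1 && ++line < lines)
            {
                llGetNotecardLine(card, line); // assumed: request the next line
            }
            else
            {
                llOwnerSay("Passed: "+(string)pass);
                llOwnerSay("Failed: "+(string)fail);
            }
        }
    }
}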
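
The per-line check in the dataserver event can be tried on its own, without a notecard. The following is a minimal sketch, assuming each test line has the form "escaped text|expected length", as the script's parsing implies. The helper name check_line and the sample line are hypothetical and are not taken from the notecard.

```lsl
// Hypothetical helper: returns TRUE if the unescaped text has the expected length.
integer check_line(string raw)
{
    // Same parsing idea as the script: drop spaces, then split on "|".
    list parts = llParseString2List((string)llParseString2List(raw, [" "], []), ["|"], []);
    if (llGetListLength(parts) < 2) return FALSE;
    string decoded = llUnescapeURL(llList2String(parts, 0));
    return llStringLength(decoded) == llList2Integer(parts, 1);
}

default
{
    state_entry()
    {
        // "%CE%BA" is the UTF-8 encoding of the Greek letter kappa (U+03BA),
        // so a correct decoder should yield a single character.
        llOwnerSay("kappa check: " + (string)check_line("%CE%BA|1"));
    }
}
```

The check compares character counts rather than byte counts, which is why a two-byte sequence is paired with an expected length of 1.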

The Notecard follows

#Greek kosme

#Boundary condition test cases - First possible sequence of a certain length

#Last Possible Sequence of a certain length

#Other boundary conditions

#Unexpected continuation bytes

#All 64 possible continuation bytes

#All 32 first bytes of 2-byte sequences

#All 16 first bytes of 3-byte sequences

#All 8 first bytes of 4-byte sequences

#All 4 first bytes of 5-byte sequences

#All 2 first bytes of 6-byte sequences

#Sequences with last continuation byte missing

#Impossible bytes

#Examples of an overlong ASCII character

#Maximum overlong sequences

#Overlong sequences - Overlong representation of the NUL character

#Illegal code positions - Single UTF-16 surrogates

#Illegal code positions - Paired UTF-16 surrogates

#Illegal code positions - Other illegal code positions
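
Test lines for well-formed text can be generated rather than typed by hand. The following is a minimal sketch, assuming the same "escaped text|expected length" line format that the script parses; make_line is a hypothetical helper. Malformed sequences (overlong forms, stray continuation bytes, surrogates) presumably cannot be produced this way, since llEscapeURL works from an already valid string, so those lines would still have to be written by hand.

```lsl
// Hypothetical helper: build one notecard test line from a string.
string make_line(string text)
{
    // llEscapeURL percent-encodes the UTF-8 bytes of the string;
    // llStringLength counts characters, which is the value the script compares against.
    return llEscapeURL(text) + "|" + (string)llStringLength(text);
}

default
{
    state_entry()
    {
        // Say the generated line so it can be pasted into the notecard.
        llOwnerSay(make_line("κόσμε"));
    }
}
```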