Difference between revisions of "XyText-UTF8"

From Second Life Wiki
Revision as of 14:21, 24 May 2009

Scripting tools to allow display of text on a prim: XyText 1.5, XyzzyText, XyyyyzText, XyText-UTF8, XyzzyText-UTF8, ZZText

Introduction

After some years of using XyzzyText, I became convinced that it is really slow compared to the XyText solutions. It probably causes a bit less lag, but when I used it in lessons, people clearly noticed how slowly the lines filled up. So I backported my UTF-8 experience to XyText.

I also noticed that whenever I needed to adapt a XyXXXX script to a new character set or encoding, or wanted to change the parameters for slicing, gridding, and offsetting, finding the UTF-8 counterpart in the internal lists was really difficult.

This is why I am proposing here a new parameterized CELL script that uses 5 characters and can be driven by an external main script.

A primer on UTF-8 encoding (for our needs with LSL and XyText handling)

Here is a quick overview of the UTF-8 encoding used by Second Life. It is a complete and clever way of representing special characters.

For plain characters such as 'A', UTF-8 is identical to ASCII: both use 1 byte (hexadecimal 41). For special or international characters, UTF-8 provides a unique encoding that can take more than one byte.

For instance, the character 'ç' translates (according to http://www.isthisthingon.org/unicode/index.php) to C3A7 in UTF-8 (two bytes).

You can also check this by writing the following simple LSL script:

<lsl>
default
{
    touch_start(integer total_number)
    {
        // llEscapeURL writes each UTF-8 byte of a special character as %XX
        llSay(0, llEscapeURL("ç"));
    }
}
</lsl>
which prints the following in chat:
<pre>
[15:20]  Object: %C3%A7
</pre>
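
The same trick can be used to count how many bytes a character takes in UTF-8. The following is only a minimal sketch, not part of the XyText scripts: the helper name utf8ByteCount is hypothetical, and it assumes it is given a one-character string and that llEscapeURL returns letters and digits unchanged while writing every other byte as %XX.

<lsl>
// Hypothetical helper (not part of XyText): number of UTF-8 bytes
// used by a one-character string.
integer utf8ByteCount(string c)
{
    string escaped = llEscapeURL(c);
    // Assumption: letters and digits come back unchanged, so they are 1 byte.
    if (escaped == c) return 1;
    // Every escaped byte is written as "%XX", that is 3 characters per byte.
    return llStringLength(escaped) / 3;
}

default
{
    touch_start(integer total_number)
    {
        llSay(0, "A uses " + (string)utf8ByteCount("A") + " byte(s)");
        llSay(0, "ç uses " + (string)utf8ByteCount("ç") + " byte(s)");
    }
}
</lsl>

Given the %C3%A7 result above, this should report 1 byte for 'A' and 2 bytes for 'ç'.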