User:Zero Linden/Office Hours/2008 Apr 17
- [8:30] Wyn Galbraith: has seen the black helios
- [8:30] Arawn Spitteler: FEMA Helicopters are only the tip of the iceberg
- [8:30] Wyn Galbraith: What about the training of terrorists
- [8:30] Minos Deerhunter: hello
- [8:30] Wyn Galbraith: I couldn't believe that when I heard it on the congress tape.
- [8:30] Arawn Spitteler: Civilization has always been a balance of Community and Integrity, and the Rich are the Community Leaders.
- [8:31] Arawn Spitteler: The Rich have certainly always been in charge of the training of terrorists.
- [8:31] Tree Kyomoon: the only hope for the poor is there's so damn many of us :)
- [8:31] Wyn Galbraith: LOL
- [8:32] Arawn Spitteler: What was on the Congress Tape? Was this Phillip's Presentation?
- [8:32] Wyn Galbraith: waves at Zha
- [8:33] Arawn Spitteler: whispers: Is that Torley?
- [8:33] Tree Kyomoon: greetings madame zha
- [8:33] Arawn Spitteler: Naw, it's just Giorno, dressed as Torley.
- [8:33] Zha Ewry: LOL
- [8:33] Rex Cronon: hi everybody
- [8:34] Rex Cronon: what is with this physics lag?
- [8:34] Arawn Spitteler: When Dazzle becomes mainstream, I'd like to see the Torley Skin, but not as a constant diet.
- [8:34] Tree Kyomoon: I've long hoped for a physics lag in RL.
- [8:35] Xugu Madison: Very... Wile E. Coyote
- [8:35] Arawn Spitteler: has a physics lag, but it interferes with the Cultural Interface, and the paying of the Electric Bill.
- [8:35] Tree Kyomoon: indeed
- [8:35] Arawn Spitteler: On the Meat Side, your sim has to interact with your neighbor's sim, and slower sims don't belong where the Money is.
- [8:36] Wyn Galbraith: LOL, we had that conversation last night, the Coyote RoadRunner thing
- [8:36] Tree Kyomoon: here comes mr zero
- [8:36] Zha Ewry: beeps
- [8:37] Wyn Galbraith: Morning Zero (meeps)
- [8:37] Tree Kyomoon: it's the zero effect
- [8:37] Arawn Spitteler: welcomes Zero to his office hour: What agenda were we just working on? Yes, the impact of SL on the Agent Domain...
- [8:37] Zha Ewry: nabs zero's coffee mug, as it goes by, and takes a quick hit
- [8:37] Wyn Galbraith: Zero is wearing the ever so pleasant grey.
- [8:38] Zero Linden: is fighting his dazzle camera
- [8:38] Morgaine Dinova: 'Morning Zero
- [8:38] Zero Linden: welcome all
- [8:38] Zha Ewry: Bouncing about is it Zero?
- [8:38] Zero Linden: sorry I'm a bit late
- [8:38] Zero Linden: so - there is an agenda already?
- [8:38] Zero Linden: lay it on me
- [8:38] Tree Kyomoon: we will recalibrate the time space continuum for you
- [8:38] Morgaine Dinova: I missed last Tues (no transcript?), so can't suggest any agenda.
- [8:39] Wyn Galbraith: Coyote or Road Runner? To be honest I think Sai has the agenda.
- [8:39] Tree Kyomoon: sorry about that, I was out last week
- [8:39] Arawn Spitteler: We were just discussing Physics Lag on the Meat Side. It occurs to me that such parallels could be of benefit in this discussion.
- [8:40] Zero Linden: well - I have one item: Unicode
- [8:40] Arawn Spitteler's: style of Brainstorming cherishes the fertility of Bull Shit: What were you planning?
- [8:40] Tree Kyomoon: well I'm going to try to put http request cookies on the ole agenda again as my ole standby
- [8:41] Arawn Spitteler: Unicode, is that the words we use for Characters?
- [8:41] Zero Linden: That, tree, is more of a current LSL feature request than an architectural item....
- [8:41] Zha Ewry: 8IN
- [8:41] Zero Linden: Tree, is there a pjira entry for that
- [8:41] Zha Ewry: I*N
- [8:41] Tree Kyomoon: yes
- [8:42] Tree Kyomoon: I mean, a general webservices model would be good enough
- [8:43] Arawn Spitteler: would like Client-Side HUD, and wonders if a discussion of that would belong in an Architectural Primer.
- [8:43] Tree Kyomoon: I just want full access to my island from my webserver
- [8:43] Zero Linden: AHA - found the workaround for Mac/MightyMouse/Flycam problem
- [8:43] Zha Ewry: Oh?
- [8:44] Zha Ewry: Dish
- [8:44] Morgaine Dinova: Selling your Mac? ;-))
- [8:44] Xugu Madison: I'm wondering if anyone has an idea of timescale for islands hosted outside LL, similar to the IBM setup, being available...
- [8:44] Arawn Spitteler: Is that the Camera Smoothing Problem?
- [8:44] Zero Linden: Preferences > Input & Camera > Joystick Setup > Enable Joystick checkbox set to off
- [8:44] Tree Kyomoon: yes, Zha if you have an update or info on IBM's plans
- [8:45] Zero Linden: For some reason it thinks the MightyMouse is a joystick
- [8:45] Tree Kyomoon: that would be good, i'll store them in some droids and pass them to obiwan
- [8:45] Zero Linden: Okay - welllllllll
- [8:45] Zero Linden: I'm going to lead with Unicode
- [8:46] Zero Linden: I've been working internally on a project to draw up some Unicode guidelines for the company
- [8:46] Zero Linden: and it brought up some issues that I wondered if anyone here had
- [8:46] Zero Linden: experience with
- [8:47] Zero Linden: Turns out, not all big, popular, open source software is truly Unicode compliant
- [8:47] Zero Linden: Like MySQL 5 and PHP 5
- [8:47] Zero Linden: Feh!
- [8:47] Xugu Madison: MySQL isn't unicode compliant? Never had a problem here...
- [8:47] Zero Linden: In particular, they don't support characters above the Basic Multilingual Plane (that is U+0000 through U+FFFF)
- [8:47] Leffard Lassard: And mono/.net either. It only supports 2byte characters.
- [8:47] Arawn Spitteler: Are these still using Latin1?
- [8:47] Zha Ewry: could offer you a nice industrial Database product.
- [8:48] Zero Linden: Arawn - they both have UTF8 encoding modes, but only support 16 bit characters
- [8:48] Zero Linden: There are various ways of tricking them.... and SL does trick them
- [8:48] Zero Linden: but we sometimes get caught
- [8:48] Tree Kyomoon: isn't that UTF-16?
- [8:48] Zero Linden: No, Tree
- [8:49] Zero Linden: UTF-8 is an encoding scheme that encodes Unicode code points in between 1 and 4 bytes
- [8:49] Zero Linden: code points that fit within 16 bits sometimes still take 3 bytes in UTF-8
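The point about UTF-8 byte counts can be checked with a minimal Python sketch. The helper name `utf8_length` and the choice of sample characters are illustrative assumptions, not anything used in the session.

```python
# Count how many bytes UTF-8 needs for a few sample code points.
# utf8_length and the sample list are hypothetical, chosen for illustration.
samples = {
    "U+0041 LATIN CAPITAL LETTER A": "\u0041",
    "U+00E9 LATIN SMALL LETTER E WITH ACUTE": "\u00e9",
    "U+2764 HEAVY BLACK HEART": "\u2764",
    "U+10137 AEGEAN WEIGHT BASE UNIT": "\U00010137",
}


def utf8_length(ch: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of ch."""
    return len(ch.encode("utf-8"))


for name, ch in samples.items():
    print(f"{name}: {utf8_length(ch)} byte(s)")
```

U+2764 fits in 16 bits yet takes 3 bytes in UTF-8, while U+10137 lies above U+FFFF and takes 4.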
- [8:49] Zha Ewry: The 3 and 4 byte code points are painful for a lot of people
- [8:49] Leffard Lassard: Btw. what is the content above the basic multilingual plane?
- [8:49] Tree Kyomoon: so ASCII maybe is all folks cared about
- [8:49] Zero Linden: Leffard - do you know if Mono/.net is only supporting 16 bit characters, or is it (like Java pre 1.5) UTF-16,
- [8:50] Zero Linden: that is supports the extended characters via UTF-16 surrogate pairs?
- [8:50] Zero Linden: Leffard - well Plane 1 has things like Cuneiform and Linear B
- [8:50] Leffard Lassard: So, I hacked recently mono and I believe the mono documentation mentions only 16bit characters as the character type.
- [8:50] Zero Linden: which would be fun to chat in, but no great loss (no one has the fonts, anyway)
- [8:50] Tao Takashi: Hi
- [8:51] Zero Linden: But Plane 2 has a huge list of Chinese compatibility characters
- [8:51] Leffard Lassard: I see. So Chinese people are SOL.
- [8:51] Tree Kyomoon: wouldn't that introduce a huge amount of overhead as well? it's like 20MB in font size
- [8:51] Zero Linden: to implement lossless round trip against China's GB18030 character set
- [8:52] Zero Linden: No, actually most Chinese, I think, is entered using the complete character set that is already
- [8:52] Zero Linden: within Plane 0
- [8:52] Arawn Spitteler: What's a Plane?
- [8:52] Zero Linden: Oh
- [8:52] Zero Linden: Unicode consists of 17 planes of 65,536 code points
- [8:52] Zero Linden: (yes, 17, not 16)
- [8:52] Tree Kyomoon: in online training we use a subset of the Chinese character sets
- [8:53] Morgaine Dinova: What parts of infrastructure are affected by internationalization? (Excluding client)
- [8:53] Zero Linden: Plane 0, has code points U+0000 U+FFFF and covers almost everything you are ever likely to actually see
- [8:53] Zero Linden: Notice that it fits in 16 bits
- [8:53] Arawn Spitteler: Including the 6,000 characters of Chinese.
- [8:54] Zero Linden: Plane 2 has code points U+20000 through U+2FFFF and has these compatibility characters
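The plane arithmetic described above can be sketched in a few lines of Python. The names `PLANE_SIZE`, `MAX_CODE_POINT`, `PLANE_COUNT`, and `plane_of` are hypothetical helpers introduced here for illustration.

```python
# Each plane holds 65,536 code points; the Unicode range ends at U+10FFFF.
PLANE_SIZE = 0x10000
MAX_CODE_POINT = 0x10FFFF
PLANE_COUNT = (MAX_CODE_POINT + 1) // PLANE_SIZE  # 17 planes, numbered 0..16


def plane_of(code_point: int) -> int:
    """Return the plane number (0..16) that contains code_point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    return code_point >> 16  # same as integer division by PLANE_SIZE


for cp in (0x0041, 0xFFFF, 0x10137, 0x20000, 0x10FFFF):
    print(f"U+{cp:04X} is in plane {plane_of(cp)}")
```

Plane 0 is exactly the range that fits in 16 bits, which is why software that assumes 16-bit characters loses everything in planes 1 through 16.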
- [8:54] Tree Kyomoon: well in Traditional there's something like 18000
- [8:54] Zero Linden: Morgaine - well, Unicode compliance is a subtle thing, actually
- [8:54] Morgaine Dinova: IM is probably 8-bit clean, so I assume it remains clean regardless of char depth.
- [8:54] Zero Linden: for us I think it means that we have to agree on certain expectations of compliance
- [8:55] Zero Linden: for example - an easy one is this: "Human readable text and names are entered, stored, generated and displayed as Unicode strings."
- [8:55] Tree Kyomoon: aha there it is, "simplified chinese level 1 13741 glyphs"
- [8:55] Zero Linden: Which implies that, no, you can't store arbitrary binary data in a parcel description
- [8:55] Zha Ewry: And then you get to define exactly which level of Unicode
- [8:56] Morgaine Dinova: Let's leave the client out of this though. "What's readable" from the client perspective doesn't affect most of the infrastructure.
- [8:56] Zero Linden: Well, so long as you are compliant with 3.1 or later, you are basically future-compatible there
- [8:57] Zero Linden: Well, it does, for example, in how one encodes such data
- [8:57] Leffard Lassard: The problem is not being 8bit-clean. The problem is what does a language/runtime interpret as a character. And for instance mono interprets at most 16bit as a char.
- [8:57] Tree Kyomoon: could you support the characters but make them a separate download like MS does?
- [8:57] Zero Linden: so, for example, we can put the parcel name in the content of a <string>...</string> XML element
- [8:57] Zero Linden: because of the above agreement
- [8:58] Zero Linden: Tree - I'm not even worried about the ability of a computer's font machinery to display them
- [8:58] Zero Linden: the problem is that if you store them in your back end database and then retrieve them
- [8:58] Tree Kyomoon: ahhh you need to store high bit parcel names etc. Right gotcha
- [8:58] Zero Linden: we have to agree if it is "required", "strongly suggested" or just "optional" that you retain them
- [8:58] Umeko Kawanishi: zero--my first to your office hour, do you usually have an agenda or anybody can ask a question.
- [8:59] Zero Linden: Umeko - I asked for agenda up front
- [8:59] Umeko Kawanishi: oh i see. so i was late
- [8:59] Zero Linden: you can request to add to it at anytime
- [8:59] Zero Linden: Just remember, it is about architecture of SL, present and future
- [8:59] Umeko Kawanishi: i have questions about any SL performance testing. who should i talk to?
- [9:00] Arawn Spitteler: Would that be Which or Benjamin? I know they have hours today.
- [9:00] Leffard Lassard: LLSD also has the problem. XML serializing there defines a utf8 charset but the parser (at least the libsl one) doesn't support this
- [9:01] Zero Linden: Probably Aric Linden
- [9:01] Umeko Kawanishi: ok thanks guys
- [9:01] Zero Linden: Leffard - what? It is a required part of the XML 1.0 spec that all parsers MUST support UTF-8
- [9:02] Zha Ewry: What's the de-facto tho?
- [9:02] Zero Linden: That is - any XML parser MUST be able to parse UTF-8 encoded XML documents --- the spec doesn't care a whit what encoding the application itself uses
- [9:02] Leffard Lassard: Aha, I see. Perhaps I am wrong here and the xml-reader does support a 4byte type.
- [9:02] Morgaine Dinova: I'd like to see a shortlist of the parts of infrastructure that are affected by internationalisation issues. A wiki page on the subject would be useful.
- [9:03] Kristoffer Drake: I agree with this - much more likely bits won't get missed
- [9:03] Zero Linden: Zha - The XML spec requires support for both UTF-8 and UTF-16 encodings, and makes UTF-8 the default if no encoding is mentioned
- [9:03] Tree Kyomoon: so if overhead isn't an issue, why wouldn't mysql5 and php support the higher UTFs?
- [9:03] Zero Linden: hence, UTF-8 is most common that I've seen, though in Asia UTF-16 might be more common
- [9:04] Arawn Spitteler: doesn't know the relation of UTF-8 and Latin-1
- [9:05] Morgaine Dinova: Bah. Just go for 128 bits and be safe for the foreseeable future, like with IPv6 and ZFS ;-)
- [9:06] Arawn Spitteler: envisions a text-chat in hand drawn fonts, using UTF-Key
- [9:06] Zero Linden: Arawn - Latin1 is a character set, ISO-8859-1, which maps numbers in the range 0 - 255 onto character codes... though 64 of them are control codes that have no graphical character
- [9:06] Zero Linden: Latin1 is usually encoded by placing one code point per byte
- [9:06] Zero Linden: UTF-8 is an encoding of the Unicode, which places each code point in between 1 and 4 bytes
- [9:07] Zero Linden: Unicode is a mapping between numbers in the range 0 and 0x10FFFF and character codes
- [9:07] Zero Linden: though not all are assigned!
- [9:07] Leffard Lassard: I have a reference to .net and unicode: [1] They say something contradictory unfortunately.
- [9:07] Arawn Spitteler: So, Latin-1 could be mapped to UTF-2?
- [9:07] Zero Linden: What is UTF-2?
- [9:08] Zero Linden: Latin1 is effectively both a character mapping and an encoding
- [9:08] Zero Linden: Here is the common confusion
- [9:08] Zero Linden: ASCII forms a proper subset of BOTH Latin1 and UTF-8
- [9:08] Tree Kyomoon: ?
- [9:08] Kristoffer Drake: Just what I was thinking Tree!
- [9:08] Leffard Lassard: Abstract is: chars are 16bit wide, strings are a sequence of chars basically. But reader classes do nevertheless support utf8.
- [9:09] Zero Linden: That is - the letter A is mapped to the same code point in Latin1 and Unicode, and the same single byte value in both Latin1 and UTF-8 encodings
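The ASCII-subset point can be illustrated with a small Python sketch. The helper `encodings` and the flag `ascii_agrees` are hypothetical names introduced for this example.

```python
def encodings(ch: str) -> tuple[bytes, bytes]:
    """Return the (Latin-1, UTF-8) byte encodings of a single character."""
    return ch.encode("latin-1"), ch.encode("utf-8")


# "A" is ASCII: one identical byte in both encodings.
print(encodings("A"))       # (b'A', b'A')

# U+00E9 has the same code point in Latin1 and Unicode,
# but its byte encodings differ: one byte versus two.
print(encodings("\u00e9"))  # (b'\xe9', b'\xc3\xa9')

# Every byte value 0..127 decodes to the same character either way.
ascii_agrees = all(
    bytes([b]).decode("latin-1") == bytes([b]).decode("utf-8")
    for b in range(128)
)
print(ascii_agrees)         # True
```

Data that is pure ASCII therefore looks correct under either interpretation, and a mismatch only shows up once a character above U+007F appears.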
- [9:09] Arawn Spitteler: 2?
- [9:10] Zero Linden: Leffard - do you know if they decode characters above U+FFFF into pairs of 16bit chars? This is what Java did since it too made the mistake of defining char to be 16 bits
- [9:11] Morgaine Dinova: This is all very interesting ... but I'm trying to see where it affects infrastructure.
- [9:11] Zha Ewry: It impacts what people will have to/want to build
- [9:11] Zha Ewry: In terms of parser/on wire formats
- [9:12] Zha Ewry: Not a huge hit, but a hit
- [9:12] Zero Linden: Leffard - can you parse <x>&#x10137;</x> and see how many characters you get?
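For comparison, here is a rough Python sketch of the same kind of test, assuming the document in question is `<x>&#x10137;</x>` (a character reference to U+10137). It uses the standard library's `xml.etree.ElementTree`; it is not the Mono test that was requested, and the variable names are illustrative.

```python
import struct
import xml.etree.ElementTree as ET

# Assumed test document: one element holding a character reference to U+10137.
doc = b"<x>&#x10137;</x>"
text = ET.fromstring(doc).text

code_points = len(text)                           # Python 3 strings count code points
utf16_units = len(text.encode("utf-16-le")) // 2  # 16-bit code units in UTF-16
utf8_bytes = len(text.encode("utf-8"))            # bytes in UTF-8

# Split the UTF-16 form into its two surrogate code units.
high, low = struct.unpack("<2H", text.encode("utf-16-le"))

print(code_points, utf16_units, utf8_bytes)  # 1 2 4
print(f"{high:#06x} {low:#06x}")             # 0xd800 0xdd37
```

A runtime whose char type is 16 bits wide would be expected to report two chars here, the surrogate pair, while one that counts code points reports one.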
- [9:12] Tree Kyomoon: seems like if the infrastructure has to support UTF 16 everywhere, it would be more overhead
- [9:12] Leffard Lassard: Hmm. I will look at this page. It doesn't state it explicitly. I can do that.
- [9:12] Zero Linden: Morgaine - it matters because we need to be clear with what we think we are dealing with - too much laxness in this area leads to things like engineers just
- [9:13] Zero Linden: thinking that characters are 16 bit
- [9:13] Zha Ewry: Exactly
- [9:13] Zha Ewry: We're setting the bar for parsers and data structures
- [9:13] Saijanai Kuhn: anyone have the first 40 minutes LOL/
- [9:13] Zha Ewry: "You must expect and handle a 4 byte field, here, HERE, and HERE"
- [9:13] Tree Kyomoon: 𐄷
- [9:13] Saijanai Kuhn: transcript thereof?
- [9:13] Morgaine Dinova: Zero: yeah, but you still haven't answered WHERE, merely WHY :-)
- [9:14] Zero Linden: or thinking that they can push arbitrary binary data through
- [9:14] Zero Linden: well - in all our string fields
- [9:14] Zero Linden: I want us to be aware that we think strings are Unicode Strings - further refined by what XML can handle (it outlaws 13 control codes)
- [9:15] Zha Ewry: This is one of those tiny, but vital things
- [9:15] Saijanai Kuhn: accepted your inventory offer.
- [9:15] Zero Linden: And then we have to be aware that if we ever say something like "the Foobar with a name that matches the Whatsit title"
- [9:15] Zero Linden: that this involves normalization and comparison
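Why name matching involves normalization can be seen in a short Python sketch. The choice of NFC and the helper name `same_name` are assumptions for illustration; the session does not say which normalization form or comparison rules would be adopted.

```python
import unicodedata


def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC (an assumed policy)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "caf\u00e9"     # e with acute as one precomposed code point
decomposed = "cafe\u0301"  # plain e followed by a combining acute accent

print(composed == decomposed)           # False: the code point sequences differ
print(same_name(composed, decomposed))  # True: equal once normalized
```

Both strings render identically, so a byte-for-byte or code-point comparison would wrongly treat two visually identical names as different.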
- [9:15] Saijanai Kuhn: Thanks tree. Still can't puzzle out that message though :-(
- [9:16] Zero Linden: &#x10137; is the XML character reference to U+10137 AEGEAN WEIGHT BASE UNIT
- [9:16] Zero Linden: which looks like 𐄷
- [9:16] Zero Linden: which - oddly enough, my Mac has a glyph for! but it doesn't display in SL
- [9:17] Saijanai Kuhn: the wrong fonts are merged on-the-fly for that
- [9:18] Tree Kyomoon: loves playing with Google Translator
- [9:18] Zha Ewry: listens and watches
- [9:19] Zero Linden: Leffard - I would really appreciate it if you can try parsing that XML document and letting me know what it produces
- [9:19] Zero Linden: e-mail me the result, if you would...
- [9:21] Saijanai Kuhn: Zero, a tiny documentation issue for a sec?
- [9:21] Zero Linden: Please, Sai
- [9:21] Saijanai Kuhn: admires the program that took over his screen while he was typing
- [9:21] Tree Kyomoon: internet exploder just shows it as a neato little square
- [9:22] Saijanai Kuhn: the current docs are based on the text-only format of the RFC. MS (and probably others) add a little eyecandy to their online/pdf version to make it more readable while not making it any harder to convert to text-only
- [9:23] Zero Linden: by the way, just so you know that this stuff *matters* - last Wednesday, I got pulled out of a meeting to help five other engineers puzzle out a problem in the database
- [9:23] Saijanai Kuhn: things like italics, bold face and two-tone tables to list attributes
- [9:23] Zero Linden: related to a group name that had this character in it: ❤
- [9:23] Tree Kyomoon: awww
- [9:23] Zero Linden: Which, for you Mac people (self included) is U+2764 HEAVY BLACK HEART
- [9:24] Zero Linden: which is how we felt after it took the five of us an hour to puzzle it out, then two senior devs five hours more to repair the database
- [9:24] Saijanai Kuhn: has a little object that shows the difference between chat, and floating text for utf. Very confusing at times
- [9:24] Zero Linden: and all because of lack of attention to the charset encoding issues
- [9:24] Tree Kyomoon: it's a heart on my pc running vista too
- [9:24] Saijanai Kuhn: anyway, was wondering if we could add some simple eyecandy to the draft SLGOGP to make it easier to read as long as it didn't make it harder to turn into a text-only document
- [9:24] Zha Ewry: and XP
- [9:25] Zero Linden: Saijanai - which version do you mean, the wikitext version?
- [9:25] Zero Linden: I generate the docs as follows
- [9:26] Saijanai Kuhn: right
- [9:26] Zero Linden: XML Specification based source --(xslt)--> HTML --(print menu in Safari)--> PDF
- [9:26] Zero Linden: XML Specification based source --(xslt)--> MediaWiki wikitext
- [9:26] Tree Kyomoon: for posterity we should try to describe in the log what 10137 is
- [9:27] Tree Kyomoon: unfortunately it's just a square here too: [2]
- [9:27] Zha Ewry: waves to posterity
- [9:27] Saijanai Kuhn: does the XML Specification not support, say, 2-tone table output?
- [9:27] Zero Linden: for posterity: U+10137 looks like a hanging balance scale: a T with a small triangle hanging off each arm of the T
- [9:27] Saijanai Kuhn: e.g. page 13-14 of this: [3]
- [9:28] Zero Linden: XML Specification doesn't say anything about how the rows of the table should be presented... that's all in the XSLT
- [9:28] Zero Linden: and the .css that accompanies the HTML version
- [9:28] Tree Kyomoon: searches my UTF 16 keyboard for the mini scale key....
- [9:28] Leffard Lassard: Zero: Yeah, but I don't get it right now. I can try a time and send you an IM or email with a testprogram for mono that shows the results.
- [9:29] Zero Linden: Leffard - that would be wonderful - thank you
- [9:29] Saijanai Kuhn: also, is there a place I can find that? I've been manually tweaking the strawman login rez_avatar to look like the SLGOGP. I guess I should be working with the XML Specification stuff directly instead of tweaking wiki s-ecific code
- [9:29] Saijanai Kuhn: wiki-specific *
- [9:29] Zero Linden: uhm, yes - I'll put up my tool chain info
- [9:30] Tree Kyomoon: pities the foo that had to carve out the original UTF 16 character set in the 17th century
- [9:30] Zero Linden: okay all - thanks for indulging my Unicode fetish
- [9:30] Saijanai Kuhn: You indicated that this would become the standard format for all future protocol docs, right?
- [9:30] Zero Linden: (which, I admit, is a personal joy of mine!)
- [9:30] Zero Linden: I've got to run....
- [9:31] Zero Linden: until next week
- [9:31] Wyn Galbraith: Thanks for the meeting Zero.
- [9:31] Kristoffer Drake: ok, cheerio
- [9:31] Saijanai Kuhn: admires anyone that manages to obsess successfully over unicode
- [9:31] Qie Niangao: thanks, Zero
- [9:31] Tree Kyomoon: ciao zero!
- [9:31] Wyn Galbraith: has to run as well. C U 8tr!
- [9:31] Saijanai Kuhn: later Zero