User:Zero Linden/Office Hours/2008 Apr 17

  • [8:30] Wyn Galbraith: has seen the black helios
  • [8:30] Arawn Spitteler: FEMA Heliocopters are only the tip of the iceberg
  • [8:30] Wyn Galbraith: What about the training of terrorists
  • [8:30] Minos Deerhunter: hello
  • [8:30] Wyn Galbraith: I couldn't believe that when I heard it on the congress tape.
  • [8:30] Arawn Spitteler: Civilization has always been a balance of Community and Integrity, and the Rich are the Community Leaders.
  • [8:31] Arawn Spitteler: The Rich have certainly always been in charge, of the training of terrorists.
  • [8:31] Tree Kyomoon: the only hope for the poor is theres so damn many of us :)
  • [8:31] Wyn Galbraith: LOL
  • [8:32] Arawn Spitteler: What was on the Congress Tape? Was this Phillip's Presentation?
  • [8:32] Wyn Galbraith: waves at Zha
  • [8:33] Arawn Spitteler: whispers: Is that Torley?
  • [8:33] Tree Kyomoon: greetings madame zha
  • [8:33] Arawn Spitteler: Naw, it's just Giorno, dressed as Torley.
  • [8:33] Zha Ewry: LOL
  • [8:33] Rex Cronon: hi everybody
  • [8:34] Rex Cronon: what is with this physics lag?
  • [8:34] Arawn Spitteler: When Dazzle becomes mainstream, I'd like to see the Torley Skin, but not as a constant diet.
  • [8:34] Tree Kyomoon: Ive long hoped for a physics lag in RL.
  • [8:35] Xugu Madison: Very... Wily E. Coyote
  • [8:35] Arawn Spitteler: has a physics lag, but it interferes with the Cultural Interface, and the paying of the Electric Bill.
  • [8:35] Tree Kyomoon: indeed
  • [8:35] Arawn Spitteler: On the Meat Side, your sim has to interact with your neighbor's sim, and slower sims don't belong where the Money is.
  • [8:36] Wyn Galbraith: LOL, we had that conversation last night, the Coyote RoadRunner thing
  • [8:36] Tree Kyomoon: here comes mr zero
  • [8:36] Zha Ewry: beeps
  • [8:37] Wyn Galbraith: Morning Zero (meeps)
  • [8:37] Tree Kyomoon: its the zero effect
  • [8:37] Arawn Spitteler: welcomes Zero to his office hour: What agenda were we just working on? Yes, the impact of SL on the Agent Domain...
  • [8:37] Zha Ewry: nabs zero's coffee mug, as it goes by, and takes a quickhit
  • [8:37] Wyn Galbraith: Zero is wearing the ever so pleasant grey.
  • [8:38] Zero Linden: is fighting his dazzle camera
  • [8:38] Morgaine Dinova: 'Morning Zero
  • [8:38] Zero Linden: welcome all
  • [8:38] Zha Ewry: Bouncing about is it Zero?
  • [8:38] Zero Linden: sorry I'm a bit late
  • [8:38] Zero Linden: so - there is an agenda already?
  • [8:38] Zero Linden: lay it on me
  • [8:38] Tree Kyomoon: we will recalibrate the time space continuum for you
  • [8:38] Morgaine Dinova: I missed last Tues (no transcript?), so can't suggest any agenda.
  • [8:39] Wyn Galbraith: Coyote or Road Runner? To be honest I think Sai has the agenda.
  • [8:39] Tree Kyomoon: sorry about that, I was out last week
  • [8:39] Arawn Spitteler: We were just discussing Physics Lag on the Meat Side. It occurs to me, that such parallels could be of benefit in this discussion.
  • [8:40] Zero Linden: well - I have one item: Unicode
  • [8:40] Arawn Spitteler's: style of Brainstorming cherishes the fertility of Bull Shit: What were you planning?
  • [8:40] Tree Kyomoon: well Im going to try to put http request cookies on the ole agenda again as my ole standby
  • [8:41] Arawn Spitteler: Unicode, is that the words we use for Characters?
  • [8:41] Zero Linden: That, tree, is more of a current LSL feature request than an architectural item....
  • [8:41] Zha Ewry: 8IN
  • [8:41] Zero Linden: Tree, is there a pjira entry for that
  • [8:41] Zha Ewry: I*N
  • [8:41] Tree Kyomoon: yes
  • [8:42] Tree Kyomoon: I mean, a general webservices model would be good enough
  • [8:43] Arawn Spitteler: would like Client-Side HUD, and wonders if a discussion of that would belong in an Architectural Primer.
  • [8:43] Tree Kyomoon: I just want full access to my island from my webserver
  • [8:43] Zero Linden: AHA - found the workaround for Mac/Mightmouse/Flycam problem
  • [8:43] Zha Ewry: Oh?
  • [8:44] Zha Ewry: Dish
  • [8:44] Morgaine Dinova: Selling your Mac? ;-))
  • [8:44] Xugu Madison: I'm wondering if anyone has an idea of timescale for islands hosted outside LL, similar to the IBM setup, being available...
  • [8:44] Arawn Spitteler: Is that the Camera Smoothing Problem?
  • [8:44] Zero Linden: Preferences > Input & Camera > Joystick Setup > Enable Joystick checkbox set to off
  • [8:44] Tree Kyomoon: yes, Zha if you have an update or info on IBM's plans
  • [8:45] Zero Linden: For some reason it thinks the MightyMouse is a joystick
  • [8:45] Tree Kyomoon: that would be good, i'll store them in some droids and pass them to obiwan
  • [8:45] Zero Linden: Okay - welllllllll
  • [8:45] Zero Linden: I'm going to lead with Unicode
  • [8:46] Zero Linden: I've been working internally on a project to draw up some Unicode guidelines for the company
  • [8:46] Zero Linden: and it brought up some issues that I wondered if anyone here had
  • [8:46] Zero Linden: experience with
  • [8:47] Zero Linden: Turns out, not all big, popular, open source software is truly Unicode compliant
  • [8:47] Zero Linden: Like MySQL 5 and PHP 5
  • [8:47] Zero Linden: Feh!
  • [8:47] Xugu Madison: MySQL isn't unicode compliant? Never had a problem here...
  • [8:47] Zero Linden: In particular, they don't support characters above the Basic Multilingual Plane (that is U+0000 through U+FFFF)
  • [8:47] Leffard Lassard: And mono/.net either. It only supports 2byte characters.
  • [8:47] Arawn Spitteler: Are these still using Latin1?
  • [8:47] Zha Ewry: could offer you a nice industrial Database product.
  • [8:48] Zero Linden: Arawn - they both have UTF8 encoding modes, but only support 16 bit characters
  • [8:48] Zero Linden: There are various ways of tricking them.... and SL does trick them
  • [8:48] Zero Linden: but we sometimes get caught
  • [8:48] Tree Kyomoon: isnt that UTF 16?
  • [8:48] Zero Linden: No, Tree
  • [8:49] Zero Linden: UTF-8 is an encoding scheme that encodes Unicode code points in between 1 and 4 bytes
  • [8:49] Zero Linden: code points that fit within 16 bits sometimes still take 3 bytes in UTF-8
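Zero's point about UTF-8 lengths can be checked with a minimal Python sketch. The sample characters and the helper name `utf8_length` are chosen here for illustration and are not part of the discussion; the sketch only shows that a code point inside the 16-bit range (U+2764) already needs 3 bytes, while one above it (U+10137) needs 4.

```python
# Sample code points: two short ones, one BMP character that needs
# 3 bytes, and one character above U+FFFF that needs 4 bytes.
samples = {
    "U+0041 LATIN CAPITAL LETTER A": "\u0041",
    "U+00E9 LATIN SMALL LETTER E WITH ACUTE": "\u00e9",
    "U+2764 HEAVY BLACK HEART": "\u2764",
    "U+10137 AEGEAN WEIGHT BASE UNIT": "\U00010137",
}


def utf8_length(ch: str) -> int:
    """Return how many bytes the UTF-8 encoding of ch occupies."""
    return len(ch.encode("utf-8"))


for name, ch in samples.items():
    print(f"{name}: {utf8_length(ch)} byte(s)")
```

A store that caps UTF-8 sequences at 3 bytes can therefore hold everything in Plane 0 but nothing above it.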
  • [8:49] Zha Ewry: The 3 and 4 byte code points are painful for a lot of people
  • [8:49] Leffard Lassard: Btw. what is the content above the basic multilingual plane?
  • [8:49] Tree Kyomoon: so ASCII maybe is all folks cared about
  • [8:49] Zero Linden: Leffard - do you know if Mono/.net is only supporting 16 bit characters, or is it (like Java pre 1.5) UTF-16,
  • [8:50] Zero Linden: that is supports the extended characters via UTF-16 surrogate pairs?
  • [8:50] Zero Linden: Leffard - well Plane 1 has things like Cuneiform and Linear B
  • [8:50] Leffard Lassard: So, I hacked recently mono and I believe the mono documentation mentions only 16bit characters as the character type.
  • [8:50] Zero Linden: which would be fun to chat in, but no great loss (no one has the fonts, anyway)
  • [8:50] Tao Takashi: Hi
  • [8:51] Zero Linden: But Plane 2 has a huge list of Chinese compatibility characters
  • [8:51] Leffard Lassard: I see. So chinese people are sol.
  • [8:51] Tree Kyomoon: wouldnt that introduce a huge amount of overhead as well? its like 20MB in font size
  • [8:51] Zero Linden: to implement lossless round trip against China's GB18030 character set
  • [8:52] Zero Linden: No, actually most Chinese, I think, is entered using the complete character set that is already
  • [8:52] Zero Linden: within Plane 0
  • [8:52] Arawn Spitteler: What's a Plane?
  • [8:52] Zero Linden: Oh
  • [8:52] Zero Linden: Unicode consists of 17 planes of 65,536 code points
  • [8:52] Zero Linden: (yes, 17, not 16)
  • [8:52] Tree Kyomoon: in online training we use a subset of the Chinese character sets
  • [8:53] Morgaine Dinova: What parts of infrastructure are affected by internationalization? (Excluding client)
  • [8:53] Zero Linden: Plane 0, has code points U+0000 U+FFFF and covers almost everything you are ever likely to actually see
  • [8:53] Zero Linden: Notice that it fits in 16 bits
  • [8:53] Arawn Spitteler: Including the 6,000 characters of Chinese.
  • [8:54] Zero Linden: Plane 2 has code points U+20000 through U+2FFFF and has these compatibility characters
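The plane arithmetic Zero describes can be sketched in a few lines of Python. The names `plane_of` and `NUMBER_OF_PLANES` are illustrative only; the sketch assumes the Unicode upper limit of U+10FFFF, which is why there are 17 planes rather than 16.

```python
MAX_CODE_POINT = 0x10FFFF   # highest Unicode code point
PLANE_SIZE = 0x10000        # 65,536 code points per plane

# 0x110000 code points in total, divided into planes of 0x10000 each.
NUMBER_OF_PLANES = (MAX_CODE_POINT + 1) // PLANE_SIZE


def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) that holds code_point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    return code_point >> 16


print(NUMBER_OF_PLANES)      # number of planes
print(plane_of(0x2764))      # a Plane 0 (BMP) character
print(plane_of(0x10137))     # a Plane 1 character
print(plane_of(0x20000))     # first code point of Plane 2
```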
  • [8:54] Tree Kyomoon: well in Traditional theres something like 18000
  • [8:54] Zero Linden: Morgaine - well, Unicode compliance is a subtle thing, actually
  • [8:54] Morgaine Dinova: IM is probably 8-bit clean, so I assume it remains clean regardless of char depth.
  • [8:54] Zero Linden: for us I think it means that we have to agree on certain expectations of compliance
  • [8:55] Zero Linden: for example - an easy one is this: "Human readable text and names are entered, stored, generated and displayed as Unicode strings."
  • [8:55] Tree Kyomoon: aha there it is, "simplified chinese level 1 13741 glyphs"
  • [8:55] Zero Linden: Which implies that, no, you can't store arbitrary binary data in a parcel description
  • [8:55] Zha Ewry: And then you get to define, exactly which level of unicode
  • [8:56] Morgaine Dinova: Let's leave the client out of this though. "What's readable" from the client perspective doesn't affect most of the infrastructure.
  • [8:56] Zero Linden: Well, so long as you are compliant with 3.1 or later, you are basically future-compatible there
  • [8:57] Zero Linden: Well, it does, for example, in how one encodes such data
  • [8:57] Leffard Lassard: The problem is not being 8bit-clean. The problem is what does a language/runtime interpret as a character. And for instance mono interprets at most 16bit as a char.
  • [8:57] Tree Kyomoon: could you support the characters but make them a separate download like MS does?
  • [8:57] Zero Linden: so, for example, we can put the parcel name in the content of a <string>...</string> XML element
  • [8:57] Zero Linden: because of the above agreement
  • [8:58] Zero Linden: Tree - I'm not even worried about the ability of a computers font machinery to display them
  • [8:58] Zero Linden: the problem is that if you store them in your back end database and then retrieve them
  • [8:58] Tree Kyomoon: ahhh you need to store high bit parcel names etc. Right gotcha
  • [8:58] Zero Linden: we have to agree if it is "required", "strongly suggested" or just "optional" that you retain them
  • [8:58] Umeko Kawanishi: zero--my first to your office hour, do you usually have an agenda or anybody can ask a question.
  • [8:59] Zero Linden: Umeko - I asked for agenda up front
  • [8:59] Umeko Kawanishi: oh i see. so i was late
  • [8:59] Zero Linden: you can request to add to it at anytime
  • [8:59] Zero Linden: Just remember, it is about architecture of SL, present and future
  • [8:59] Umeko Kawanishi: i have questions about any SL performance testing. who should i talk to?
  • [9:00] Arawn Spitteler: Would that be Which or Benjamin? I know they have hours today.
  • [9:00] Leffard Lassard: LLSD also has the problem. XML serializing there defines a utf8 charset but the parser (at least the libsl one) doesn't support this
  • [9:01] Zero Linden: Probably Aric Linden
  • [9:01] Umeko Kawanishi: ok thanks guys
  • [9:01] Zero Linden: Leffard - what? It is a required part of the XML 1.0 spec that all parsers MUST support UTF-8
  • [9:02] Zha Ewry: What's the de-facto tho?
  • [9:02] Zero Linden: That is - any XML parser MUST be able to parse UTF-8 encoded XML documents --- the spec doesn't care a whit what encoding the application itself uses
  • [9:02] Leffard Lassard: Aha, I see. Perhaps I am wrong here and the xml-reader does support a 4byte type.
  • [9:02] Morgaine Dinova: I'd like to see a shortlist of the parts of infrastructure that are affected by internationalisation issues. A wiki page on the subject would be useful.
  • [9:03] Kristoffer Drake: I agree with this - much more likely bits won't get missed
  • [9:03] Zero Linden: Zha - The XML spec requires support for both UTF-8 and UTF-16 encodings, and makes UTF-8 the default if no encoding is mentioned
  • [9:03] Tree Kyomoon: so if overhead isnt an issue, why wouldnt mysql5 and php support the higher UTFs?
  • [9:03] Zero Linden: hence, UTF-8 is most common that I've seen, though in Asia UTF-16 might be more common
  • [9:04] Arawn Spitteler: doesn't know the relation of UTF-8 and Latin-1
  • [9:05] Morgaine Dinova: Bah. Just go for 128 bits and be safe for the foreseeable future, like with IPv6 and ZFS ;-)
  • [9:06] Arawn Spitteler: envisions a text-chat in hand drawn fonts, using UTF-Key
  • [9:06] Zero Linden: Arawn - Latin1 is a character set, ISO-8859-1, which maps numbers in the range 0 - 255 onto character codes... though 64 of them are control codes that have no graphical character
  • [9:06] Zero Linden: Latin1 is usually encoded by placing one code point per byte
  • [9:06] Zero Linden: UTF-8 is an encoding of Unicode, which places each code point in between 1 and 4 bytes
  • [9:07] Zero Linden: Unicode is a mapping between numbers in the range 0 and 0x10FFFF and character codes
  • [9:07] Zero Linden: though not all are assigned!
  • [9:07] Leffard Lassard: I have a reference to .net and unicode: [1] They say something contradictory unfortunately.
  • [9:07] Arawn Spitteler: So, Latin-1 could be mapped to UTF-2?
  • [9:07] Zero Linden: What is UTF-2?
  • [9:08] Zero Linden: Latin1 is effectively both a character mapping and an encoding
  • [9:08] Zero Linden: Here is the common confusion
  • [9:08] Zero Linden: ASCII forms a proper subset of BOTH Latin1 and UTF-8
  • [9:08] Tree Kyomoon:  ?
  • [9:08] Kristoffer Drake: Just what I was thinking Tree!
  • [9:08] Leffard Lassard: Abstract is: chars are 16bit wide, strings are a sequence of chars basically. But reader classes do nevertheless support utf8.
  • [9:09] Zero Linden: That is - the letter A is mapped to the same code point in Latin1 and Unicode, and the same single byte value in both Latin1 and UTF-8 encodings
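The overlap Zero describes, and where it ends, can be seen in a short Python sketch. The character é is picked here only as an example of a Latin1 character outside ASCII; it does not come from the discussion.

```python
# "A" is ASCII, so all three encodings produce the same single byte.
letter_a = "A"
a_ascii = letter_a.encode("ascii")
a_latin1 = letter_a.encode("latin-1")
a_utf8 = letter_a.encode("utf-8")
print(a_ascii, a_latin1, a_utf8)

# U+00E9 has the same code point in Latin1 and Unicode, but the
# byte-level encodings differ: one byte in Latin1, two in UTF-8.
e_acute = "\u00e9"
e_latin1 = e_acute.encode("latin-1")
e_utf8 = e_acute.encode("utf-8")
print(e_latin1, e_utf8)
```

So bytes written as Latin1 and read back as UTF-8 (or the reverse) survive only as long as the text stays within ASCII.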
  • [9:09] Arawn Spitteler: 2?
  • [9:10] Zero Linden: Leffard - do you know if they decode characters above U+FFFF into pairs of 16bit chars? This is what Java did since it too made the mistake of defining char to be 16 bits
  • [9:11] Morgaine Dinova: This is all very interesting ... but I'm trying to see where it affects infrastructure.
  • [9:11] Zha Ewry: It impacts what people will have to/want to build
  • [9:11] Zha Ewry: In terms of parser/on wire formats
  • [9:12] Zha Ewry: Not a huge hit, but a hit
  • [9:12] Zero Linden: Leffard - can you parse <x>&#x10137;</x> and see how many characters you get?
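The test Zero asks for can be sketched in Python rather than Mono, purely as an illustration of what "how many characters" can mean. It parses the document `<x>&#x10137;</x>` with the standard library's ElementTree and then counts code points, UTF-16 code units and UTF-8 bytes. The helper `surrogate_pair` is a hypothetical name; it applies the standard UTF-16 surrogate computation that a 16-bit `char` runtime would have to use for this character.

```python
import xml.etree.ElementTree as ET

# The XML document from the discussion: one character reference
# to U+10137, which lies above the Basic Multilingual Plane.
doc = b"<x>&#x10137;</x>"
text = ET.fromstring(doc).text

code_points = len(text)                          # Python counts code points
utf16_units = len(text.encode("utf-16-le")) // 2  # 16-bit units in UTF-16
utf8_bytes = len(text.encode("utf-8"))           # bytes in UTF-8


def surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into UTF-16 high/low surrogates."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


high, low = surrogate_pair(ord(text))
print(code_points, utf16_units, utf8_bytes)
print(f"{high:#06x} {low:#06x}")
```

A runtime whose strings are sequences of 16-bit chars would report a length of 2 for this element's text if it uses surrogate pairs, which is the distinction Zero is trying to draw out.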
  • [9:12] Tree Kyomoon: seems like if the infrastructure has to support UTF 16 everywhere, it would be more overhead
  • [9:12] Leffard Lassard: Hmm. I will look at this page. It doesnt state it explicitly. I can do that.
  • [9:12] Zero Linden: Morgaine - it matters because we need to be clear with what we think we are dealing with - too much laxness in this area leads to things like engineers just
  • [9:13] Zero Linden: thinking that characters are 16 bit
  • [9:13] Zha Ewry: Exactly
  • [9:13] Zha Ewry: We're setting the bar for parsers and data structures
  • [9:13] Saijanai Kuhn: anyone have the first 40 minutes LOL/
  • [9:13] Zha Ewry: "You must expect and handle a 4 byte field, here, HERE, and HERE"
  • [9:13] Tree Kyomoon: 𐄷
  • [9:13] Saijanai Kuhn: transcript thereof?
  • [9:13] Morgaine Dinova: Zero: yeah, but you still haven't answered WHERE, merely WHY :-)
  • [9:14] Zero Linden: or thinking that they can push arbitrary binary data through
  • [9:14] Zero Linden: well - in all our string fields
  • [9:14] Zero Linden: I want us to be aware that we think strings are Unicode Strings - further refined by what XML can handle (it outlaws 13 control codes)
  • [9:15] Zha Ewry: This is one of those tiny, but vital things
  • [9:15] Saijanai Kuhn: accepted your inventory offer.
  • [9:15] Zero Linden: And then we have to be aware that if we ever say something like "the Foobar with a name that matches the Whatsit title"
  • [9:15] Zero Linden: that this involves normalization and comparison
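Why "matches" involves normalization can be shown with a small Python sketch. The function name `same_name` and the choice of NFC are assumptions made for this illustration; the discussion does not say which normalization form or comparison rules Linden Lab would adopt.

```python
import unicodedata


def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC.

    NFC is an assumption for this sketch; case folding and other
    comparison rules are deliberately left out.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The same visible name written two ways: a precomposed e-acute,
# and a plain "e" followed by a combining acute accent.
composed = "Caf\u00e9"
decomposed = "Cafe\u0301"

print(composed == decomposed)           # raw code point comparison
print(same_name(composed, decomposed))  # comparison after normalization
```

Without an agreed normalization step, two names that look identical on screen can fail an equality test in the database.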
  • [9:15] Saijanai Kuhn: Thanks tree. Still can't puzzle out thatmessage though :-(
  • [9:16] Zero Linden: &#x10137; is the XML character reference to U+10137 AEGEAN WEIGHT BASE UNIT
  • [9:16] Zero Linden: which looks like 𐄷
  • [9:16] Zero Linden: which - oddly enough, my Mac has a glyph for! but it doesn't display in SL
  • [9:17] Saijanai Kuhn: the wrong fonts are merged on-the-fly for that
  • [9:18] Tree Kyomoon: loves playing with Google Translator
  • [9:18] Zha Ewry: listens and watches
  • [9:19] Zero Linden: Leffard - I would really appreciate it if you can try parsing that XML document and letting me know what it produces
  • [9:19] Zero Linden: e-mail me the result, if you would...
  • [9:21] Saijanai Kuhn: ZEro, a tiny documentation issue for a sec?
  • [9:21] Zero Linden: Please, Sai
  • [9:21] Saijanai Kuhn: admires the program that took over his screen while he was typing
  • [9:21] Tree Kyomoon: internet exploder just shows it as a neato little square
  • [9:22] Saijanai Kuhn: the current docs are based on the text-only format of the RFC. MS (and probably others) add a little eyecandy to their online/pdf version to make it more readable while not making it any harder to convert to text-only
  • [9:23] Zero Linden: by the way, just so you know that this stuff *matters* - last Wednesday, I got pulled out of a meeting to help five other engineers puzzle out a problem in the database
  • [9:23] Saijanai Kuhn: things like italics, bold face and two-tone tables to list attributes
  • [9:23] Zero Linden: related to a group name that had this character in it: ❤
  • [9:23] Tree Kyomoon: awww
  • [9:23] Zero Linden: Which, for you Mac people (self included) is U+2764 HEAVY BLACK HEART
  • [9:24] Zero Linden: which is how we felt after it took the five of us an hour to puzzle it out, then two senior devs five hours more to repair the database
  • [9:24] Saijanai Kuhn: has a little object that shows the difference between chat, and floating text for utf. Very confusing at times
  • [9:24] Zero Linden: and all because of lack of attention to the charset encoding issues
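The log does not say what actually went wrong in the database. As a purely illustrative assumption, the Python sketch below shows one common way this class of bug appears: UTF-8 bytes for U+2764 being interpreted as Latin1 somewhere along the way, so that one character turns into three. The variable names are hypothetical.

```python
heart = "\u2764"  # HEAVY BLACK HEART, a Plane 0 character

# Correct UTF-8 encoding: three bytes.
utf8_bytes = heart.encode("utf-8")

# Assumed failure mode: some layer treats those bytes as Latin1,
# producing three unrelated characters instead of one heart.
misread = utf8_bytes.decode("latin-1")

# The damage is reversible only if every byte survived unchanged.
repaired = misread.encode("latin-1").decode("utf-8")

print(utf8_bytes)
print(len(heart), len(misread))
print(repaired == heart)
```

If the misread text is then re-encoded and stored, comparisons and length checks on the column no longer agree with what users typed, which is the kind of confusion that takes hours to untangle.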
  • [9:24] Tree Kyomoon: its a heart on my pc running vista too
  • [9:24] Saijanai Kuhn: anyway, was wondering if we could add some simple eyecandy to the draft SLGOGP to make it easier to read as long as it didn't make it harder to turn into a text-only document
  • [9:24] Zha Ewry: and XP
  • [9:25] Zero Linden: Saijanai - which version do you mean, the wikitext version?
  • [9:25] Zero Linden: I generate the docs as follows
  • [9:26] Saijanai Kuhn: right
  • [9:26] Zero Linden: XML Specification based source --(xslt)--> HTML --(print menu in Safari)--> PDF
  • [9:26] Zero Linden: XML Specification based source --(xslt)--> MediaWiki wikitext
  • [9:26] Tree Kyomoon: for posterity we should try to describe in the log what 10137 is
  • [9:27] Tree Kyomoon: unfortunately its just a square here too: [2]
  • [9:27] Zha Ewry: waves to posterity
  • [9:27] Saijanai Kuhn: does the XML Specification not support, say, 2-tone table output?
  • [9:27] Zero Linden: for posterity: U+10137 looks like hanging balance scale: a T with a small triangle hanging off each arm of the T
  • [9:27] Saijanai Kuhn: e.g. page 13-14 of this: [3]
  • [9:28] Zero Linden: XML Specification doesn't say anything about how the rows of the table should be presented... that's all in the XSLT
  • [9:28] Zero Linden: and the .css that accompanies the HTML version
  • [9:28] Tree Kyomoon: searches my UTF 16 keyboard for the mini scale key....
  • [9:28] Leffard Lassard: Zero: Yeah, but I dont get it right now. I can try a time and send you an IM or email with a testprogram for mono that shows the results.
  • [9:29] Zero Linden: Leffard - that would be wonderful - thank you
  • [9:29] Saijanai Kuhn: also, is there a place I can find that. I've been manually tweaking the strawman login rez_avatar to look like the SLGOGP. I guess I should be working with the XML SPecification stuff directly instead of tweaking wiki s-ecific code
  • [9:29] Saijanai Kuhn: wiki-specific *
  • [9:29] Zero Linden: uhm, yes - I'll put up my tool chain info
  • [9:30] Tree Kyomoon: pities the foo that had to carve out the original UTF 16 character set in the 17th century
  • [9:30] Zero Linden: okay all - thanks for indulging my Unicode fetish
  • [9:30] Saijanai Kuhn: You indicated that this would become the standard format for all future protocol docs, right?
  • [9:30] Zero Linden: (which, I admit, is a personal joy of mine!)
  • [9:30] Zero Linden: I've got to run....
  • [9:31] Zero Linden: until next week
  • [9:31] Wyn Galbraith: Thanks for the meeting Zero.
  • [9:31] Kristoffer Drake: ok, cheerio
  • [9:31] Saijanai Kuhn: admires anyone that manages to obsess successfully over unicode
  • [9:31] Qie Niangao: thanks, Zero
  • [9:31] Tree Kyomoon: ciao zero!
  • [9:31] Wyn Galbraith: has to run as well. C U 8tr!
  • [9:31] Saijanai Kuhn: later Zero