User:Zero Linden/Office Hours/2008 Apr 17

From Second Life Wiki

  • [8:30] Wyn Galbraith: has seen the black helios
  • [8:30] Arawn Spitteler: FEMA Helicopters are only the tip of the iceberg
  • [8:30] Wyn Galbraith: What about the training of terrorists
  • [8:30] Minos Deerhunter: hello
  • [8:30] Wyn Galbraith: I couldn't believe that when I heard it on the congress tape.
  • [8:30] Arawn Spitteler: Civilization has always been a balance of Community and Integrity, and the Rich are the Community Leaders.
  • [8:31] Arawn Spitteler: The Rich have certainly always been in charge, of the training of terrorists.
  • [8:31] Tree Kyomoon: the only hope for the poor is theres so damn many of us :)
  • [8:31] Wyn Galbraith: LOL
  • [8:32] Arawn Spitteler: What was on the Congress Tape? Was this Phillip's Presentation?
  • [8:32] Wyn Galbraith: waves at Zha
  • [8:33] Arawn Spitteler: whispers: Is that Torley?
  • [8:33] Tree Kyomoon: greetings madame zha
  • [8:33] Arawn Spitteler: Naw, it's just Giorno, dressed as Torley.
  • [8:33] Zha Ewry: LOL
  • [8:33] Rex Cronon: hi everybody
  • [8:34] Rex Cronon: what is with this physics lag?
  • [8:34] Arawn Spitteler: When Dazzle becomes mainstream, I'd like to see the Torley Skin, but not as a constant diet.
  • [8:34] Tree Kyomoon: Ive long hoped for a physics lag in RL.
  • [8:35] Xugu Madison: Very... Wile E. Coyote
  • [8:35] Arawn Spitteler: has a physics lag, but it interferes with the Cultural Interface, and the paying of the Electric Bill.
  • [8:35] Tree Kyomoon: indeed
  • [8:35] Arawn Spitteler: On the Meat Side, your sim has to interact with your neighbor's sim, and slower sims don't belong where the Money is.
  • [8:36] Wyn Galbraith: LOL, we had that conversation last night, the Coyote RoadRunner thing
  • [8:36] Tree Kyomoon: here comes mr zero
  • [8:36] Zha Ewry: beeps
  • [8:37] Wyn Galbraith: Morning Zero (meeps)
  • [8:37] Tree Kyomoon: its the zero effect
  • [8:37] Arawn Spitteler: welcomes Zero to his office hour: What agenda were we just working on? Yes, the impact of SL on the Agent Domain...
  • [8:37] Zha Ewry: nabs zero's coffee mug, as it goes by, and takes a quickhit
  • [8:37] Wyn Galbraith: Zero is wearing the ever so pleasant grey.
  • [8:38] Zero Linden: is fighting his dazzle camera
  • [8:38] Morgaine Dinova: 'Morning Zero
  • [8:38] Zero Linden: welcome all
  • [8:38] Zha Ewry: Bouncing about is it Zero?
  • [8:38] Zero Linden: sorry I'm a bit late
  • [8:38] Zero Linden: so - there is an agenda already?
  • [8:38] Zero Linden: lay it on me
  • [8:38] Tree Kyomoon: we will recalibrate the time space continuum for you
  • [8:38] Morgaine Dinova: I missed last Tues (no transcript?), so can't suggest any agenda.
  • [8:39] Wyn Galbraith: Coyote or Road Runner? To be honest I think Sai has the agenda.
  • [8:39] Tree Kyomoon: sorry about that, I was out last week
  • [8:39] Arawn Spitteler: We were just discussing Physics Lag on the Meat Side. It occurs to me, that such parallels could be of benefit in this discussion.
  • [8:40] Zero Linden: well - I have one item: Unicode
  • [8:40] Arawn Spitteler's: style of Brainstorming cherishes the fertility of Bull Shit: What were you planning?
  • [8:40] Tree Kyomoon: well Im going to try to put http request cookies on the ole agenda again as my ole standby
  • [8:41] Arawn Spitteler: Unicode, is that the words we use for Characters?
  • [8:41] Zero Linden: That, tree, is more of a current LSL feature request than an architectural item....
  • [8:41] Zha Ewry: 8IN
  • [8:41] Zero Linden: Tree, is there a pjira entry for that
  • [8:41] Zha Ewry: I*N
  • [8:41] Tree Kyomoon: yes
  • [8:42] Tree Kyomoon: I mean, a general webservices model would be good enough
  • [8:43] Arawn Spitteler: would like Client-Side HUD, and wonders if a discussion of that would belong in an Architectural Primer.
  • [8:43] Tree Kyomoon: I just want full access to my island from my webserver
  • [8:43] Zero Linden: AHA - found the workaround for Mac/MightyMouse/Flycam problem
  • [8:43] Zha Ewry: Oh?
  • [8:44] Zha Ewry: Dish
  • [8:44] Morgaine Dinova: Selling your Mac? ;-))
  • [8:44] Xugu Madison: I'm wondering if anyone has an idea of timescale for islands hosted outside LL, similar to the IBM setup, being available...
  • [8:44] Arawn Spitteler: Is that the Camera Smoothing Problem?
  • [8:44] Zero Linden: Preferences > Input & Camera > Joystick Setup > Enable Joystick checkbox set to off
  • [8:44] Tree Kyomoon: yes, Zha if you have an update or info on IBM's plans
  • [8:45] Zero Linden: For some reason it thinks the MightyMouse is a joystick
  • [8:45] Tree Kyomoon: that would be good, i'll store them in some droids and pass them to obiwan
  • [8:45] Zero Linden: Okay - welllllllll
  • [8:45] Zero Linden: I'm going to lead with Unicode
  • [8:46] Zero Linden: I've been working internally on a project to draw up some Unicode guidelines for the company
  • [8:46] Zero Linden: and it brought up some issues that I wondered if anyone here had
  • [8:46] Zero Linden: experience with
  • [8:47] Zero Linden: Turns out, not all big, popular, open source software is truly Unicode compliant
  • [8:47] Zero Linden: Like MySQL 5 and PHP 5
  • [8:47] Zero Linden: Feh!
  • [8:47] Xugu Madison: MySQL isn't unicode compliant? Never had a problem here...
  • [8:47] Zero Linden: In particular, they don't support characters above the Basic Multilingual Plane (that is U+0000 through U+FFFF)
  • [8:47] Leffard Lassard: And mono/.net either. It only supports 2byte characters.
  • [8:47] Arawn Spitteler: Are these still using Latin1?
  • [8:47] Zha Ewry: could offer you a nice industrial Database product.
  • [8:48] Zero Linden: Arawn - they both have UTF8 encoding modes, but only support 16 bit characters
  • [8:48] Zero Linden: There are various ways of tricking them.... and SL does trick them
  • [8:48] Zero Linden: but we sometimes get caught
  • [8:48] Tree Kyomoon: isnt that UTF 16?
  • [8:48] Zero Linden: No, Tree
  • [8:49] Zero Linden: UTF-8 is an encoding scheme that encodes Unicode code points in between 1 and 4 bytes
  • [8:49] Zero Linden: code points that fit within 16 bits sometimes still take 3 bytes in UTF-8
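Editor's note: the point about UTF-8 using between 1 and 4 bytes per code point can be checked with a short Python sketch. The sample characters are chosen here for illustration (only U+10137 appears later in this session), and `utf8_length` is a hypothetical helper name, not something from the discussion.

```python
# Sample code points picked for illustration; only U+10137 is mentioned
# elsewhere in this transcript.
samples = {
    "U+0041 LATIN CAPITAL LETTER A": "\u0041",
    "U+00E9 LATIN SMALL LETTER E WITH ACUTE": "\u00e9",
    "U+4E2D CJK ideograph (Plane 0)": "\u4e2d",
    "U+10137 AEGEAN WEIGHT BASE UNIT (Plane 1)": "\U00010137",
}


def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 needs for a single code point."""
    return len(ch.encode("utf-8"))


for name, ch in samples.items():
    print(f"{name}: {utf8_length(ch)} byte(s)")
```

U+4E2D fits in 16 bits yet takes 3 bytes in UTF-8, which is the case Zero describes; only code points above U+FFFF need the 4-byte form.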
  • [8:49] Zha Ewry: The 3 and 4 byte code points are painful for a lot of people
  • [8:49] Leffard Lassard: Btw. what is the content above the basic multilingual plane?
  • [8:49] Tree Kyomoon: so ASCII maybe is all folks cared about
  • [8:49] Zero Linden: Leffard - do you know if Mono/.net is only supporting 16 bit characters, or is it (like Java pre 1.5) UTF-16,
  • [8:50] Zero Linden: that is supports the extended characters via UTF-16 surrogate pairs?
  • [8:50] Zero Linden: Leffard - well Plane 1 has things like Cuneiform and Linear B
  • [8:50] Leffard Lassard: So, I hacked recently mono and I believe the mono documentation mentions only 16bit characters as the character type.
  • [8:50] Zero Linden: which would be fun to chat in, but no great loss (no one has the fonts, anyway)
  • [8:50] Tao Takashi: Hi
  • [8:51] Zero Linden: But Plane 2 has a huge list of Chinese compatibility characters
  • [8:51] Leffard Lassard: I see. So chinese people are sol.
  • [8:51] Tree Kyomoon: wouldnt that introduce a huge amount of overhead as well? its like 20MB in font size
  • [8:51] Zero Linden: to implement lossless round trip against China's GB18030 character set
  • [8:52] Zero Linden: No, actually most Chinese, I think, is entered using the complete character set that is already
  • [8:52] Zero Linden: within Plane 0
  • [8:52] Arawn Spitteler: What's a Plane?
  • [8:52] Zero Linden: Oh
  • [8:52] Zero Linden: Unicode consists of 17 planes of 65,536 code points
  • [8:52] Zero Linden: (yes, 17, not 16)
  • [8:52] Tree Kyomoon: in online training we use a subset of the Chinese character sets
  • [8:53] Morgaine Dinova: What parts of infrastructure are affected by internationalization? (Excluding client)
  • [8:53] Zero Linden: Plane 0 has code points U+0000 through U+FFFF and covers almost everything you are ever likely to actually see
  • [8:53] Zero Linden: Notice that it fits in 16 bits
  • [8:53] Arawn Spitteler: Including the 6,000 characters of Chinese.
  • [8:54] Zero Linden: Plane 2 has code points U+20000 through U+2FFFF and has these compatibility characters
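Editor's note: the plane arithmetic above can be sketched in a few lines of Python. This assumes the Unicode code space runs from 0 to 0x10FFFF (as Zero states later in the session); `plane_of` and `PLANE_COUNT` are hypothetical names used only for this illustration.

```python
MAX_CODE_POINT = 0x10FFFF  # upper bound of the Unicode code space


def plane_of(code_point: int) -> int:
    """Return the plane number (0-16) that a code point belongs to."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("outside the Unicode code space")
    # Each plane holds 65,536 (0x10000) code points, so the plane number
    # is whatever remains above the low 16 bits.
    return code_point >> 16


# 0x10FFFF sits in plane 16, so counting from plane 0 gives 17 planes.
PLANE_COUNT = plane_of(MAX_CODE_POINT) + 1

print(plane_of(0xFFFF), plane_of(0x10137), plane_of(0x20000), PLANE_COUNT)
```

A code point that fits in 16 bits is by definition in Plane 0; anything in Plane 1 or above needs more than a 16-bit character type.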
  • [8:54] Tree Kyomoon: well in Traditional theres something like 18000
  • [8:54] Zero Linden: Morgaine - well, Unicode compliance is a subtle thing, actually
  • [8:54] Morgaine Dinova: IM is probably 8-bit clean, so I assume it remains clean regardless of char depth.
  • [8:54] Zero Linden: for us I think it means that we have to agree on certain expectations of compliance
  • [8:55] Zero Linden: for example - an easy one is this: "Human readable text and names are entered, stored, generated and displayed as Unicode strings."
  • [8:55] Tree Kyomoon: aha there it is, "simplified chinese level 1 13741 glyphs"
  • [8:55] Zero Linden: Which implies that, no, you can't store arbitrary binary data in a parcel description
  • [8:55] Zha Ewry: And then you get to define, exactly which level of unicode
  • [8:56] Morgaine Dinova: Let's leave the client out of this though. "What's readable" from the client perspective doesn't affect most of the infrastructure.
  • [8:56] Zero Linden: Well, so long as you are compliant with 3.1 or later, you are basically future-compatible there
  • [8:57] Zero Linden: Well, it does, for example, in how one encodes such data
  • [8:57] Leffard Lassard: The problem is not being 8bit-clean. The problem is what does a language/runtime interpret as a character. And for instance mono interprets at most 16bit as a char.
  • [8:57] Tree Kyomoon: could you support the characters but make them a separate download like MS does?
  • [8:57] Zero Linden: so, for example, we can put the parcel name in the content of a <string>...</string> XML element
  • [8:57] Zero Linden: because of the above agreement
  • [8:58] Zero Linden: Tree - I'm not even worried about the ability of a computers font machinery to display them
  • [8:58] Zero Linden: the problem is that if you store them in your back end database and then retrieve them
  • [8:58] Tree Kyomoon: ahhh you need to store high bit parcel names etc. Right gotcha
  • [8:58] Zero Linden: we have to agree if it is "required", "strongly suggested" or just "optional" that you retain them
  • [8:58] Umeko Kawanishi: zero--my first to your office hour, do you usually have an agenda or anybody can ask a question.
  • [8:59] Zero Linden: Umeko - I asked for agenda up front
  • [8:59] Umeko Kawanishi: oh i see. so i was late
  • [8:59] Zero Linden: you can request to add to it at anytime
  • [8:59] Zero Linden: Just remember, it is about architecture of SL, present and future
  • [8:59] Umeko Kawanishi: i have questions about any SL performance testing. who should i talk to?
  • [9:00] Arawn Spitteler: Would that be Which or Benjamin? I know they have hours today.
  • [9:00] Leffard Lassard: LLSD has also the problem. XML serializing there defines a utf8 charset but the parser (at least the libsl one) doesn't support this
  • [9:01] Zero Linden: Probably Aric Linden
  • [9:01] Umeko Kawanishi: ok thanks guys
  • [9:01] Zero Linden: Leffard - what? It is a required part of the XML 1.0 spec that all parsers MUST support UTF-8
  • [9:02] Zha Ewry: What's the de-facto tho?
  • [9:02] Zero Linden: That is - any XML parser MUST be able to parse UTF-8 encoded XML documents --- the spec doesn't care a whit what encoding the application itself uses
  • [9:02] Leffard Lassard: Aha, I see. Perhaps I am wrong here and the xml-reader does support a 4byte type.
  • [9:02] Morgaine Dinova: I'd like to see a shortlist of the parts of infrastructure that are affected by internationalisation issues. A wiki page on the subject would be useful.
  • [9:03] Kristoffer Drake: I agree with this - much more likely bits won't get missed
  • [9:03] Zero Linden: Zha - The XML spec requires support for both UTF-8 and UTF-16 encodings, and makes UTF-8 the default if no encoding is mentioned
  • [9:03] Tree Kyomoon: so if overhead isnt an issue, why wouldnt mysql5 and php support the higher UTFs?
  • [9:03] Zero Linden: hence, UTF-8 is most common that I've seen, though in Asia UTF-16 might be more common
  • [9:04] Arawn Spitteler: doesn't know the relation of UTF-8 and Latin-1
  • [9:05] Morgaine Dinova: Bah. Just go for 128 bits and be safe for the foreseeable future, like with IPv6 and ZFS ;-)
  • [9:06] Arawn Spitteler: envisions a text-chat in hand drawn fonts, using UTF-Key
  • [9:06] Zero Linden: Arawn - Latin1 is a character set, ISO-8859-1, which maps numbers in the range 0 - 255 onto character codes... though 64 of them are control codes that have no graphical character
  • [9:06] Zero Linden: Latin1 is usually encoded by placing one code point per byte
  • [9:06] Zero Linden: UTF-8 is an encoding of the Unicode, which places each code point in between 1 and 4 bytes
  • [9:07] Zero Linden: Unicode is a mapping between numbers in the range 0 and 0x10FFFF and character codes
  • [9:07] Zero Linden: though not all are assigned!
  • [9:07] Leffard Lassard: I have a reference to .net and unicode: [1] They say something contradictory unfortunately.
  • [9:07] Arawn Spitteler: So, Latin-1 could be mapped to UTF-2?
  • [9:07] Zero Linden: What is UTF-2?
  • [9:08] Zero Linden: Latin1 is effectively both a character mapping and an encoding
  • [9:08] Zero Linden: Here is the common confusion
  • [9:08] Zero Linden: ASCII forms a proper subset of BOTH Latin1 and UTF-8
  • [9:08] Tree Kyomoon:  ?
  • [9:08] Kristoffer Drake: Just what I was thinking Tree!
  • [9:08] Leffard Lassard: Abstract is: chars are 16bit wide, strings are a sequence of chars basically. But reader classes do nevertheless support utf8.
  • [9:09] Zero Linden: That is - the letter A is mapped to the same code point in Latin1 and Unicode, and the same single byte value in both Latin1 and UTF-8 encodings
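Editor's note: a quick way to see the "common confusion" is to encode the same characters both ways. This is a minimal Python sketch; the choice of "é" (U+00E9) as the non-ASCII example and the helper name `encode_both` are assumptions made for illustration.

```python
def encode_both(ch: str):
    """Return the (Latin1, UTF-8) byte encodings of one character."""
    return ch.encode("latin-1"), ch.encode("utf-8")


# An ASCII letter: same code point, same single byte in both encodings.
latin1_a, utf8_a = encode_both("A")

# A non-ASCII Latin1 letter: same code point (U+00E9) in Latin1 and Unicode,
# but one byte in Latin1 versus two bytes in UTF-8.
latin1_e, utf8_e = encode_both("\u00e9")

print(latin1_a, utf8_a)
print(latin1_e, utf8_e)
```

The encodings agree only across the ASCII range, which is why text that is mostly English can appear to work under the wrong encoding until a single accented character arrives.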
  • [9:09] Arawn Spitteler: 2?
  • [9:10] Zero Linden: Leffard - do you know if they decode characters above U+FFFF into pairs of 16bit chars? This is what Java did since it too made the mistake of defining char to be 16 bits
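Editor's note: the "pairs of 16bit chars" Zero asks about are UTF-16 surrogate pairs. The sketch below shows the standard UTF-16 arithmetic in Python; `to_surrogate_pair` is a hypothetical helper name, and nothing here claims to describe how Mono/.NET or Java implement it internally.

```python
def to_surrogate_pair(code_point: int):
    """Split a code point above U+FFFF into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP use surrogate pairs")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x10137)
print(f"U+10137 -> {high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder (big-endian, no BOM).
encoded = "\U00010137".encode("utf-16-be")
print(encoded.hex())
```

A runtime with a 16-bit `char` type that uses UTF-16 would therefore report two chars for this one character, which is the behaviour Zero is asking Leffard to confirm.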
  • [9:11] Morgaine Dinova: This is all very interesting ... but I'm trying to see where it affects infrastructure.
  • [9:11] Zha Ewry: It impacts what people will have to/want to build
  • [9:11] Zha Ewry: In terms of parser/on wire formats
  • [9:12] Zha Ewry: Not a huge hit, but a hit
  • [9:12] Zero Linden: Leffard - can you parse <x>&#x10137;</x> and see how many characters you get?
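Editor's note: the test Zero proposes (parse an element whose content is the character reference for U+10137 and count the characters) can be sketched in Python with the standard library's ElementTree parser. This only shows what Python reports; it is an illustration of the test, not a result for Mono or libsl, and the variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

# The document from the chat: one element containing the numeric
# character reference for U+10137.
document = "<x>&#x10137;</x>"
text = ET.fromstring(document).text

code_points = len(text)                           # Python str counts code points
utf16_units = len(text.encode("utf-16-le")) // 2  # what a 16-bit char type would count
utf8_bytes = len(text.encode("utf-8"))

print(f"code points: {code_points}")
print(f"UTF-16 code units: {utf16_units}")
print(f"UTF-8 bytes: {utf8_bytes}")
print(f"value: U+{ord(text):04X}")
```

An environment that answers "2 characters" is counting UTF-16 code units; one that fails or returns a replacement character is not handling code points above U+FFFF at all.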
  • [9:12] Tree Kyomoon: seems like if the infrastructure has to support UTF 16 everywhere, it would be more overhead
  • [9:12] Leffard Lassard: Hmm. I will look at this page. It doesnt state it explicitly. I can do that.
  • [9:12] Zero Linden: Morgaine - it matters because we need to be clear with what we think we are dealing with - too much laxness in this area leads to things like engineers just
  • [9:13] Zero Linden: thinking that characters are 16 bit
  • [9:13] Zha Ewry: Exactly
  • [9:13] Zha Ewry: We're setting the bar for parsers and data structures
  • [9:13] Saijanai Kuhn: anyone have the first 40 minutes LOL/
  • [9:13] Zha Ewry: "You must expect and handle a 4 byte field, here, HERE, and HERE"
  • [9:13] Tree Kyomoon: 𐄷
  • [9:13] Saijanai Kuhn: transcript thereof?
  • [9:13] Morgaine Dinova: Zero: yeah, but you still haven't answered WHERE, merely WHY :-)
  • [9:14] Zero Linden: or thinking that they can push arbitrary binary data through
  • [9:14] Zero Linden: well - in all our string fields
  • [9:14] Zero Linden: I want us to be aware that we think strings are Unicode Strings - further refined by what XML can handle (it outlaws 13 control codes)
  • [9:15] Zha Ewry: This is one of those tiny, but vital things
  • [9:15] Saijanai Kuhn: accepted your inventory offer.
  • [9:15] Zero Linden: And then we have to be aware that if we ever say something like "the Foobar with a name that matches the Whatsit title"
  • [9:15] Zero Linden: that this involves normalization and comparison
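Editor's note: a small Python sketch of why "a name that matches" involves normalization. The discussion does not say which normalization form or comparison rules would be used; NFC and a plain equality test are assumptions here, and `names_match` is a hypothetical helper.

```python
import unicodedata


def names_match(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC (an assumed policy)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The same visible name, stored two ways:
composed = "caf\u00e9"      # 'e with acute' as one precomposed code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

print(composed == decomposed)             # raw comparison of code points
print(names_match(composed, decomposed))  # comparison after normalization
```

The two strings render identically but differ code point by code point, so any "matches" rule has to state whether normalization (and possibly case folding, which this sketch omits) happens before the comparison.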
  • [9:15] Saijanai Kuhn: Thanks tree. Still can't puzzle out thatmessage though :-(
  • [9:16] Zero Linden: &#x10137; is the XML character reference to U+10137 AEGEAN WEIGHT BASE UNIT
  • [9:16] Zero Linden: which looks like 𐄷
  • [9:16] Zero Linden: which - oddly enough, my Mac has a glyph for! but it doesn't display in SL
  • [9:17] Saijanai Kuhn: the wrong fonts are merged on-the-fly for that
  • [9:18] Tree Kyomoon: loves playing with Google Translator
  • [9:18] Zha Ewry: listens and watches
  • [9:19] Zero Linden: Leffard - I would really appreciate it if you can try parsing that XML document and letting me know what it produces
  • [9:19] Zero Linden: e-mail me the result, if you would...
  • [9:21] Saijanai Kuhn: Zero, a tiny documentation issue for a sec?
  • [9:21] Zero Linden: Please, Sai
  • [9:21] Saijanai Kuhn: admires the program that took over his screen while he was typing
  • [9:21] Tree Kyomoon: internet exploder just shows it as a neato little square
  • [9:22] Saijanai Kuhn: the current docs are based on the text-only format of the RFC. MS (and probably others) add a little eyecandy to their online/pdf version to make it more readable while not making it any harder to convert to text-only
  • [9:23] Zero Linden: by the way, just so you know that this stuff *matters* - last Wednesday, I got pulled out of a meeting to help five other engineers puzzle out a problem in the database
  • [9:23] Saijanai Kuhn: things like italics, bold face and two-tone tables to list attributes
  • [9:23] Zero Linden: related to a group name that had this character in it: ❤
  • [9:23] Tree Kyomoon: awww
  • [9:23] Zero Linden: Which, for you Mac people (self included) is U+2764 HEAVY BLACK HEART
  • [9:24] Zero Linden: which is how we felt after it took the five of us an hour to puzzle it out, then two senior devs five hours more to repair the database
  • [9:24] Saijanai Kuhn: has a little object that shows the difference between chat, and floating text for utf. Very confusing at times
  • [9:24] Zero Linden: and all because of lack of attention to the charset encoding issues
  • [9:24] Tree Kyomoon: its a heart on my pc running vista too
  • [9:24] Saijanai Kuhn: anyway, was wondering if we could add some simple eyecandy to the draft SLGOGP to make it easier to read as long as it didn't make it harder to turn into a text-only document
  • [9:24] Zha Ewry: and XP
  • [9:25] Zero Linden: Saijanai - which version do you mean, the wikitext version?
  • [9:25] Zero Linden: I generate the docs as follows
  • [9:26] Saijanai Kuhn: right
  • [9:26] Zero Linden: XML Specification based source --(xslt)--> HTML --(print menu in Safari)--> PDF
  • [9:26] Zero Linden: XML Specification based source --(xslt)--> MediaWiki wikitext
  • [9:26] Tree Kyomoon: for posterity we should try to describe in the log what 10137 is
  • [9:27] Tree Kyomoon: unfortunately its just a square here too: [2]
  • [9:27] Zha Ewry: waves to posterity
  • [9:27] Saijanai Kuhn: does the XML Specification not support, say, 2-tone table output?
  • [9:27] Zero Linden: for posterity: U+10137 looks like hanging balance scale: a T with a small triangle hanging off each arm of the T
  • [9:27] Saijanai Kuhn: e.g. page 13-14 of this: [3]
  • [9:28] Zero Linden: XML Specification doesn't say anything about how the rows of the table should be presented... that's all in the XSLT
  • [9:28] Zero Linden: and the .css that accompanies the HTML version
  • [9:28] Tree Kyomoon: searches my UTF 16 keyboard for the mini scale key....
  • [9:28] Leffard Lassard: Zero: Yeah, but I dont get it right now. I can try a time and send you an IM or email with a testprogram for mono that shows the results.
  • [9:29] Zero Linden: Leffard - that would be wonderful - thank you
  • [9:29] Saijanai Kuhn: also, is there a place I can find that. I've been manually tweaking the strawman login rez_avatar to look like the SLGOGP. I guess I should be working with the XML SPecification stuff directly instead of tweaking wiki s-ecific code
  • [9:29] Saijanai Kuhn: wiki-specific *
  • [9:29] Zero Linden: uhm, yes - I'll put up my tool chain info
  • [9:30] Tree Kyomoon: pities the foo that had to carve out the original UTF 16 character set in the 17th century
  • [9:30] Zero Linden: okay all - thanks for indulging my Unicode fetish
  • [9:30] Saijanai Kuhn: You indicated that this would become the standard format for all future protocol docs, right?
  • [9:30] Zero Linden: (which, I admit is a personal joy of mine!)
  • [9:30] Zero Linden: I've got to run....
  • [9:31] Zero Linden: until next week
  • [9:31] Wyn Galbraith: Thanks for the meeting Zero.
  • [9:31] Kristoffer Drake: ok, cheerio
  • [9:31] Saijanai Kuhn: admires anyone that manages to obsess successfully over unicode
  • [9:31] Qie Niangao: thanks, Zero
  • [9:31] Tree Kyomoon: ciao zero!
  • [9:31] Wyn Galbraith: has to run as well. C U 8tr!
  • [9:31] Saijanai Kuhn: later Zero