Message Liberation Forum Transcript

From Second Life Wiki

Message Liberation SLDev Forum Tuesday 30 Oct., 2007 Zero Linden

Introduction

  • You: Hi, everyone, thank you very much for joining us this afternoon (in SLT). Zero Linden is one of the main contributors to Second Life architecture and is leading the Open Architecture Group effort on the Second Life future architecture. He's also been busy working on our messaging architecture.
  • Zero Linden: So - Message Liberation was a project that my studio (a group of engineers at Linden) spent about six months doing. It was a ton of work, and had almost no direct visible impact when it rolled out.
  • Stephen Psaltery: Those are the best kinds of features!
  • Zero Linden: The back story is this: We used to have a communications substrate, the "Message System" that was used for passing all messages around the system, between viewer and simulator, and between simulator and other back-end servers. This substrate had a terrible property: when you added or changed a message, the binary-on-the-wire format of other messages could change. Which meant that you had to do a FULL grid down and up - upgrade every system to the current version AND make every user download a new viewer anytime you changed a message.
  • Zero Linden: Message Liberation changed the encoding so that it no longer had this problem. However, the price of this was a significant restriction in what you could do when changing a message. At the same time, we added a new message pathway, one using TCP, HTTP and LLSD (an encoding of data) that was fully forward and backward extensible. If you moved a message to this new format, you could change it to your heart's content in the future and still have a forward and backward compatible format. This was rolled out beginning of July. And indeed, since then, we've had only one required viewer download, and that was for security reasons. Which means, I think, we succeeded.
  • Tommy Parrott: yes
  • Peter Newell claps
  • Marcoh Larsen nods
  • Rui Clary: YEEEEEE
  • Zero Linden: Most updates can now be done by rolling restart of simulators, and then offering an optional viewer to get the new feature.
  • You: Hence, no more 5 hours down every Wednesday
  • Peter Newell: meaning... you took away our Wednesday weekend :P
  • Zero Linden: Peter - you are the only person who misses 'em!
  • Zero Linden: So we still have whole grid downtimes - but those are due to operations issues we still don't know how to avoid (like some DB schema changes).
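
The transcript does not say why changing one message could alter the wire format of others. As a sketch only, one plausible mechanism is that message IDs were derived implicitly from each message's position in a shared template, so inserting a message renumbered everything after it. The message names and the one-byte ID layout below are hypothetical and are not taken from the real Message System.

```python
import struct

def assign_ids(template):
    """Number messages by their position in the template (implicit IDs)."""
    return {name: index for index, name in enumerate(template, start=1)}

def encode(ids, name, payload):
    """Pack a one-byte message ID followed by the raw payload bytes."""
    return struct.pack("B", ids[name]) + payload

# Hypothetical templates: v2 inserts one new message in the middle.
template_v1 = ["StartPing", "ChatMessage", "AgentUpdate"]
template_v2 = ["StartPing", "NewFeature", "ChatMessage", "AgentUpdate"]

ids_v1 = assign_ids(template_v1)
ids_v2 = assign_ids(template_v2)

old_bytes = encode(ids_v1, "ChatMessage", b"hi")
new_bytes = encode(ids_v2, "ChatMessage", b"hi")

# The ChatMessage definition never changed, yet its bytes on the wire did,
# so a v1 peer and a v2 peer can no longer understand each other.
print(old_bytes, new_bytes)
```

Under this assumption, every endpoint has to share exactly the same template, which is consistent with the full grid down-and-up that Zero describes.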

Discussion and Questions

Compatibility

  • Marcoh Larsen: How did you make the format forwards and backwards compatible? How did you design such a format?
  • Zero Linden: Well- it weren’t so easy (hold on - getting some props)
  • Zero Linden: Okay - can everyone see that? So - this is a little detailed, but don't worry about the details. In the old message system, the code sends and receives via UDP and binary packing. Stage one is: enable some messages to be encoded in LLSD. LLSD is a structured data system: Think arrays and maps and numbers and strings like in Perl or Python or PHP. We have LLSD support in C++, Perl, Python and PHP. You can build a data structure in any of those languages.
  • Marcoh Larsen: LLSD stands for?
  • Zero Linden: Linden Lab Structured Data
  • Zero Linden: Since arguments to messages are generally stored as a map (key/value pairs), you can see that messages can have arguments added to them and still go to older code, and newer code can detect when older code sends to it and a newer argument is missing. Of course, the programmer still has to do the work to figure out how to support older code, but at least they are able to, because the data encoding is taken care of.
  • Zero Linden: So here, in phase one, messages can be sent via either encoding and when received, are passed to old code, using the old API. At this stage we can send messages either way, but the code doesn't change.
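
The two ideas above (arguments carried as a map, and either encoding being delivered to the same handler) can be sketched roughly as follows. This is an illustration under stated assumptions, not Linden Lab's code: JSON stands in for an LLSD serialization, and the message fields (`text`, `channel`, `extra`), the one-byte length-prefixed binary layout, and all function names are hypothetical.

```python
import json
import struct

def handle_chat(message):
    """Handler written against a map of arguments, not a byte layout."""
    text = message["text"]
    # "channel" is a hypothetical argument added in a later version;
    # an older sender omits it, so fall back to a default.
    channel = message.get("channel", 0)
    # Keys this handler does not know about (e.g. "extra") are ignored.
    return f"[{channel}] {text}"

def decode_binary(data):
    """Old-style path: fixed layout, a one-byte length then UTF-8 text."""
    (length,) = struct.unpack_from("B", data, 0)
    return {"text": data[1:1 + length].decode("utf-8")}

def decode_structured(data):
    """New-style path: JSON stands in for an LLSD serialization here."""
    return json.loads(data)

# An older sender uses the binary layout and knows nothing about "channel".
old_sender = b"\x05hello"
# A newer sender uses the structured encoding and adds extra arguments.
new_sender = json.dumps({"text": "hello", "channel": 3, "extra": True}).encode()

# Both paths end in the same handler, so the handler code does not change.
print(handle_chat(decode_binary(old_sender)))
print(handle_chat(decode_structured(new_sender)))
```

As Zero notes, the encoding only makes compatibility possible; choosing a sensible default for a missing argument is still the programmer's job.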

Stage 2

  • Zero Linden: In stage two, we add new messages, or move messages in new code over to using the new LLSD encoding natively.
  • Tommy Parrott: Will this result in an apparent slowdown? (The second processing)
  • Zero Linden: Well - you know, the cost of encoding and decoding is small in comparison to the processing of the message
  • Tommy Parrott: true
  • Zero Linden: So, no, not really much of a slow down. Besides, you'd be shocked to see how much work the binary encoding took.....
  • Marcoh Larsen: This was in viewer 1.17 or so?
  • Zero Linden: That encoding has been in for years; in 1.18 is where we went to what you see now.
  • Zero Linden: Notice that new messages and most extended messages will go via the newer LLSD and TCP/HTTP methods

Are the older messages still used?

  • Marcoh Larsen: Are the older messages still used in some places?
  • Zero Linden: For now, 90% of messages are still old style. There is no need to change them if there is no change to them
  • Marcoh Larsen: true
  • Zero Linden: We do, honestly, care about stability!
  • Marcoh Larsen: we know ;-)

Stage 3

  • Zero Linden: Now, for 99.9999% of all messages (might be exaggerating a bit there), LLSD/TCP/HTTP is a better fit, but there are a few messages for which UDP is probably right; for those, we'll keep the old system. BUT, if we need to, we can use the as-yet-unimplemented stage 3. In this stage, newer code can bridge the encoding back to the old style, which enables us to keep those messages in binary/UDP. So far, we've had no need to do this. So we are here today, and that's message liberation.
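
Stage 3 is described as unimplemented, so the following is only a guess at what "bridging back" could look like: newer code works with a map, and a thin bridge packs it into a fixed binary layout suitable for a UDP datagram. The position-update message, its three-float layout, and the function names are all hypothetical.

```python
import struct

# Hypothetical fixed layout for a small, frequent message:
# three little-endian 32-bit floats for a position.
POSITION_LAYOUT = struct.Struct("<3f")

def bridge_to_binary(message):
    """Pack a map-style message into the old fixed binary layout."""
    return POSITION_LAYOUT.pack(message["x"], message["y"], message["z"])

def bridge_from_binary(data):
    """Unpack the old binary layout back into a map for newer code."""
    x, y, z = POSITION_LAYOUT.unpack(data)
    return {"x": x, "y": y, "z": z}

# Newer code builds a map; the bridge keeps the datagram small for UDP.
update = {"x": 128.0, "y": 64.5, "z": 22.25}
packet = bridge_to_binary(update)
restored = bridge_from_binary(packet)

print(len(packet), restored)
```

The trade-off is the one mentioned in the introduction: a fixed layout like this is compact, but it cannot gain new arguments without breaking older peers.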

What was the scope of this project?

  • Marcoh Larsen: How many man-hours were put into this? approx?
  • Zero Linden: About four engineers for six months. It was quite a hole to have to dig ourselves out of and there are so many combinations to test.
  • Peter Newell: Did they get a big party when it launched successfully? ^^
  • Zero Linden: We got lots of love. What was hard was that there was quite a bit of code that had violated the interface of the message system - a "leaky abstraction" if you will - and so that code was somewhat dependent on the implementation of the message system and the specifics of how it was implemented, so we needed to recode quite a number of messages in some cases, and in others, replicate quirks that the old system had.
  • Peter Newell: Awesome, well thanks for the further detail
  • Zero Linden: Are there other questions? and I'm happy to talk on other topics of SL architecture.

How is voice implemented?

  • Marcoh Larsen: Thanks for this explanation. How is voice implemented?
  • Zero Linden: slinky-network. Actually - it is a fairly traditional voice-over-IP affair, done by an outside vendor. It runs on their servers. The only real trick is that the regions have to feed position data to the voice servers for localization. Well, actually I think the viewers feed the data, and the sims authenticate that you're actually in the region and give information about parcels (for when there are private voice channels in a parcel), but really, it is kind of a standard VOIP system beyond that.
  • Marcoh Larsen: So no extra traffic through the SL-servers ;-)
  • Zero Linden: Not much - just the extra data telling the voice servers who is there - the voice never hits our network.
  • Marcoh Larsen: A bit like streaming audio/video...
  • Zero Linden: yup
  • Marcoh Larsen: Thanks, never knew how it worked actually.

Effect of Message Liberation on the viewer

  • You: Zero, does Message Liberation directly affect what people can do with the OS viewer?
  • Zero Linden: Well - it actually means they don't need to grab the latest source and re-integrate every time we change our viewer, so it is a general win for OS viewers: the message system will now support an older viewer. Of course, if there are security issues, that might cause us to release a viewer that is a required upgrade, though I'm pretty sure that old OS viewers can get in unless we need to add a required new handshake. But in general, it is more stable now due to message liberation.
  • Gigs Taggart: You have to change the version number the older viewer reports I believe so that it pretends to be new enough.
  • Zero Linden: Probably. I think there is a way you can declare "I'm a variant viewer, don't check my version," but I'm not sure what the status of that feature is right now.
  • Gigs Taggart: This is out of my league, but I think you can just set the version to some really high number and you'll be ok then.
  • Marcoh Larsen: There must be for the OS-viewers...

Why VPN between LL data centers

  • Marcoh Larsen: The servers are in two separate cybercenters ... Why is there a VPN-connection between them when all the other data goes simply over the Internet?
  • Zero Linden: That's good! Well, right now, like many Internet applications, there is basically universal trust between the various servers that make up the backend. So, when the simulator talks to the database, or to the presence servers, all of that is in an environment of trust between pieces. Now, currently there are servers that are used by all simulators in the grid: for example, the inventory databases. So, any region in the grid might need to be able to talk to any inventory database, and so at this point we keep all those databases in SF, and the sims in Dallas need to talk to them. But we don't want to run that traffic, a MySQL TCP connection, over the public network. So, those things go over the encrypted tunnel. In the future, when more of our backend is connected to itself via HTTP and capabilities, we can put the cross-data-center traffic over HTTPS and let it fly over the public Internet, which will be a big win.
  • Marcoh Larsen: Opening up new datacenters will become a lot easier then.

VPN traffic

  • Gigs Taggart: How many mbits/sec are you pushing over that VPN?
  • Zero Linden: I don't have the bandwidth number, but let's say it is an issue!
  • Zero Linden: Yes - it would be a BIG win -- did I mention that?

Close

  • Kim Anubis: What are you working on next, Zero?
  • Zero Linden: Next? Opening up the Second Life Grid through open protocols to allow an internet-scale Second Life
  • Kim Anubis: Yay!
  • Zero Linden: Oh, and then after that, making pancakes.
  • Kim Anubis: I'll bring syrup hehe
  • Peter Newell: mmmm
  • Peter Newell: pancakes
  • Tommy Parrott: Hmmmm... buttermilk pancakes....
  • Zero Linden: You're on, Kim
  • Tommy Parrott drools
  • Kim Anubis: :)
  • Zero Linden: Well - thanks for having me. You may or may not know, I have office hours twice a week, and we discuss a range of architectural issues of SL (see Office_Hours#Zero_Linden).
  • You: Zero, thank you very much for joining us today and taking time to explain what Message Liberation is and what it means for Second Life. And, thank all of you for coming today!