User:Which Linden/Office Hours/2008 Feb 21
- [11:05] Which Linden: good morning!
- [11:05] Morgaine Dinova: Woohoo, hiya Which!
- [11:05] Zha Ewry: Inbound Bamboo!
- [11:05] Which Linden: long time since I've seen you folks!
- [11:06] Morgaine Dinova: Hehe, reminds me of the call on pulling mobs in EverQuest, "Incoming bamboo, one per message!"
- [11:07] Which Linden: So what's new?
- [11:07] Morgaine Dinova: (You'd set up a function key to issue that message with <target> set to the targeted mob, and hit it as many times as the number of mobs you pulled :-)
- [11:08] Morgaine Dinova: Not much new, but we had a good discussion at Zero's.
- [11:08] Which Linden: What about?
- [11:10] Morgaine Dinova: Various topics, but an interesting one was serialization of sim data and the state transitions the sim would go through to get it saved when the sim was going down.
- [11:10] Which Linden: Ha ha, cool
- [11:10] Morgaine Dinova: Quite complex.
- [11:11] Morgaine Dinova: Sounds like a ton of error handling code there, although we didn't go into that.
- [11:11] Which Linden: I saw an interesting talk from Cryptic on their new MMO, where they push state live to a database
- [11:11] Morgaine Dinova: Ah, no
- [11:11] Morgaine Dinova: Cryptic is a Linden?
- [11:12] Which Linden: No, they are a game developer, that made City of Heroes
- [11:12] Which Linden: It was just interesting to see how they solved a similar problem
- [11:13] Which Linden: Predictably, the biggest problem they had was that their database wasn't fast enough
- [11:13] Morgaine Dinova: Oh, that's ArenaNet. They're an amazing outfit. They do Guild Wars too, and that game has superb scalability and an almost total absence of downtime. Very impressive
- [11:13] Which Linden: That is impressive
- [11:14] Which Linden: One of the interesting takeaways of the talk: "players care much more about availability than data integrity"
- [11:15] Morgaine Dinova: Well the real rown jewels is to avoid that being an either/or.
- [11:15] Morgaine Dinova: crown*
- [11:15] Which Linden: Yeah
- [11:15] Which Linden: But enough about other companies, want to finish the discussion about short-term CHTTP?
- [11:16] Morgaine Dinova: No harm taking the best from other companies. I'm sure they're eyeing the best of SL too :-)
- [11:16] Which Linden: Oh yah, for sure, but it's all proprietary. The best you can get is a hand-wavey description (even from us)
- [11:17] Morgaine Dinova: Yep. What aspects haven't been covered?
- [11:17] Morgaine Dinova: How about scalability of the mechanism? Don't think we've looked at that.
- [11:17] Which Linden: Yes, that's important
- [11:17] Which Linden: It should be more scalable than regular CHTTP
- [11:18] Which Linden: I.e. be able to handle more requests per second
- [11:18] Saijanai Kuhn: sorta back
- [11:18] Which Linden: Hey Sai
- [11:18] Which Linden: I think that the only way to improve the scalability of chttp is to avoid persisting anything to disk.
- [11:19] Morgaine Dinova: So, let's sketch out the dimensions: for starters, number of concurrent disjoint transactions, number of participant objects, number of participant agents ... what else?
- [11:19] Which Linden: What do you mean by participant objects?
- [11:19] Morgaine Dinova: Items involved in the atomic transfer. It's not just money.
- [11:20] Morgaine Dinova: It could be a barter operation, 3 doughnuts in exchange for 5 geese and a vibrator :-)
- [11:22] Which Linden: Sorry I was gone for a sec there, someone came up to me in person with an urgent request
- [11:23] Morgaine Dinova: They have no shame!!!! :P
- [11:23] Which Linden: For sure
- [11:23] Which Linden: So, OK, well, for the CHTTP stuff I'm scoping it down to a single message
- [11:24] Morgaine Dinova: The message carries a UUID set though, detailing the items involved in the transaction, right?
- [11:24] Which Linden: Well, it could, or it could just be a message with no transactions at all
- [11:24] Which Linden: Er, well, I guess a message is a transaction
- [11:25] Which Linden: Hm, I think I see what you mean
- [11:25] Which Linden: Let me see if I can put it in my own terms.
- [11:25] Morgaine Dinova: Yes, that's the degenerate case, i.e. just an atomic single item transfer, the atomicity simply ensuring that the giver doesn't retain a copy if the recipient got it.
- [11:26] Zha Ewry: lurks listening but mostly in other windows
- [11:26] Which Linden: By its very nature, chttp is a transaction engine, though the transactions might be pretty small
- [11:26] Which Linden: I.e., "flip this bit"
- [11:28] Which Linden: We ran a profile on chttp as it is, and for one received message it makes 18 queries and 5 commits
- [11:28] Which Linden: That is a lot.
- [11:28] Morgaine Dinova: If the mechanism is parametrized by a function that details what the actual work to be done is, then yes, it could be just "flip a bit and ensure all participants know it was flipped (or failed) and do it only once no matter how many times it was requested".
- [11:28] Which Linden: Well, so here's the distinction between the escrow and chttp
- [11:29] Which Linden: The escrow coordinates between an arbitrary number of participants
- [11:29] Which Linden: Chttp coordinates between exactly two
- [11:30] Which Linden: So, the most you can do with a single chttp message is tell a host "perform this transaction on yourself"
- [11:30] Morgaine Dinova: Well the scalability concerns of the two differ slightly then, since the former is layered on the latter.
- [11:31] Which Linden: Yes, exactly
- [11:31] Morgaine Dinova: Which is fine
- [11:32] Which Linden: Yeah, so, basically, right now chttp is so expensive that you wouldn't want to expose a chttp server directly to the public internet
- [11:33] Which Linden: Well, you could, and it'd probably be fine, but you'd be more susceptible to DoS than a regular http server
- [11:33] Which Linden: (depending on your database implementation)
- [11:34] Morgaine Dinova: Re the profiling ... I don't like the sound of that. Shouldn't the process for chttp (unlike escrow) entail two queries (one per participant) and only one single final commit to seal or fail the transaction?
- [11:35] Which Linden: Absolutely, it could be cut down
- [11:35] Which Linden: I think the client still has to make 2 commits no matter what
- [11:35] Which Linden: One commit for the outgoing message, another for the received response.
- [11:35] Morgaine Dinova: Escrow adds one query per participant agent and object to that, but still only one commit. It's in core until the reqs are all satisfied, and then Big bang commit.
- [11:36] Which Linden: And the server has 2 as well, actually, one for the actual transaction and another for the tombstoning
- [11:36] Which Linden: Though we could just skip tombstoning
- [11:37] Morgaine Dinova: When you say "client", you mean c-s within the grid, right? Not user's client.
- [11:37] Which Linden: Yeah
- [11:37] Which Linden: The client from the perspective of the chttp message
- [11:37] Morgaine Dinova: Yeah
- [11:39] Which Linden: So, yeah, while developing chttp we were a bit wasteful on the commits.
- [11:40] Morgaine Dinova: I think I need to revisit the escrow logic, because of the commits. There should be only a single commit to seal the fulfillment of escrow, and I don't see how that tallies with it being layered on top of a chttp that does its own commits.
- [11:40] Which Linden: Each worklog.persist() call translates to a commit
- [11:41] Which Linden: Each creation of a worklog translates to like 3 commits (!)
- [11:41] Which Linden: The client, making an outgoing request, calls worklog.persist() twice
- [11:41] Morgaine Dinova: Any persist() that gets undone on escrow failure is a wasted commit --- it shouldn't be needed.
- [11:42] Which Linden: Undone in the sense that the undo() method is called?
- [11:42] Morgaine Dinova: Yes, assuming that's what you call if all the requirements weren't met.
- [11:43] Which Linden: Yeah. Well, we need those commits to preserve deterministic replay
- [11:43] Morgaine Dinova: thinks
- [11:43] Which Linden: We could write the escrow differently so that it could get by without that, but I considered that readability and maintainability trumped perfect efficiency
- [11:43] Morgaine Dinova: I don't think that you NEED deterministic replay if the escrow fails.
- [11:44] Morgaine Dinova: It simply never happened. Nothing got transferred.
- [11:44] Which Linden: Yes, I guess that's true, you could just stick a flag on the beginning
- [11:44] Which Linden: You need deterministic replay until the undo is finished though
- [11:44] Zha Ewry: Gotta run
- [11:44] Zha Ewry: Laters people
- [11:45] Which Linden: I.e. if you crash partway through an undo you want to be able to pick that up
- [11:45] Which Linden: Later Zha, thanks for stopping by!
- [11:45] Morgaine Dinova: In fact, you may be storing up a can of worms if you employ deterministic replay, because it has to be reliable, so it's another point of failure.
- [11:45] Morgaine Dinova: Cyu later Zha
- [11:45] Which Linden: How is it an additional point of failure?
- [11:46] Which Linden: I agree that it's possible to write code that doesn't play well with deterministic replay
- [11:46] Which Linden: But bad code is bad code, no framework will protect you from that
- [11:46] Morgaine Dinova: The replay list has to be preserved, and executed.
- [11:46] Which Linden: Yup
- [11:46] Morgaine Dinova: And until it's executed, the resources it's holding have disappeared.
- [11:47] Which Linden: Um, I guess
- [11:47] Which Linden: Yea
- [11:47] Saijanai Kuhn: bamibino's mom is sick picking her up from work
- [11:47] Saijanai Kuhn: Later all
- [11:47] Which Linden: k, take it easy, sai
- [11:48] Morgaine Dinova: Whereas if you invert the logic and keep the resources in situ but marked with transaction-pending status, then you don't need anything except garbage collection to clean it up.
- [11:48] Morgaine Dinova: Cyu Sai, take care
- [11:49] Which Linden: There's nothing about the escrow that prevents the in-situ implementation
- [11:49] Which Linden: For some things, that will make sense
- [11:49] Which Linden: For others, it won't
- [11:49] Which Linden: I guess I've just been thinking about that as an implementation detail
- [11:50] Morgaine Dinova: Sure, but I'm worried about the commits because we started to talk about scalability, so lots of commits per escrow == bad :-)
- [11:50] Which Linden: Oh, yeah, well, commits == performance, not scalability
- [11:51] Which Linden: I.e. it's already scalable cause you can just throw more boxes at it to alleviate the problem of too many commits
- [11:52] Which Linden: The line between performance and scalability is always blurry though
- [11:52] Morgaine Dinova: Time per overall transaction is inversely proportional to scalability per escrow server. So it state size, but at least memory+disk is growing in size fast.
- [11:52] Morgaine Dinova: So is*
- [11:53] Which Linden: Yeah, that's true
- [11:53] Which Linden: Currently it's not horrible, but we can definitely take it further in that regard
- [11:53] Morgaine Dinova: Whereas they tell us that cores aren't going to get quicker, and the growth in number of cores isn't all that fast,
- [11:53] Morgaine Dinova: So really minimizing the time per overall transaction is the key to scalability in this case.
- [11:54] Which Linden: I believe that this is mostly a disk i/o issue, actually.
- [11:54] Morgaine Dinova: In one dimension anyway.
- [11:54] Which Linden: Each commit hits the disk, so you're gated on the speed of the disk
- [11:54] Morgaine Dinova: There's another issue here.
- [11:57] Which Linden: This is gonna be an essay!
- [11:57] Which Linden: :-)
- [11:57] Morgaine Dinova: The commits within the overall transaction are not (you can't split the relevant tables), whereas the commits from different escrows are entirely disjoint. This means that your scalability is not badly limited by disk across a large number of escrows, but severely limited within any given escrow server.
- [11:57] Morgaine Dinova: i only do essays ;-)
- [11:57] Morgaine Dinova: And theses :-)
- [11:58] Which Linden: Yes, I agree with what you said
- [11:58] Morgaine Dinova: s/are not/are not scalable/
- [11:58] Morgaine Dinova: Sorry
- [11:58] Which Linden: I understood it. :-)
- [11:58] Morgaine Dinova: s/are not/are not disjoint/
- [11:58] Morgaine Dinova: Ah cool :-)
- [11:59] Which Linden: This little chat box is not ideal for composing quantities of text.
- [11:59] Morgaine Dinova: Aye
- [11:59] Which Linden: Are you saying that you're interested in improving the per-transaction performance/scalability?
- [12:00] Morgaine Dinova: And the Linux client seems to have a very odd hesitancy in the text box, which the Windows one doesn't. Odd
- [12:00] Which Linden: Could be a framerate issue
- [12:01] Morgaine Dinova: Only 16 FPS here, yeah. It's your bamboo :-)
- [12:02] Which Linden: Runitai has hassled me about the bamboo, I'm well aware.
- [12:02] Which Linden: But I'm too lazy to do anything about it yet. :-)
- [12:03] Morgaine Dinova: Yes, scalability is my main interest in AWG, and applies here too. I've always been interested in it, from my multiprocessing speedup research in the old days, through to scaling up ISP platforms to millions. I like the topic.
- [12:03] Which Linden: It does have a lot of depth. :-)
- [12:03] Morgaine Dinova: And potential
- [12:03] Which Linden: Yeah, you're always dealing with concurrent programming and emergent behavior
- [12:04] Morgaine Dinova: Aye
- [12:04] Which Linden: I guess it's time for me to wrap up and head in to debug some more brokenness
- [12:04] Morgaine Dinova: And you can't be sloppy with concurrent programming. It will make you pay, in blood, every time.
- [12:05] Morgaine Dinova: Hehe, time for your day job :-)
- [12:05] Which Linden: Ha ha, yeah
- [12:05] Which Linden: Of paying in blood
- [12:05] Which Linden: Incidentally, I'll be at pycon this year
- [12:05] Morgaine Dinova: Good talks scheduled?
- [12:05] Which Linden: Looks like it
- [12:05] Which Linden: Lots about asynchronous network programming
- [12:06] Morgaine Dinova: Do they video the talks for Youtube?
- [12:06] Which Linden: No idea
- [12:06] Morgaine Dinova: Hope so
- [12:06] Which Linden: Me too
- [12:06] Morgaine Dinova: When is it?
- [12:07] Which Linden: March, uh, 14th
- [12:07] Which Linden: In Chicago
- [12:07] Morgaine Dinova: Oh, big trip for you
- [12:07] Which Linden: Oh, and next week my office hour will start a half hour late
- [12:08] Morgaine Dinova: kk
- [12:08] Which Linden: I'll post that on my wiki and put up a sign here
- [12:08] Morgaine Dinova: Good idea
- [12:08] Which Linden: I enjoyed talking with you today
- [12:08] Which Linden: Hopefully we'll talk again soon!
- [12:08] Which Linden: :-)
- [12:09] Morgaine Dinova: Me too, your talks are always good, hehe
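
The "flip a bit ... and do it only once no matter how many times it was requested" idea from the discussion, together with the point about the server spending one commit on the transaction and another on tombstoning, can be illustrated with a small sketch. This is not the actual chttp code: the `Receiver` class, the table names, and the message ids are all hypothetical, and SQLite stands in for whatever database a real server would use. The sketch assumes the tombstone can be written in the same commit as the state change, so each new message costs one commit and a retried message costs none.

```python
import sqlite3


class Receiver:
    """Hypothetical receiver that applies each message id at most once."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE state (name TEXT PRIMARY KEY, bit INTEGER)")
        self.db.execute(
            "CREATE TABLE tombstones (msg_id TEXT PRIMARY KEY, result INTEGER)"
        )
        self.db.execute("INSERT INTO state VALUES ('flag', 0)")
        self.db.commit()
        self.commits = 0  # commits spent handling messages (setup not counted)

    def handle(self, msg_id):
        """Flip the bit for a new msg_id; replay the stored result for a retry."""
        row = self.db.execute(
            "SELECT result FROM tombstones WHERE msg_id = ?", (msg_id,)
        ).fetchone()
        if row is not None:
            # Duplicate delivery: answer from the tombstone, change nothing.
            return row[0]

        # The "transaction" itself: flip the bit.
        self.db.execute("UPDATE state SET bit = 1 - bit WHERE name = 'flag'")
        (bit,) = self.db.execute(
            "SELECT bit FROM state WHERE name = 'flag'"
        ).fetchone()

        # Assumption: the tombstone shares the commit with the state change,
        # so a crash leaves either both or neither on disk.
        self.db.execute("INSERT INTO tombstones VALUES (?, ?)", (msg_id, bit))
        self.db.commit()
        self.commits += 1
        return bit


receiver = Receiver()
first = receiver.handle("msg-1")   # new message: flips 0 -> 1
retry = receiver.handle("msg-1")   # same id again: no flip, no commit
second = receiver.handle("msg-2")  # new message: flips 1 -> 0
print(first, retry, second, receiver.commits)  # prints "1 1 0 2"
```

Folding the tombstone into the same commit is only one possible design choice. It trades the separate tombstoning commit mentioned in the chat for a slightly larger single write, and it says nothing about the client-side commits or about deterministic replay.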