Beta Server Office Hours/Minutes/2011-07-07

From Second Life Wiki

This meeting was held on July 07th, 2011

Agenda

  • Fairly slow week this week in regard to releases.

Updates

  • Second Life Server (main channel)
    • This has Kelly's "mono2-performance" branch.
  • BlueSteel RC Channel
    • "maint-server" project in this slot.
    • Same as last week with a merge from server trunk.


Upcoming Stuff

Interesting Stuff


Transcript

Transcript of Oskar Linden's Beta Server Office Hour for July 07th, 2011:

  • [15:02] Oskar Linden: Good Thursday to you all
  • [15:02] Oskar Linden: we'll wait a few minutes
  • [15:03] Monty Linden: someone did some redesign work here
  • [15:03] Fancy Greeter: Caleb Linden has arrived! (Or, returned?)
  • [15:03] Fancy Greeter: Monty Linden has arrived! (Or, returned?)
  • [15:03] Fancy Greeter: Coyot Linden has arrived! (Or, returned?)
  • [15:03] Fancy Greeter: Oskar Linden has arrived! (Or, returned?)
  • [15:03] Fancy Greeter: Maestro Linden has arrived! (Or, returned?)
  • [15:03] Oskar Linden: yeah I tweaked things a bit
  • [15:03] Eddi Decosta: LOL, Maestro
  • [15:03] Oskar Linden: you guys have a nice week?
  • [15:03] Flip Idlemind: Good news guize, Im here. You can all stop worrying
  • [15:03] Eddi Decosta: hi Monty and Latif
  • [15:04] Maestro Linden: Hi guys
  • [15:04] Latif Khalifa: hello :)
  • [15:04] Monty Linden: hello again
  • [15:04] Rex Cronon: greetings everybody
  • [15:04] Coyot Linden has discovered that Pandora is very helpful for Beta OH
  • [15:04] Oskar Linden: well here is the agenda
  • [15:04] Oskar Linden: [1]
  • [15:05] Oskar Linden: not a very busy week release-wise
  • [15:05] Oskar Linden: since we had a holiday on the 4th that limited our release options
  • [15:06] Coyot Linden: blame coyot for that
  • [15:06] Oskar Linden: and that
  • [15:06] Flip Idlemind: Its ok, you took a break from server deploys, like the founding fathers wanted
  • [15:06] Coyot Linden: ha
  • [15:06] Oskar Linden: so no main channel release
  • [15:07] Oskar Linden: coyot and I both had taken friday and tuesday off
  • [15:07] Oskar Linden: so the lab was a tad barren
  • [15:07] Monty Linden: unrelated but there were no disasters on Friday and Tuesday
  • [15:07] Coyot Linden: \o/
  • [15:07] Oskar Linden: well
  • [15:08] Latif Khalifa: i thought the grid was unusually smooth :P
  • [15:08] Oskar Linden: we did have increased crash rates and reports of poor performance from some of the more vocal customers
  • [15:08] Oskar Linden: if latif says it was smooth I know we broke something
  • [15:08] Coyot Linden: lol
  • [15:08] Latif Khalifa: the script engine got slower unfortunately
  • [15:09] Oskar Linden: kelly was gonna be here to talk about that
  • [15:09] Latif Khalifa: dunno if kelly would want to hear it ;)
  • [15:10] Fancy Greeter: Kelly Linden has arrived! (Or, returned?)
  • [15:10] Maestro Linden: do you juice, Latif? :)
  • [15:10] Oskar Linden: the issues reported didn't warrant a rollback
  • [15:10] Latif Khalifa: speaking of the devil :)
  • [15:10] Oskar Linden: we have a fairly strict set of criteria for handling and defining emergencies
  • [15:10] Latif Khalifa: Kelly, the script scheduler seems to have suffered a performance hit with mono2 release
  • [15:10] Kelly Linden: Filed a jira for it Latif?
  • [15:11] Latif Khalifa: so on regions with no spare time scripts get less of a time slice to run
  • [15:11] Oskar Linden: I know some of you might disagree with how and when we implement rollbacks, but the situations we saw last week didn't warrant one
  • [15:11] Lares Carter: heya everyone =]
  • [15:11] Coyot Linden: This promotes grid stability overall
  • [15:12] Rex Cronon: hey
  • [15:12] Gooden Uggla: increased crash rates means stability?
  • [15:12] Kelly Linden: There is nothing that lowers the timeslice size per script based on how many scripts are in the region.
  • [15:12] Oskar Linden: talking about main channel
  • [15:12] Oskar Linden: there are multiple issues here
  • [15:12] Coyot Linden: No, not rolling the grid at every possible sign of a problem.
  • [15:12] Oskar Linden: the rc channels with the increased crash rates were rolled back
  • [15:13] Latif Khalifa: Kelly, there are several JIRAs about how idle scripts affect spare time. But people misunderstand what it is about so perhaps I will just make a new one. In short, in a region with many scripts, a script will get about half of the running time it had before mono2.
  • [15:14] Kelly Linden: I'll need more details and a jira is the best place for it.
  • [15:14] Latif Khalifa: it would be helpful to setup some test here :)
  • [15:14] Oskar Linden: once you make the jira latif put a post in the forums and get some more input on it
  • [15:14] Kelly Linden: including how you are determining how much running time it gets
  • [15:14] Oskar Linden: what do you need latif?
  • [15:14] Maestro Linden: we could throw an empty region on an old-trunk server
  • [15:15] Maestro Linden: from the week before the mono2-performance merge
  • [15:15] Latif Khalifa: Maestro, the point is that it does not affect regions with spare time. What you need is a region with say 8000 active scripts
  • [15:15] Gooden Uggla: or, you know... regions that host events with 20 people...
  • [15:16] Eddi Decosta: i need to start a web jira, thanks to let me think of that
  • [15:16] Oskar Linden: time for an aditi house party?
  • [15:16] Coyot Linden: shweeeeeeet
  • [15:16] Oskar Linden: we can import goodens club here
  • [15:16] Flip Idlemind: Well this user group is arguably an event with 20 people
  • [15:17] Gooden Uggla: i'm a big fan of irony, so i love the fact that the "fix" to sim freezes (so we can have larger events) actually makes events perform more poorly than before
  • [15:17] Oskar Linden: pack it full of beta testers
  • [15:17] Eddi Decosta: party with Coffee?
  • [15:17] Gooden Uggla: if you like, i'll have DJ's and dancers anytime you want
  • [15:17] Gooden Uggla: anything to get this fixed
  • [15:17] Latif Khalifa: I will make a JIRA and include script that determines the time slice it's getting. It basically compares time it gets on a region with spare time.
  • [15:17] Liisa Runo: i have noticed that recently when sandbox is full of phys cubes. Scripts completely stop. I agree that physics should have priority over scripts. But maybe it could be tuned so that only physical agents get priority while physical objects don't completely override scripts. Used to be so that scripts got some slices even under the heaviest phys cube attack. But recently they get only one slice in 10 minutes.
  • [15:18] Rex Cronon: and all that just for testing sim performance? haha
  • [15:18] Kelly Linden: scripts should always get 2.5ms on full regions.
  • [15:19] Gooden Uggla: um kelly... not so much...
  • [15:19] Gooden Uggla: not anymore
  • [15:19] Gooden Uggla: put 20 people on a danceball and look
  • [15:20] Kelly Linden: Sorry I don't think I was clear. The minimum block of time we will reserve for running scripts in a single frame is 2.25ms.
  • [15:20] Eddi Decosta: Kelly, do you know if the script could be limited by land size or not yet?
  • [15:20] Kelly Linden: even if it takes more than a full frame to do just physics.
  • [15:21] Maestro Linden: Eddi, you mean script memory limits?
  • [15:21] Homeless: how do you measure how much time a script is getting?
  • [15:21] Kelly Linden: monster: it would be a medium size project to implement. Not impossible, not exactly easy
  • [15:21] Eddi Decosta: Maestro; yeah, to be sure i cant put like 46 horse on a 2048sqm :p
  • [15:22] Leonel Iceghost: that would be nice
  • [15:22] Kelly Linden: oh, memory is different, thought you were talking script time.
  • [15:22] Kelly Linden: Memory is not that hard, but comes up to design needs as continuing what was designed previously seems like a bad idea for living next to the pending mesh resource system.
  • [15:23] Gooden Uggla: kelly... we understand this is going to be a long painful process, mono will probably never be fixed well enough to be considered a success in SL, but performance is so crappy now that everyone has noticed *something* that isn't working correctly
  • [15:24] Gooden Uggla: literally everyone
  • [15:24] Kelly Linden: gooden: please file or comment on jiras like SVC-7079 or SVC-7084
  • [15:24] Flame of Jira: [#SVC-7079
  • [15:24] Flame of Jira: [#SVC-7084
  • [15:24] Kelly Linden: Some of the people with the biggest issues that persisted through this week were having issues unrelated to the release that needed support help.
  • [15:25] Latif Khalifa: Maestro, in order to demonstrate the problem that I'm talking about it would be useful to have a region with pre-mono2 if it is possible? Then we just copy the default script in a bunch of objects, like 8000 of default scripts and compare the script performance between the two. All other scripts except the benchmark script don't have to do anything
  • [15:25] Kelly Linden: Latif: email me your benchmark script please, or hand it to me.
  • [15:25] Fancy Greeter: Caleb Linden has arrived! (Or, returned?)
  • [15:26] Latif Khalifa: OK
  • [15:26] Latif Khalifa: SCR-120
  • [15:26] Flame of Jira: [#SCR-120
  • [15:27] Maestro Linden: well, that seems fairly easy to setup
  • [15:27] Maestro Linden: I'd rather have a controlled environment rather than a random region with scripts doing random unknown stuff
  • [15:27] Latif Khalifa: 8000 scripts that do nothing will demonstrate nicely
  • [15:28] Maestro Linden: yeah sounds good
  • [15:28] Latif Khalifa: + 1 running script with test
  • [15:28] Oskar Linden: I can give you permissions out on one of the oatmeal latif
  • [15:28] Oskar Linden: msg me after this and we'll work something out
  • [15:29] Latif Khalifa: ok
  • [15:29] Fancy Greeter: Caleb Linden has arrived! (Or, returned?)
  • [15:29] Rex Cronon: having a controlled environment kind of takes away from the realism
  • [15:29] Oskar Linden: it can
  • [15:30] Oskar Linden: but we have to have controlled environments to narrow down issues
  • [15:30] Oskar Linden: we have to remove as many variables as possible and create the simplest of repro environments to be effective
  • [15:30] Latif Khalifa: well if it shows that on mono2 1 running script gets about half the time it got on pre-mono2 on a region with 8000 idle scripts that would show the problem nicely
  • [15:30] Oskar Linden: i hope it will
  • [15:30] Maestro Linden: well, there are some other tests which use 'real' regions as well, and look at things like SimFPS and rough script performance
  • [15:32] Kelly Linden: Lets also be sure to test on the DRTSIM-71 code from this week.
  • [15:32] Oskar Linden: good idea
  • [15:33] Maestro Linden: (DRTSIM-71 just includes kelly's fix for SVC-7079 )
  • [15:33] Flame of Jira: [#SVC-7079
  • [15:33] Oskar Linden: alright
  • [15:33] Oskar Linden: next up this week was the rollback of letigre and magnum
  • [15:33] Oskar Linden: that went pretty smooth
  • [15:33] Object: Hello, Avatar!
  • [15:33] Oskar Linden: it's hard to have events like that take place over long weekends
  • [15:34] Oskar Linden: we also don't notice larger crashing trends for at least a few days after a release
  • [15:35] Oskar Linden: by the time we knew letigre and magnum were crashing at a larger rate no one was around to see it except support
  • [15:35] Kallista Destiny: Poor support
  • [15:35] Gooden Uggla: is anyone going to look into the issues with meeroos?
  • [15:35] Oskar Linden: one of the fun paradigms of having a 24/7 virtual world hitting a regular work week with holidays and 9-5 employees
  • [15:35] Homeless: maybe releases should be avoided before long weekends
  • [15:36] Coyot Linden: YES
  • [15:36] Oskar Linden: yes and no homeless
  • [15:36] Kallista Destiny: lols
  • [15:36] Monty Linden: 9-5 employees? Not in boston....
  • [15:37] Object: Hello, Avatar!
  • [15:37] Oskar Linden: now that monty has spoken up
  • [15:37] Oskar Linden: he has something to discuss
  • [15:37] Oskar Linden: and it involves this graph: [2]
  • [15:37] Monty Linden: *cough*
  • [15:38] Monty Linden: This is about the magical svc-6760 that nobody can look at.
  • [15:38] Flame of Jira: Login Required https://jira.secondlife.com/browse/SVC-6760
  • [15:38] Object: Hello, Avatar!
  • [15:38] Kallista Destiny: sighs
  • [15:38] Monty Linden: What I'm changing is how we deploy web services on the simhosts.
  • [15:38] Monty Linden: The goal being to *stop* that kind of behavior seen in the .png
  • [15:39] Monty Linden: What you experience in the viewer is the middle chart.
  • [15:39] Gooden Uggla: lovely, are trans history going to get fixed?
  • [15:39] Monty Linden: The Cap is specifically the GetTexture cap for HTTP Textures and
  • [15:39] Monty Linden: the mean response times run up to 30 seconds and beyond
  • [15:39] Latif Khalifa: wooptie, mono scheduler benchmark in LSL :P
  • [15:40] Monty Linden: Maestro *just* finished QA and we're going to put this on Aditi in some
  • [15:40] Monty Linden: degree asap.
  • [15:40] Monty Linden: I don't know if anyone is able to recreate the kind of asset loading, TP, etc.
  • [15:41] Monty Linden: problems experienced on Agni here on Aditi but if so, we'd like to see your
  • [15:41] Monty Linden: regions here. Massive texture downloads, etc.
  • [15:41] Oskar Linden: so what does that graph show?
  • [15:41] Kallista Destiny: Need a pretty complex region
  • [15:41] Latif Khalifa: where is maestro ll
  • [15:41] Monty Linden: The middle graph shows minimum, mean and max GetTexture response time
  • [15:42] Monty Linden: in 30-second intervals during 'an event'.
  • [15:42] Kallista Destiny: Sardar is on Aditi, it's moderately complex graphically especially the market
  • [15:42] Monty Linden: Normal times average well under 1 second, this goes into chaotic failure and runs
  • [15:42] Monty Linden: over 30 seconds.
  • [15:42] Rex Cronon: so, how come i still see pixelated textures minutes later?
  • [15:42] Oskar Linden: so what does that graph show?
  • [15:42] Monty Linden: Top graph reflects the underlying and internal service.
  • [15:43] Monty Linden: The delays that accumulate between the Cap request which you see and what it takes to process.
  • [15:43] Monty Linden: Shows that the trouble lies between the two services.
  • [15:43] Monty Linden: The bottom graph shows discrete problems in our service stack.
  • [15:43] Monty Linden: Throttles are the Caps throttles which viewers do experience
  • [15:44] Monty Linden: Timeouts are internal problems reaching services.
  • [15:44] Monty Linden: Those are my 'canary in the mine'.
  • [15:44] Kallista Destiny: It appears that they happen after the first delay?
  • [15:45] Monty Linden: Not exactly. Chart plots request time.
  • [15:45] Monty Linden: So you need another 30 seconds before you get a timeout anyway
  • [15:45] Monty Linden: But, yeah, there is a transitional phase.
  • [15:45] Kallista Destiny: Oh timeout is noted AFTER it occurs
  • [15:46] Monty Linden: I actually think that entire prolog section is showing symptoms of going
  • [15:46] Monty Linden: chaotic. At the very end you see the services come back into stable
  • [15:46] Monty Linden: configuration and there are few/no outliers.
  • [15:46] Monty Linden: Services become very predictable.
  • [15:46] Monty Linden: With this said, there is more work to do.
  • [15:47] Homeless: What action would I be doing when this occurs? confused.
  • [15:47] Monty Linden: Viewer is getting attention and you can follow *that* in
  • [15:47] Monty Linden: VWR-25145
  • [15:47] Flame of Jira: [#VWR-25145
  • [15:47] Monty Linden: yeah, that one.
  • [15:47] Kelly Linden: oh god. the llabs benchmark
  • [15:47] Monty Linden: homeless: anything - you could be the victim of someone else
  • [15:48] Monty Linden: I'm using GetTexture as it has an interesting structure.
  • [15:48] Monty Linden: But TP's, Mesh stuff, inventory, people information
  • [15:48] Monty Linden: so much goes through Caps
  • [15:48] Monty Linden: all can be affected
  • [15:49] Monty Linden: that's it from me, did I miss any q's?
  • [15:49] Oskar Linden: so what does that graph show?
  • [15:49] Oskar Linden: X-D
  • [15:49] Monty Linden: IM'd a smack to you
  • [15:49] Homeless: oh ok.. wandering around aimlessly would do it
  • [15:49] Oskar Linden: smack received
  • [15:49] Gooden Uggla: ok, now that you can measure it, how do you propose to fix it?
  • [15:50] Oskar Linden: next week BlueSteel is likely getting promoted and that has fixes for the immortal prims
  • [15:50] Monty Linden: I've changed our stack structure and that's what will show up here asap
  • [15:50] Gooden Uggla: ok
  • [15:50] Monty Linden: The short answer is that we run, depending on how you count, 5-14 tiers of
  • [15:50] Monty Linden: services and we run them in too few physical tiers.
  • [15:50] Monty Linden: I'm increasing the physical tiers.
  • [15:51] Monty Linden: More than that is *sekrit*
  • [15:51] Monty Linden:  :)
  • [15:51] Moundsa Mayo: So trading off a little more latency between tiers for much less contention within processors.
  • [15:51] Monty Linden: It would actually be a really nice technical Blog post
  • [15:52] Monty Linden: Not simple contention, really. Think 'semaphore'
  • [15:52] Coyot Linden: p(v)
  • [15:52] Gooden Uggla: we'll all be interested to see how this works, it's been an issue for years
  • [15:52] Python Morales: oops, missed the meeting guess :p
  • [15:52] Moundsa Mayo: Tiers of contention, but I get your point. B^)
  • [15:53] Monty Linden: So, sign up and beat up the grid.
  • [15:53] Moundsa Mayo: Any actual deadly embraces involved currently?
  • [15:53] Monty Linden: No
  • [15:53] Monty Linden: But there's a thing I haven't looked at yet
  • [15:53] Monty Linden: and I don't like it.
  • [15:54] Moundsa Mayo: Further, deponent sayeth not B^D
  • [15:54] Monty Linden: My hope is that this is a major part of the "blame the load" excuse.
  • [15:54] Kallista Destiny: Yes? enquiring minds want to know.
  • [15:54] Monty Linden: From here, we can get better and better.
  • [15:54] Gooden Uggla: good to hear
  • [15:55] Oskar Linden: maestro, coyot, anything else?
  • [15:55] Maestro Linden: nothing comes to mind
  • [15:55] Coyot Linden: We will be arrested for transporting gulls over sedate lions for immortal porpoises.
  • [15:55] Coyot Linden: I mean, immortal prims
  • [15:56] Kallista Destiny: Boooooo, no Shaggy dong stories
  • [15:56] Coyot Linden: ha
  • [15:56] Kallista Destiny: dog
  • [15:56] Coyot Linden: shaggy coyot, maybe
  • [15:56] Rex Cronon: got disconnected
  • [15:56] Homeless: I think Coyot is becoming over cooked.
  • [15:56] Marigold Devin: Doesn't smell like chicken
  • [15:56] Kallista Destiny: Or I' send you for a ried in the furry with the syringe on top.
  • [15:56] Liisa Runo: [15:57
  • [15:56] Coyot Linden: coyot - the other white meat
  • [15:57] Coyot Linden: More seriously, I don't have anything else
  • [15:57] Kallista Destiny: Silly Rabbi, kicks are for trids.
  • [15:57] Rex Cronon: liisa. it doesn't have anything to do with that. is just an old experiment
  • [15:58] Leonel Iceghost: now if you say it is possible, to limit time given to scripts per parcel would be amazing..
  • [15:58] Gooden Uggla: does anyone know why Meeroos started flying to pieces after the last roll?
  • [15:58] Gooden Uggla: they were going to be the big fad after horses
  • [15:58] Oskar Linden: flying to pieces?
  • [15:58] Marigold Devin: wishful thinking
  • [15:58] Oskar Linden: what's the jira for meeroo dismemberment?
  • [15:59] Kallista Destiny: Lack of structural integrity?
  • [15:59] Gooden Uggla: i'm not terribly fond of breedables, but they've become a big chunk of the economy
  • [15:59] Moundsa Mayo: Population explosion?
  • [15:59] Marigold Devin: hate them
  • [15:59] ac14 Hutson: same
  • [16:00] Oskar Linden: If I had an L$ for every resident that named a meeroo after me in the hopes it would breed faster.....
  • [16:00] Kelly Linden: meeroos are probably affected by the sim performance issues.
  • [16:00] Marigold Devin: You'd have 1.25 L$
  • [16:00] Oskar Linden: 2 actually
  • [16:00] Oskar Linden:  :-p
  • [16:00] Kallista Destiny: Well thank you for the meeting
  • [16:00] Gooden Uggla: yes kelly, it'
  • [16:00] Gooden Uggla: it's mostly an aesthetic thing, they function ok, they just stopped looking like meeroos
  • [16:01] Kelly Linden: they are taking longer to complete their animation frame than they used to, as far as I can tell
  • [16:01] Monty Linden wants to see these sluggish meeroos....
  • [16:02] Kelly Linden: also, not all meeroos are affected, I'm guessing just on heavily loaded regions
  • [16:02] Coyot Linden: s/sluggish/slugs in/
  • [16:02] Rex Cronon: g2g tc all
  • [16:02] Marigold Devin waves bye, and thanks to all. Great meeting.
  • [16:03] Monty Linden: ta!
  • [16:03] ac14 Hutson: see ya
  • [16:03] Moundsa Mayo: Thanks for your time, Oskar, Maestro, Monty, Kelly, Coyot, Caleb!
  • [16:03] Maestro Linden: thanks for coming everybody!
  • [16:03] Oskar Linden: l8r)
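
Note on the graph discussed in the transcript: Monty describes the middle chart as the minimum, mean and max GetTexture response time in 30-second intervals, plotted by request time. The following is only a rough sketch of that kind of aggregation, not the tooling actually used at the Lab. The function name summarize_intervals, the (request_time, response_time) sample format and the sample values are all assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize_intervals(samples, interval=30.0):
    """Bucket (request_time, response_time) pairs into fixed-width windows.

    Each sample is assigned to a window by its request time (in seconds),
    mirroring the remark that the chart plots request time.
    Returns a list of (window_start, min, mean, max), sorted by window_start.
    """
    buckets = defaultdict(list)
    for request_time, response_time in samples:
        window_start = (request_time // interval) * interval
        buckets[window_start].append(response_time)
    return [
        (start, min(times), mean(times), max(times))
        for start, times in sorted(buckets.items())
    ]


# Hypothetical samples: (seconds since start, response time in seconds).
samples = [(0.0, 0.2), (12.0, 0.4), (31.0, 0.3), (45.0, 31.0), (59.0, 14.0)]

for start, lo, avg, hi in summarize_intervals(samples):
    print(f"window {start:6.1f}s  min={lo:.2f}  mean={avg:.2f}  max={hi:.2f}")
```

In a summary like this, a window whose max and mean climb toward the 30-second mark while the min stays low would correspond to the "chaotic" phase described in the meeting, and windows where min, mean and max sit close together would correspond to the stable phase with few or no outliers.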