Difference between revisions of "Talk:Project Motivation"

Revision as of 03:51, 29 September 2007

Scaling for events

I would like to add more scary numbers to the main namespace here.

If N people are interested in an event today, and the population grows tomorrow by a factor of M, then tomorrow N*M people will be interested in that event, assuming unchanging population demographics.

Every live music event by the top few SL musicians maxes out their event sim: let's assume that this means 100 people, although the demand is undoubtedly higher already but not satisfied. SL currently has just under 10m registered residents. If the scary number for total population size is 2 billion, then the scary number for event interest is 100*2000/10 = 20,000 to an event region, given unchanging demographics.
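The estimate above can be reproduced as a quick calculation. This is only a sketch: the function and parameter names are illustrative, and it assumes, as the text does, that event interest scales linearly with population (unchanging demographics).

```python
def projected_event_interest(current_attendance, current_population, target_population):
    """Scale event attendance linearly with population (unchanging demographics)."""
    growth_factor = target_population / current_population  # the factor M
    return current_attendance * growth_factor                # N * M


# Figures from the discussion: 100 people per event sim today, roughly
# 10 million registered residents, and a "scary number" of 2 billion.
interest = projected_event_interest(100, 10_000_000, 2_000_000_000)
print(f"{interest:,.0f} people interested per event")
```

Any correction for changing demographics would enter as an extra multiplier on the result.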

Of course, worldwide we do not have an unchanging demographic, but no matter by how much you want to reduce this figure as a result of this, the answer is still colossal. And bear in mind that there is much uniformity in eastern populations, just as there is in the west, so event interest within each cultural domain will be huge. And of course many events are cross-cultural.

Which brings me to the issue that nobody seems to want to tackle: the new architecture needs scalability for events.

While I recognize fully that there are monumental hurdles in the way of achieving this, it is just an engineering problem with a number of known partial solutions and amenable to tradeoffs. Worse, not addressing it will leave us exactly where we are today, with zero scalability for events. And, very unhappily, SL will become the virtual world where almost everybody is barred from their favorite event. 19,900 people out of every 20,000 will have to stay at home. That's not the future system we want to build, in my view.

Please add at least 10,000 per region to the scary numbers, to focus the mind. --Morgaine Dinova 21:07, 24 September 2007 (PDT)
I now see that this is an extreme underestimate, by 2+ orders of magnitude: Use Case. -- Morgaine Dinova 05:57, 26 September 2007 (PDT)

I mentioned this at Zero's Office Hours today, but no traffic yet, so added it myself. --Morgaine Dinova 17:42, 25 September 2007 (PDT)

Zha's comments on events

The event discussion came up at the AWG kickoff, and comes up from time to time in Zero's office hours. At the heart of the event discussion is the desire to host very large events inside the SecondLife environment. This raises the question: what will 200, 500, 1000, 10,000 or 20,000 users be able to do in a single contiguous portion of SecondLife space? Assuming for a moment that we could manage the simulator-side space, what would a client see? At 100 avatars in a sim, my current screen is pretty much solid avatars with a cloud of tags floating over them, all in constant motion. (Granted we're also all in concrete, in the current sims, but we'll assume that can be fixed.)

Somewhere north of 500 avatars, I can't see how I could render more than a fraction of them, nor could I interact with more than those I could see right near me. So, and I am quite serious about this question, what are we attempting to achieve? If the answer is "let us all share the stream", I'll point out that the streams are orthogonal to SecondLife. If it is interactivity, in the current style of SecondLife, I'd like to see a use case for how we could produce an interesting user experience with the number of avatars I listed above in earshot. How fast will the chat window scroll? How fast will people's "nearby" windows update? How many avatars do we expect our scripted objects to be able to sense, and how will we even see most of the avatars?

Beyond the user experience, we also should look at some hard technical questions. How many Kbits per second of updates are needed to support each of these size points? Can we get that from current networking hardware? Look at these questions for each link: Simulator to Client, Simulator to adjacent Simulators, and Simulator to utilities and agent domains.
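A back-of-envelope way to approach the bandwidth question is sketched below. The update rate and update size are hypothetical placeholders, not measured SecondLife figures, and the model assumes the worst case in which every client receives every other avatar's updates.

```python
UPDATES_PER_SECOND = 10  # assumed update rate per avatar (hypothetical)
BYTES_PER_UPDATE = 50    # assumed size of one avatar update (hypothetical)


def client_kbits_per_second(avatars):
    """Downstream rate for one client that receives every other avatar's updates."""
    others = avatars - 1
    return others * UPDATES_PER_SECOND * BYTES_PER_UPDATE * 8 / 1000


def simulator_kbits_per_second(avatars):
    """Aggregate simulator-to-client rate when every client receives everything."""
    return avatars * client_kbits_per_second(avatars)


# The size points listed in the discussion.
for size in (200, 500, 1000, 10_000, 20_000):
    per_client = client_kbits_per_second(size)
    total = simulator_kbits_per_second(size)
    print(f"{size:>6} avatars: {per_client:>10,.0f} Kbit/s per client, {total:>16,.0f} Kbit/s total")
```

Under these assumptions the per-client rate grows linearly with the number of avatars, while the simulator's total outbound rate grows with the square of it.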

This isn't a simple "let's quash the idea" post, far from it. But if we are asserting this as a desire, let's actually produce some use cases which describe what we are trying to enable. "20,000 in an event" isn't a requirement I can design against. Give us some use cases which describe the actual user experiences that we wish to enable. Look hard at questions like "How will chat look, how will visibility work, how would scripting behave?" Capture these in use cases.

I will suggest, for example, that some use cases can be solved by better inter-region behavior. Others would only be solved by attempting to limit the interactions between avatars in significant ways. Serious discussion, backed by use cases, would be invaluable here.

- User:Zha Ewry 9/25/2007

A few answers, then Use Cases etc

Zha, let me start off with a serious but not particularly helpful response in the form of a rebuttal: we have 20k-100k+ people events in RL taking place every weekend in thousands of places worldwide, and it seems to work just fine. So, we know that it can work in the physical world, and that it is a very human thing to do. And the 3D virtual world is to a very large extent a recreation of physical space, but with added capabilities, so this is both desirable for people and natural in a virtual recreation of physical space. The only problem is that it's hard to do, and it's not on a well-beaten trail.

Seen in that light, the question I want to entertain is "How can we achieve this?". I don't feel the need to defend the desire for it, and anyone who follows the SL live music scene and is hitting their head daily against lack of scalability for events would I'm sure express that same feeling just as forcefully. It's a bad state of affairs, and it needs fixing, particularly in the context of an ambitious project that addresses all other areas of scalability. Let's just do it. (Bit by bit of course.)

Now to answer your questions specifically:

What will 200-20,000 users be able to do in a single contiguous portion of SL space?

Exactly what they do in RL. Sit in seats in stadia. Gather in crowds at rallies and concerts. Roam in the park in smaller groups. Join hands in a 2000-person line dance. Sky-dive in 500-person snowflake patterns. I don't know, anything. That's up to them. I really don't think we need use cases for this. Just open the curtains or turn on the telly for use cases. :-) The first candidate is among us right now though: SL live music needs 500-person capability today. (Here's a use case though, with utterly massive scaling and potential: Interactive TV.)

What would a client see? ... Somewhere north of 500 avatars, I can't see how I could render more than a fraction of them.

First of all, let's not be hung up on the current client. This will change drastically, and in any case there will be umpteen different clients, each suitable for a different purpose. (It's one of the reasons why I'm trying to partition the client into graphics + front-end.) No doubt we will be creating crowd-capable clients as well ... if region scalability allows a crowd to gather in the first place.

So, the client would see exactly what a person sees in a crowd in RL, but much more flexibly. Numerous options for this have been explored in the SL forums, from simple reduced detail rendering, rapid-falloff CLOD, to performance-driven culling, and selective viewing. Unlike in RL, you could select to render only those in your group, or only furries, or only pretty girls (OK, that's not much of an object reduction :P). Many people have suggested that ghost outlines would suit them just fine, and even more want a GuildWars-type "Observer Mode" in which their avatars are not rendered at all for others to see, which would reduce the problem substantially if it proved popular. So, there are lots of options.

nor could I interact with more than those I could see right near me.

And nor would you in RL, unless you're shouting, so this works pretty accurately. It would however be much easier to interact with your friends here since you could make everyone else disappear. Fortunately sound volume already falls off with distance in SL so that part of interaction is covered. Reading vicinity text scroll is clearly not viable in crowds, so this would need to be filtered in some manner chosen by each person, such as by group membership or friends only, or by distance just like sound. None of this is hard.
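The per-person chat filtering described above could look roughly like this. It is a minimal sketch: `ChatMessage`, `filter_chat` and their fields are hypothetical names for illustration, not part of any existing SL client or protocol.

```python
import math
from dataclasses import dataclass


@dataclass
class ChatMessage:
    sender: str
    position: tuple  # (x, y, z) of the sender, in metres
    text: str


def filter_chat(messages, listener_position, friends=frozenset(), max_distance=None):
    """Keep messages from friends, plus any message sent within max_distance."""
    kept = []
    for msg in messages:
        if msg.sender in friends:
            kept.append(msg)
        elif max_distance is not None and math.dist(msg.position, listener_position) <= max_distance:
            kept.append(msg)
    return kept


messages = [
    ChatMessage("Alice", (1, 0, 0), "hi"),
    ChatMessage("Bob", (50, 0, 0), "hello"),
    ChatMessage("Carol", (200, 0, 0), "over here"),
]
# Alice is nearby and Carol is a friend; Bob is neither, so he is filtered out.
visible = filter_chat(messages, (0, 0, 0), friends={"Carol"}, max_distance=20)
print([m.sender for m in visible])
```

Filtering by group membership would work the same way as the friends check, with a group roster in place of the friends set.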

How many avatars do we expect our scripted objects to be able to sense?

This is actually a much more interesting question than the others. Even with the huge improvement that Mono will bring about, it's going to need more than faster LSL processing to do a good job in this area. It's going to require substantial improvements in sensing functions, and probably some language assist in the form of proper O(1) indexable arrays (at long last). But then, land owners have been crying out for better sensing functions for ages. It's time for it.

I think I've answered your general questions in debating style Zha (the questions were in that style too), but the detail has to be tackled far more precisely of course, and I'm ready for that. At this stage though, I'm looking merely for acceptance of the basic premise, particularly from Zero. I don't want to be on the defensive, as if this single area of scalability were somehow taboo. And that's how it has seemed over these past 3-4 years. :-) --Morgaine Dinova 19:43, 25 September 2007 (PDT)

Categorizing event capabilities

I'm still not sure I understand how this would work, but I think asking the questions and making suggestions is sure bringing up some interesting ideas. It seems (to me at least) that there are 3 segments.

Unlimited interaction

Everyone at the event can interact with everything they see, and everyone can see them and the effects of their actions. The issue here is scalability of visually displaying everyone and their interactions. Something along the lines of duplicate but separate locations could allow more people here (I'm thinking along the lines of chat rooms), though this may not work for all cases.

Limited interaction

Everyone can interact with some of what they see, either by filtering what they see, or actually limiting the interaction capabilities (touch, speak, chat). This seems like the interactive TV case. Here, everyone can watch the same event (a speaker, a concert, objects on display) but not everyone can speak or chat, or appear on stage (except maybe to a small group's view), or change the environment (move objects).

Highly limited interaction

Everyone is able to view, but can interact only to an extremely limited extent. This is basically a view-only capability, with maybe the exception of the camera view and angle. All other interaction would occur not within the event but outside it, like talking with friends on IM channels and such.

Now it would seem to me that each event could have users in all 3 categories at the same event, but there would be a limited number of users in each category allowed at the event (increasing limit going down the list). The best thing would probably be the ability for events to support all kinds, but for event creators to choose the categories they want to allow and who is allowed within those categories. --Anthony Reisman 08:25, 28 September 2007 (PDT)
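One way to picture the per-category limits suggested above is sketched here. The `Interaction` and `Event` names and all of the numbers are hypothetical, chosen only to illustrate the idea of an event creator choosing which categories to offer and how many users each may hold.

```python
from enum import Enum


class Interaction(Enum):
    UNLIMITED = "unlimited"
    LIMITED = "limited"
    HIGHLY_LIMITED = "highly limited"


class Event:
    def __init__(self, limits):
        # limits maps each allowed category to its maximum number of users;
        # categories missing from the mapping are not offered at this event.
        self.limits = dict(limits)
        self.attendees = {category: set() for category in self.limits}

    def admit(self, user, category):
        """Admit user in the given category if it is offered and not yet full."""
        if category not in self.limits:
            return False
        if len(self.attendees[category]) >= self.limits[category]:
            return False
        self.attendees[category].add(user)
        return True


# A concert offering a small fully interactive group and a large viewing audience.
concert = Event({Interaction.UNLIMITED: 2, Interaction.HIGHLY_LIMITED: 10_000})
print(concert.admit("performer", Interaction.UNLIMITED))    # category offered, has room
print(concert.admit("fan", Interaction.LIMITED))            # category not offered here
print(concert.admit("viewer", Interaction.HIGHLY_LIMITED))  # category offered, has room
```

Deciding who is allowed into each category (for example by group or invitation) would be an additional check inside `admit`.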

Re above interaction analysis

Anthony, you've made a capability breakdown by extent of perceived human interaction. This is fine, but it doesn't relate well to the analysis of scalability for events that I was making, and what's more it's actually incorrect from the technical standpoint too. The various SecondlifeTube-type use cases are predicated on the complete inability of one observer to affect the world as seen by any other observer. Because you see, if one observer can do so, then potentially the millions of other observers in this use case could as well, so every client would have to deal with processing everyone else's visual changes, and the result of that is total meltdown. Or in other words, the unlimited scalability for events of this use case would be broken.

In summary, your second item is incorrect in stating that Limited interaction is the Interactive TV use case, because both that and the Non-Interactive TV use case have zero interaction with the region and between the clients. That is the basis of their massive scalability for events. Navigating your camera doesn't qualify as an interaction at all, because it affects only your visual and isn't detectable by anyone else. Although it's a matter of interest to the user whether she can move her camera or not, it has no impact on the client's scalability for events. And scalability was the only subject relevant to this particular namespace page.

I do like the idea you've started here, because classifying the various extents of scalability for events on the basis of the interaction between the clients can show us the possibilities available very clearly. The 3 fundamental extents would be the two extremes of the spectrum and one in the middle:

Full interaction between clients

This is where we are currently. Scalability is not just "poor", it is nil, because with just a small increase in the size of events, clients collapse. We hit the limits 3+ years ago, and the population has been growing semi-exponentially ever since, so perhaps we should call this negative scalability for events. ;-)

Limited interaction between clients

Cutting down the extent to which our clients process other clients' visuals offers a substantial avenue towards achieving client scalability for events (mass gatherings) while still being able to interact with others, particularly if the client doesn't just ignore unwanted objects but can request that the server end does not send them. This approach is quite likely to scale clients to many thousands of participants present at a given event, relatively easily.

Zero interaction between clients

This is the SecondlifeTube "umbrella" use case, arbitrarily scalable at the client end to millions and beyond (hence Y..Tube). And despite the exceptional scalability of this use case, it is quite simple to implement, much more so than the intermediate case.
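The difference between the three extents can be illustrated by counting how many other clients' visual changes a single client has to process. This is a simplified sketch: the model names and the `interest_cap` value are assumptions made for illustration, not figures from any implementation.

```python
def peers_processed_per_client(participants, model, interest_cap=50):
    """Number of other clients whose visual changes one client must process."""
    others = participants - 1
    if model == "full":
        return others                     # every client sees everyone else
    if model == "limited":
        return min(interest_cap, others)  # client (or server) filters the rest
    if model == "zero":
        return 0                          # observers never affect each other
    raise ValueError(f"unknown model: {model}")


for n in (100, 20_000, 1_000_000):
    row = {m: peers_processed_per_client(n, m) for m in ("full", "limited", "zero")}
    print(n, row)
```

Only the "full" case grows with the size of the event, which is why it is the one that collapses first.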

Perhaps we can collapse your and my classifications into a single one? (But the topic here *is* scalability.) --Morgaine Dinova 11:11, 28 September 2007 (PDT)

Use cases

For reference, use cases for scalability for events will be integrated into Use_Cases. --Morgaine Dinova 05:20, 26 September 2007 (PDT)

First use case now added: Guild Wars "Observer Mode" -- Interactive TV.

This is the interactive TV scenario, so it's pretty likely to become the killer app for virtual worlds. :-) In terms of scalability for events, it bypasses almost all of the client-side difficulties, since there can be a huge number of observer presences in the region but their avatars are not rendered, despite each one having a position in space for personal camera control. --Morgaine Dinova 05:14, 26 September 2007 (PDT)

This use case makes it clear that 20,000 per region is a ridiculously low estimate, since a few million people watch popular TV shows every day even in a small country. Given SL2's projected massive worldwide audience, an event viewable in this manner could attract dozens of millions of interactive observers.
Philip was on UK television yesterday, envisaging an SL with near-RealLife quality graphics. It doesn't require much imagination to see that this could turn "Interactive TV" from a use case to the biggest thing ever. :P --Morgaine Dinova 06:32, 26 September 2007 (PDT)

I added an SL-specific use case developed from the Guild Wars one: Interactive TV. --Morgaine Dinova 15:10, 26 September 2007 (PDT)

Tillie Ariantho suggested a passive viewing version of the above, so I also added Non-Interactive TV. This will probably appeal to old-fashioned machinima or event directors for whom giving the user camera control is anathema. ;-) --Morgaine Dinova 15:10, 26 September 2007 (PDT)

Added the umbrella label SecondlifeTube to cover all massively scalable avatar-free use cases. --Morgaine Dinova 00:03, 28 September 2007 (PDT)

Scary numbers

Each of the areas of scalability in Project_Motivation actually needs its own assessment. It's not really as simple as just deciding on an arbitrary fraction of world population. Scalability of a region derives from size of user base, but what about the other 3 numbers?

Also, each area introduces its own particular problems and contains its own set of possible solutions, which should be listed. One of these should then link to the chosen implementation, with appropriate engineering reasoning given. All of this seems to be missing at the moment, and we're just presented with a fait accompli, which is not engineering. This needs some improvement. --Morgaine Dinova 03:44, 26 September 2007 (PDT)

@@ Line 71: / Line 71: @@
 * Anthony, you've made a capability breakdown by extent of perceived human interaction.  This is fine, but it doesn't relate well to the analysis of scalability for events that I was making, and what's more it's actually incorrect from the technical standpoint too.  The various [[Use_Cases#SecondlifeTube|SecondlifeTube]]-type use cases are predicated on the ''complete inability'' of one observer to affect the world as seen by any other observer.  Because you see, if one observer can do so, then potentially the millions of other observers in this use case could as well, so every client would have to deal with processing everyone else's visual changes, and the result of that is total meltdown.  Or in other words, the unlimited scalability for events of this use case would be broken.
-* In summary, your second item is incorrect in stating that ''Limited interaction'' is the [[Use_Cases#Interactive TV|Interactive TV]] use case, because both that and the [[Use_Cases#Non-Interactive TV|Non-Interactive TV]] use case have '''zero interaction''' with the region and between the clients.  That is the basis of their massive scalability for events.  Navigating your camera doesn't qualify as an interaction at all, because it affects only your visual and isn't detectable by anyone else.  Although it's a matter of interest to the user whether she can move her camera or not, it has no impact on scalability for events. And scalability was the only subject relevant to this particular namespace page.
+* In summary, your second item is incorrect in stating that ''Limited interaction'' is the [[Use_Cases#Interactive TV|Interactive TV]] use case, because both that and the [[Use_Cases#Non-Interactive TV|Non-Interactive TV]] use case have '''zero interaction''' with the region and between the clients.  That is the basis of their massive scalability for events.  Navigating your camera doesn't qualify as an interaction at all, because it affects only your visual and isn't detectable by anyone else.  Although it's a matter of interest to the user whether she can move her camera or not, it has no impact on the client's scalability for events. And scalability was the only subject relevant to this particular namespace page.
 * I do like the idea you've started here, because classifying the various extents of scalability for events on the basis of the interaction between the clients can show us the possibilities available very clearly. The 3 fundamental extents would be the two extremes of the spectrum and one in the middle: