Talk:Project Motivation

First paragraph

Shouldn't the first paragraph explain which project this is the motivation of? Or should there be a link at the top that says "part of ..."?

I just came to this page accidentally, when searching for something else. I know what it is about, but for someone who doesn't, it isn't clear. Also, the title makes you think this is a project named Motivation and not the motivation for a project. The same goes for other pages, like the use cases. Frans Charming 05:23, 7 October 2007 (PDT)

There could be some confusion, yes. I'm surprised that there isn't a link to the parent in the title of each of these main namespace pages that are children of Architecture_Working_Group. However, note that this is not possible in the wiki generally, since many pages can point to the same child as a top-level entry. In this particular case it would probably work. As for changing all the children's names to make the AWG: namespace explicit ... that would break an immense number of links! Unless there is a tool to automatically change all references suitably, it's best avoided. And it wouldn't fix the offsite references and bookmarks that already exist anyway. :-) There are probably other solutions though. Anyone? --Morgaine Dinova 08:55, 8 October 2007 (PDT)

Scaling for events

I would like to add more scary numbers to the main namespace here.

If N people are interested in an event today, and the population grows tomorrow by a factor of M, then tomorrow N*M people will be interested in that event, assuming unchanging population demographics.

Every live music event by the top few SL musicians maxes out their event sim: let's assume that this means 100 people, although the demand is undoubtedly higher already but not satisfied. SL currently has just under 10m registered residents. If the scary number for total population size is 2 billion, then the scary number for event interest is 100*2000/10 = 20,000 to an event region, given unchanging demographics. (*)

Even scarier: note that the above figures derive from total population growth, a conventional parameter but not necessarily the most relevant. If one were to extrapolate on the basis of concurrent users (which I'll round up to 50k currently), then the scary number for event interest now becomes an audience of 100*50000/50 = 100,000 at the popular event, or a scale-up of 1000. That's some severe event population pressure, and reason for much discontent if ignored. The initial projection of 20,000 no longer seems so bad.
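The two projections above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the post: the function name and parameter names are made up for illustration, and the figures are the ones quoted above (100 attendees today, 10 million residents growing to 2 billion, and 50k concurrent users scaled up by a factor of 1000).

```python
def projected_event_interest(current_audience, current_base, projected_base):
    """Scale today's event audience by the growth factor of the user base.

    Assumes unchanging demographics, as in the discussion above.
    """
    return current_audience * projected_base / current_base


# Registered residents, in millions: 10 today, 2000 (2 billion) projected.
by_population = projected_event_interest(100, 10, 2000)

# Concurrent users, in thousands: 50 today, scaled up by a factor of 1000.
by_concurrency = projected_event_interest(100, 50, 50000)

print(by_population)   # 20000.0
print(by_concurrency)  # 100000.0
```

The only input that differs between the two results is which measure of the user base is extrapolated.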

Of course, worldwide we do not have an unchanging demographic, but no matter by how much you want to reduce this figure as a result of this, the answer is still colossal. And bear in mind that there is much uniformity in eastern populations, just as there is in the west, so event interest within each cultural domain will be huge. And of course many events are cross-cultural.

Which brings me to the issue that nobody seems to want to tackle: the new architecture needs scalability for events. Desperately.

While I recognize fully that there are monumental hurdles in the way of achieving this, it is just an engineering problem with a number of known partial solutions and amenable to tradeoffs. Worse, not addressing it will leave us exactly where we are today, with zero scalability for events. And, very unhappily, SL will become the virtual world where almost everybody is barred from their favorite event. 19,900 people out of every 20,000 will have to stay at home. That's not the future system we want to build, in my view.

Please add at least 10,000 per region to the scary numbers, to focus the mind. --Morgaine Dinova 21:07, 24 September 2007 (PDT)
I now see that this is an extreme underestimate, by 2+ orders of magnitude: Use Case. -- Morgaine Dinova 05:57, 26 September 2007 (PDT)

I mentioned this at Zero's Office Hours today, but no traffic yet, so added it myself. --Morgaine Dinova 17:42, 25 September 2007 (PDT)

(*) Note that "preplanned attendance" could inflate this figure even further. If the event is of sufficient interest to warrant logging in at the required hour to attend, this has a far greater effect than population increase or concurrent user increase, since all are heading for the event rather than just a proportion. Other factors also suggest that the limiting case could be higher still:

Attendance can be automated, just like TV shows are commonly recorded, especially for live music events.
High profile events tend to be scheduled at prime time in a given sector, so huge native audiences can be expected.
Planning ahead to attend a prepaid concert is common and normal in RL, so could be expected to be so in SL as well.
Virtual attendance has few RL constraints, so the hour may still be reasonable across a large section of the world.
The popularity of celebrities on TV results in audiences of millions in a single country, so 20,000 seems quite small.
There is very little advertising of events currently, and no promotion. Improvements could have a colossal effect.

This strongly suggests that the figures have been badly underestimated. It is also worth noting that, in practice, a system that scales to even a fraction of 20,000 is very likely to have the right architecture to scale substantially further, with only minor improvements.

Zha's comments on events

The event discussion came up at the AWG kickoff, and comes up from time to time in Zero's office hours. At the heart of the event discussion is the desire to host very large events inside the SecondLife environment. This raises the question: what will 200, 500, 1000, 10,000 or 20,000 users be able to do in a single contiguous portion of SecondLife space? Assuming for a moment that we could manage the simulator-side space, what would a client see? At 100 avatars in a sim, my current screen is pretty much solid avatars with a cloud of tags floating over them, all in constant motion. (Granted we're also all in concrete, in the current sims, but we'll assume that can be fixed.)

Somewhere north of 500 avatars, I can't see how I could render more than a fraction of them, nor could I interact with more than those I could see right near me. So, and I am quite serious about this question, what are we attempting to achieve? If the answer is "let us all share the stream", I'll point out that the streams are orthogonal to SecondLife. If it is interactivity, in the current style of SecondLife, I'd like to see a use case for how we could produce an interesting user experience with the number of avatars I listed above in earshot. How fast will the chat window scroll? How fast will people's "nearby" windows update? How many avatars do we expect our scripted objects to be able to sense, and how will we even see most of the avatars?

Beyond the user experience, we also should look at some hard technical questions. How many Kbits per second of updates are needed to support each of these size points? Can we get that from current networking hardware? Look at these questions for each path: Simulator to Client, Simulator to adjacent Simulators, and Simulator to utilities and agent domains.

This isn't a simple "let's quash the idea" post, far from it. But if we are asserting this as a desire, let's actually produce some use cases which describe what we are trying to enable. "20,000 in an event" isn't a requirement I can design against. Give us some use cases which describe the actual user experiences that we wish to enable. Look hard at questions like "How will chat look, how will visibility work, how would scripting behave?" Capture these in use cases.

I will suggest, for example, that some use cases can be solved by better inter-region behavior. Others would only be solved by attempting to limit the interactions between avatars in significant ways. Serious discussion, backed by use cases, would be invaluable here.

- User:Zha Ewry 9/25/2007

A few answers, then Use Cases etc

Zha, let me start off with a serious but not particularly helpful response in the form of a rebuttal: we have 20k-100k+ people events in RL taking place every weekend in thousands of places worldwide, and it seems to work just fine. So, we know that it can work in the physical world, and that it is a very human thing to do. And the 3D virtual world is to a very large extent a recreation of physical space, but with added capabilities, so this is both desirable for people and natural in a virtual recreation of physical space. The only problem is that it's hard to do, and it's not on a well beaten trail.

Seen in that light, the question I want to entertain is "How can we achieve this?". I don't feel the need to defend the desire for it, and anyone who follows the SL live music scene and is hitting their head daily against lack of scalability for events would I'm sure express that same feeling just as forcefully. It's a bad state of affairs, and it needs fixing, particularly in the context of an ambitious project that addresses all other areas of scalability. Let's just do it. (Bit by bit of course.)

Now to answer your questions specifically:

What will 200-20,000 users be able to do in a single contiguous portion of SL space?

Exactly what they do in RL. Sit in seats in stadia. Gather in crowds at rallies and concerts. Roam in the park in smaller groups. Join hands in a 2000-person line dance. Sky-dive in 500-person snowflake patterns. I don't know, anything. That's up to them. I really don't think we need use cases for this. Just open the curtains or turn on the telly for use cases. :-) The first candidate is among us right now though: SL live music needs 500-person capability today. (Here's a use case though, with utterly massive scaling and potential: Interactive TV.)

What would a client see? ... Somewhere north of 500 avatars, I can't see how I could render more than a fraction of them.

First of all, let's not be hung up on the current client. This will change drastically, and in any case there will be umpteen different clients, each suitable for a different purpose. (It's one of the reasons why I'm trying to partition the client into graphics + front-end.) No doubt we will be creating crowd-capable clients as well ... if region scalability allows a crowd to gather in the first place.

So, the client would see exactly what a person sees in a crowd in RL, but much more flexibly. Numerous options for this have been explored in the SL forums, from simple reduced detail rendering, rapid-falloff CLOD, to performance-driven culling, and selective viewing. Unlike in RL, you could select to render only those in your group, or only furries, or only pretty girls (OK, that's not much of an object reduction :P). Many people have suggested that ghost outlines would suit them just fine, and even more want a GuildWars-type "Observer Mode" in which their avatars are not rendered at all for others to see, which would reduce the problem substantially if it proved popular. So, there are lots of options.

nor could I interact with more than those I could see right near me.

And nor would you in RL, unless you're shouting, so this works pretty accurately. It would however be much easier to interact with your friends here since you could make everyone else disappear. Fortunately sound volume already falls off with distance in SL so that part of interaction is covered. Reading vicinity text scroll is clearly not viable in crowds, so this would need to be filtered in some manner chosen by each person, such as by group membership or friends only, or by distance just like sound. None of this is hard.
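As a rough illustration of the per-person chat filtering described above, here is a minimal Python sketch. Everything in it is hypothetical: `ChatMessage`, `visible_chat` and their parameters are invented for this example and do not correspond to any existing viewer or LSL interface.

```python
import math
from dataclasses import dataclass


@dataclass
class ChatMessage:
    sender: str
    group: str
    position: tuple  # (x, y) of the sender, in metres (assumed units)
    text: str


def visible_chat(messages, listener_pos, friends=frozenset(),
                 groups=frozenset(), max_distance=None):
    """Keep a message if its sender is a friend, in a chosen group, or in range."""
    kept = []
    for msg in messages:
        in_range = (max_distance is not None
                    and math.dist(msg.position, listener_pos) <= max_distance)
        if msg.sender in friends or msg.group in groups or in_range:
            kept.append(msg)
    return kept


crowd_chat = [
    ChatMessage("Ann", "fans", (1.0, 1.0), "hi"),
    ChatMessage("Bob", "staff", (90.0, 90.0), "hello"),
    ChatMessage("Cy", "fans", (200.0, 5.0), "yo"),
]
shown = visible_chat(crowd_chat, (0.0, 0.0), friends={"Bob"}, max_distance=20.0)
print([m.sender for m in shown])  # ['Ann', 'Bob']
```

Ann is kept because she is within 20 m, Bob because he is a friend, and Cy is dropped. The same predicate could in principle be evaluated at the server end so that unwanted text is never sent at all.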

How many avatars do we expect our scripted objects to be able to sense?

This is actually a much more interesting question than the others. Even with the huge improvement that Mono will bring about, it's going to need more than faster LSL processing to do a good job in this area. It's going to require substantial improvements in sensing functions, and probably some language assist in the form of proper O(1) indexable arrays (at long last). But then, land owners have been crying out for better sensing functions for ages. It's time for it.

I think I've answered your general questions in debating style Zha (the questions were in that style too), but the detail has to be tackled far more precisely of course, and I'm ready for that. At this stage though, I'm looking merely for acceptance of the basic premise, particularly from Zero. I don't want to be on the defensive, as if this single area of scalability were somehow taboo. And that's how it has seemed over these past 3-4 years. :-) --Morgaine Dinova 19:43, 25 September 2007 (PDT)

SL events as simple data streams

It could very well be possible to separate the asset data from the event data in the case of large-attendance SL events. By this I mean: the viewer receives the asset data in a scalable preliminary fashion (by swarmcasting, for example). Upon reception of this required data, the viewer can then simply receive and decode the corresponding SL event as a data stream which contains the timely object updates, chat notifications, sound triggers, avatar animations, etc. required to experience the event. This data stream could even then be saved in a sort of SL-movie format, playable with the viewer as if it were connected to a sim and receiving data from it.

In fact this data stream (assuming the event is non-interactive) could be swarmcasted too. --Jesrad Seraph 07:38, 10 October 2007 (PDT)
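To make the idea of an event as a recordable data stream concrete, here is a minimal Python sketch. The record layout (`t`, `kind`, `data`) and the one-JSON-object-per-line format are assumptions made purely for illustration; no actual "SL-movie" format is defined anywhere in this discussion.

```python
import io
import json


def record_event(updates, stream):
    """Write timestamped updates to a text stream, one JSON object per line."""
    for update in sorted(updates, key=lambda u: u["t"]):
        stream.write(json.dumps(update) + "\n")


def replay_event(stream):
    """Yield recorded updates in time order, as a viewer would consume them."""
    for line in stream:
        yield json.loads(line)


# Hypothetical updates, deliberately out of order on arrival.
updates = [
    {"t": 2.0, "kind": "chat", "data": "Hello everyone"},
    {"t": 0.5, "kind": "object_update", "data": {"id": "stage_light", "on": True}},
    {"t": 1.0, "kind": "animation", "data": {"avatar": "performer", "name": "wave"}},
]

recording = io.StringIO()
record_event(updates, recording)
recording.seek(0)
for update in replay_event(recording):
    print(update["t"], update["kind"])
```

Because the recording is just an ordered sequence of self-contained records, the same bytes can be cached, redistributed or replayed later without contacting a sim, which is the property the proposal relies on.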

Categorizing event capabilities

I'm still not sure I understand how this would work, but I think asking the questions and making suggestions is sure bringing up some interesting ideas. It seems (to me at least) that there are 3 segments.

Unlimited interaction

Everyone at the event can interact with everything they see, and everyone can see them and the effects of their actions. The issue here is scalability of visually displaying everyone and their interactions. Something along the lines of duplicate but separate locations could allow more people here (I'm thinking along the lines of chat rooms), though this may not work for all cases.

Limited interaction

Everyone can interact with some of what they see, either by filtering what they see, or actually limiting the interaction capabilities (touch, speak, chat). This seems like the interactive TV case. Here, everyone can watch the same event (a speaker, a concert, objects on display) but not everyone can speak or chat, or appear on stage (except maybe to a small group's view), or change the environment (move objects).

Highly limited interaction

Everyone is able to view, but can interact only to an extremely limited extent. This is basically a view-only capability, with the possible exception of the camera view and angle. All other interaction would occur not within the event but outside it, such as talking with friends on IM channels.

Now it would seem to me that each event could have users in all 3 categories at the same event, but there would be a limited number of users in each category allowed at the event (with the limit increasing going down the list). The best thing would probably be the ability for events to support all kinds, but for event creators to choose the categories they want to allow and who is allowed within those categories. --Anthony Reisman 08:25, 28 September 2007 (PDT)

Re above interaction analysis

Anthony, you've made a capability breakdown by extent of perceived human interaction. This is fine, but it doesn't relate well to the analysis of scalability for events that I was making, and what's more it's actually incorrect from the technical standpoint too. The various SecondlifeTube-type use cases are predicated on the complete inability of one observer to affect the world as seen by any other observer. Because you see, if one observer can do so, then potentially the millions of other observers in this use case could as well, so every client would have to deal with processing everyone else's visual changes, and the result of that is total meltdown. Or in other words, the unlimited scalability for events of this use case would be broken.

In summary, your second item is incorrect in stating that Limited interaction is the Interactive TV use case, because both that and the Non-Interactive TV use case have zero interaction with the region (in the sense of affecting region state) and between the clients. That is the basis of their massive scalability for events. Navigating your camera doesn't qualify as an interaction at all, because it affects only your visual and isn't detectable by anyone else. Although it's a matter of interest to the user whether she can move her camera or not, it has no impact on the client's scalability for events. And scalability was the only subject relevant to this particular namespace page.

I do like the idea you've started here, because classifying the various extents of scalability for events on the basis of the interaction between the clients can show us the possibilities available very clearly. The 3 fundamental extents would be the two extremes of the spectrum and one in the middle:

Full interaction between clients

This is where we are currently. Scalability is not just "poor", it is nil, because with just a small increase in the size of events, clients collapse. We hit the limits 3+ years ago, and the population has been growing semi-exponentially ever since, so perhaps we should call this negative scalability for events. ;-)

Limited interaction between clients

Cutting down the extent to which our clients process other clients' visuals offers a substantial avenue towards achieving client scalability for events (mass gatherings) while still being able to interact with others, particularly if the client doesn't just ignore unwanted objects but can request that the server end does not send them. This approach is quite likely to scale clients to many thousands of participants present at a given event, relatively easily.

Zero interaction between clients

This is the SecondlifeTube "umbrella" use case, arbitrarily scalable at the client end to millions and beyond (hence Y..Tube). And despite the exceptional scalability of this use case, it is quite simple to implement, much more so than the intermediate case.

Perhaps we can collapse your and my classifications into a single one? (But the topic here *is* scalability.) --Morgaine Dinova 11:11, 28 September 2007 (PDT)

Just a thought on the categorization here: "the event" and "the interaction between clients" can be rationalized as (formally) independent data flows. My viewer receives a data stream that corresponds to "the event", and another separate (stateless or not) data stream that corresponds to my interaction with another spectator, and another, etc. One thing we know from RL is that everyone's entire experience of any event is unique, in that it is a specific combination of experiencing "the event" itself and interaction with other people. Even in RL the amount of interacting the spectators can have with the event itself is limited, unless we're talking about a decentralized event, which is another problem (in this sort of event "the event itself" is null, entirely replaced with interactions between attendants). So if I were to categorize events, it would be by the number of those data streams between attendants that compose, in the end, what we call "the event". All the other "interaction" streams that can be added have no impact on the nature of the event.
This means that, if the client of a spectator can maintain the distinction between "the event" and "the rest of the interaction", any client can retransmit "the event" to someone else who will then experience it in the same conditions. That's the kind of scalability allowed by caching and swarmcasting today in other domains.
Now let's add causality: we cannot expect people to interact post facto with what has already been interacted with by other people before them. To draw an analogy, if a part of the audience is asked "which song next?", and the song of their choice then starts being performed, the rest of the audience that might experience this with a delay cannot participate in the previous interaction. In this example, "the event" has been irreversibly combined with a number of interaction streams into a new, more complex stream. This combination becomes "the event" for the rest of the audience. This loss of interactivity potential with growing attendance is normal: it's entropy, there's no way around it and it'll be there no matter what we do.
This leads me to imagine a cascading model of event broadcasting, with two defining characteristics for categorizing events. The "height of the cascade" determines how much re-broadcasting happens, and the "width of the cascade" determines how much distinct interaction between attendants happens at the event but is orthogonal to it (it is not part of it). Because of causality, any amount of interaction that is not orthogonal, which changes the event, can only go downstream and not up: it adds some data and removes some (entropy is introduced there). And because of this inevitable entropy, the final experience is different for everyone, much like a drop of water might fall to the left or to the right if the cascade widens: the more the horizontal interaction perturbs the flow, the more varied the attendants' experiences. In a typical "SLTube" model we have a straight, tall and thin cascade, while a thousand-player free-for-all snowball fight in SL is a single, widely spread and shallow blob of water that falls through in one pass.
I think these two characteristics help in understanding the technical constraints and the complexity of scaling SL for various types of events. The volume of "water" is the amount of data: it determines the bandwidth capacity required, and that's where load balancing is to be applied. The height of the cascade is the duration of the event in absolute terms: it's how long the event can be experienced after it has started, and that's where storage capacity (caching, piggybacking, proxies or replay solutions) is to be applied.
However, we see that there are no clear-cut categories of events, but two dimensions along which they may stretch continuously. For all these reasons I think we can only identify these characteristics of events, not discrete types of events themselves. --Jesrad Seraph 03:03, 11 October 2007 (PDT)

Use cases

For reference, use cases for scalability for events will be integrated into Use_Cases. --Morgaine Dinova 05:20, 26 September 2007 (PDT)

First use case now added: Guild Wars "Observer Mode" -- Interactive TV.

This is the interactive TV scenario, so it's pretty likely to become the killer app for virtual worlds. :-) In terms of scalability for events, it bypasses almost all of the client-side difficulties, since there can be huge numbers of observer presences in the region but their avatars are not rendered, despite each one having a position in space for personal camera control. --Morgaine Dinova 05:14, 26 September 2007 (PDT)

This use case makes it clear that 20,000 per region is a ridiculously low estimate, since a few million people watch popular TV shows every day even in a small country. Given SL2's projected massive worldwide audience, an event viewable in this manner could attract dozens of millions of interactive observers.
Philip was on UK television yesterday, envisaging an SL with near-RealLife quality graphics. It doesn't require much imagination to see that this could turn "Interactive TV" from a use case to the biggest thing ever. :P --Morgaine Dinova 06:32, 26 September 2007 (PDT)

I added an SL-specific use case developed from the Guild Wars one: Interactive TV. --Morgaine Dinova 15:10, 26 September 2007 (PDT)

Tillie Ariantho suggested a passive viewing version of the above, so I also added Non-Interactive TV. This will probably appeal to old-fashioned machinima or event directors for whom giving the user camera control is anathema. ;-) --Morgaine Dinova 15:10, 26 September 2007 (PDT)

Added the umbrella label SecondlifeTube to cover all massively scalable avatar-free use cases. --Morgaine Dinova 00:03, 28 September 2007 (PDT)

Scary numbers

Each of the areas of scalability in Project_Motivation actually needs its own assessment. It's not really as simple as just deciding on an arbitrary fraction of world population. Scalability of a region for events derives from size of user base, but what about the other 3 numbers?

Also, each area introduces its own particular problems and contains its own set of possible solutions, which should be listed. One of these should then link to the chosen implementation, with appropriate engineering reasoning given. All of this seems to be missing at the moment, and we're just presented with a fait accompli, which is not engineering. This needs some improvement. --Morgaine Dinova 03:44, 26 September 2007 (PDT)

Added a title for each scary number, naming the dimension of scalability. --Morgaine Dinova 05:16, 1 October 2007 (PDT)

World vs Universe

It's worth pointing out (perhaps in the main namespace) that the distinction between world and universe is intentionally left vague. We just don't know how things will develop at this stage.

Third party worlds will undoubtedly arise, but it is very uncertain whether this will happen by small-scale accretion with LL's grid, by continental attachment of large 3rd-party grids, by resource migration rather than world extension, with varying degrees of compatibility from total to almost none, or whether entirely different concepts of scaling will arise through innovation by competitors (eg. maybe the user will become the focus of multi-dimensional scalability into numerous separate worlds).
To complicate matters further, attaching small third party worlds to a large grid will almost certainly require infrastructure support from the larger one, otherwise the large-grid residents would suffer very poor quality of service, and this would be a disincentive to world expansion (this has been addressed in Brainstorming). This makes the distinction between world and universe less in practice/implementation than in concept. --Morgaine Dinova 05:19, 1 October 2007 (PDT)

Additional scaling factors

New section added: more scaling pressure with increases in prim and script numbers. --Morgaine Dinova 22:14, 5 October 2007 (PDT)
Added the explosion in growth that will occur as a result of TV-led events like the forthcoming CSI:NY/SL. --Morgaine Dinova 07:03, 13 October 2007 (PDT)
Jesrad, your final line in the anti-growth section isn't an "anti" at all, it's a "pro"!! :-) It suggests a technical means of achieving greater levels of scaling, by reducing the update overhead. Since we notice only so many minor updates per second, they can be collapsed and even omitted if small enough ... pretty much the approach that MPEG uses to reduce the bandwidth required by digital video streaming. I left it in, but it's not really a constraint at all, and it's remarkably close to "640k is enough for anyone!". :-) --Morgaine Dinova 07:09, 13 October 2007 (PDT)
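The idea of collapsing or omitting minor updates can be sketched in a few lines of Python. This is not how MPEG works internally, and it is not an existing simulator feature; it is just a simple coalescing step under assumed names (`coalesce`, `last_sent`, `pending`, `threshold`), with a single number standing in for an object's state.

```python
def coalesce(last_sent, pending, threshold):
    """Collapse queued updates per object and drop changes too small to notice.

    last_sent: dict of object_id -> value the client already has.
    pending:   list of (object_id, value) in arrival order.
    Returns a dict of object_id -> value that is actually worth sending.
    """
    latest = {}
    for object_id, value in pending:
        latest[object_id] = value  # later updates overwrite earlier ones

    to_send = {}
    for object_id, value in latest.items():
        previous = last_sent.get(object_id)
        if previous is None or abs(value - previous) >= threshold:
            to_send[object_id] = value
    return to_send


last_sent = {"door": 0.0, "flag": 5.0}
pending = [("door", 0.2), ("door", 0.9), ("flag", 5.01), ("ball", 3.0)]
print(coalesce(last_sent, pending, threshold=0.1))
# {'door': 0.9, 'ball': 3.0}
```

Four queued updates shrink to two: the two door updates collapse into the latest one, the barely perceptible flag change is omitted, and the previously unseen ball is always sent.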

Wolves at the gates

Excellent addition, Strife, and central to Project Motivation! I guess Sony HOME and Google MyWorld will be candidates for this section too. --Morgaine Dinova 03:59, 2 October 2007 (PDT)

A shiver ran through me Strife, when I realized that you'd added the subtitle "Competitors" to this section ... because that implies that there are wolves at the gate other than mere competitors. And that made me focus on reality: the worst wolf isn't actually a wolf (which merely competes to survive) at all, but a 10-ton vicious predator ... the coercive T-Rex that is the politician surrounded by his retinue of lobbyists, litigants, Luddites, puritans, and vested interests. The real enemy isn't competition, as the playing field is level there. It's much worse. --Morgaine Dinova 06:38, 4 October 2007 (PDT)

Rob, good comment in the changelog about Strife's section title and the link to a commercial site. :-) Perhaps Strife could abstract the key elements of that system for us instead. After all, AWG's focus is not only on scalability but also on interoperability. The existence of commercial offerings like that one certainly needs to be borne in mind, since they will be providing infrastructure for other virtual worlds with which in time we will need to interoperate. And any claimed figures related to scalability are of particular interest! :-) --Morgaine Dinova 17:37, 10 October 2007 (PDT)