Talk:LlListSort

From Second Life Wiki
Revision as of 20:38, 19 April 2014 by Miranda Umino (talk | contribs)
Jump to navigation Jump to search

Bubble Sort?

This script: <lsl> default {

   state_entry()
   {
       llOwnerSay(llList2CSV(llListSort([3,2,3,5,3,1,1,4], 2, TRUE)));
   }

} </lsl>

outputs

1, 4, 3, 5, 3, 1, 3, 2

both in LSO and in Mono. This means that the sort is not stable. Since stability is a characteristic of Bubble Sort, it looks like the claim of using Bubble Sort is not correct. --Pedro Oval 20:01, 5 April 2014 (PDT)

Good point. -- Strife (talk|contribs) 21:26, 5 April 2014 (PDT)

maybe... needs an extended example to see if LSL backwards ordering isn't the culprit
-- Void (talk|contribs) 22:19, 5 April 2014 (PDT)

If it's in reverse order, it's not stable either. The reverse would be 3,1,3,5,3,2 but the actual result is 3,5,3,1,3,2 --Pedro Oval 12:39, 6 April 2014 (PDT)

doh, you're right... comb sort perhaps? looking at the ordering exchange seems more likely than insertion
-- Void (talk|contribs) 17:20, 6 April 2014 (PDT)

Best option could be to pose this question on the scripting mail list, Kelly appears to prefer that channel for tech information dumps. The bubble sort presumption probably came from a function included with the viewer part of the source, but its comments describe list syntax and functions that never made it to LSL2. --ObviousAltIsObvious Resident 17:30, 6 April 2014 (PDT)
I think it was a semantic problem, the existing list functions don't leave any room for embedded lists. When they couldn't get pass-by-reference to work properly (assumption), they likely killed embedded lists because it didn't fit well in 16KiB.
Q: Why does the viewer source contain code that is obviously server LSL VM source code?
A: A bit of history, before the age of Mono, scripts were compiled by the client, and the viewer source included a copy of much of the server VM code. You have to remember that the viewer wasn't always open source, it's likely that the client and server code once existed in a single repository, that would be recompiled at the same time. When LL opensourced the viewer, they split the repository, but because of interdependence between the server and client code, some files were duplicated between the two repositories. LSL related files overtime diverged, largely driven by Mono's server side compilation. So while it may seem strange now that the viewer contains server source, it wasn't at the time the article was written.
-- Strife (talk|contribs) 19:08, 6 April 2014 (PDT)

It's probably not very polite to bring up this link, but it may shed some light on the reasons behind usage of Bubble Sort and the presumption that it was used in the server code: [1]. It mentions what it calls "type specific feathered sorting", which I assume it to mean the individual sorting of elements of each type. The code claims that "unoptimized bubble sort" is the only way to achieve it. I haven't given it much thought yet, but I wonder if comb sort, as mentioned by Void, would also preserve that ordering. --Pedro Oval 03:09, 8 April 2014 (PDT)

I think that 3 passes index, sort, write could do this as well. Miranda's comment in SVC-2988 suggests that there is some other kind of overhead in there. --ObviousAltIsObvious Resident 05:31, 8 April 2014 (PDT)
I've tried Quicksort and Comb Sort. My input list was [3,2,1,3,3,4,3,3,1,4,1,9,3,6] with stride 2. llListSort returns [1,3,1,4,1,9,3,3,3,2,3,4,3,6]. I could not reproduce this result with Comb sort, not even pre-inverting the array nor by pre-inverting, sorting descending and post-inverting it. I've tried two variations and all possible shrink values for this length. I've also tried with Quicksort with a decent amount of possible choices of pivot (first, central, last, median, and sorting the first, central and last elements, using several variations of < and <= for median and sorting), to no avail. --Pedro Oval 07:01, 8 April 2014 (PDT)
I wish I had more time to play with this. Going by what is typically found in LL code, my first guess would be "whatever the GNU library is using in std::sort". It could very well be a wrapper around that. Adding to fun, there are 2 major C++ libraries in GNU land, each uses a different approach. --ObviousAltIsObvious Resident 07:30, 8 April 2014 (PDT)
Boost? P.S. I've looked at opensim code. I have no problem with us documenting opensim functions etc. -- Strife (talk|contribs) 08:58, 8 April 2014 (PDT)
Same difference, as Boost uses std::sort: [2]. Digging a bit I have found this: Introsort#Implementations (Wikipedia) which made it sound like any attempt at reproducing the results without GNU libstdc++ was going to be too painful to be worth it. Then I've tried with the C++11 program below; it was sorted in a stable way, as the result is: "1,3, 1,4, 1,9, 3,2, 3,4, 3,3, 3,6, ". I'm willing to make more experiments but I'm out of ideas at the moment. Any suggestions? (P.S.: who chose the colors for the C++ highlighter? Yuck!) --Pedro Oval 09:30, 9 April 2014 (PDT)

<cpp>

  1. include <iostream>
  2. include <algorithm>
  3. include <vector>

struct Rec {

   int key;
   int data;

};

bool sortByKey(const Rec &lhs, const Rec &rhs) { return lhs.key < rhs.key; }

int main() {

   std::vector<Rec> v = {{3,2},{1,3},{3,4},{3,3},{1,4},{1,9},{3,6}};
   std::sort(v.begin(), v.end(), sortByKey);
   for (Rec &n : v)
       std::cout << n.key << "," << n.data << ", ";
   std::cout << std::endl;
   return 0;

} </cpp>

tried substituting partial_sort (I think that bypasses the insertion sort libstdc++ likes to use on small sets), almost the same as LSL but not close enough. 1,4, 1,3, 1,9, 3,3, 3,2, 3,4, 3,6, --ObviousAltIsObvious Resident 15:34, 9 April 2014 (PDT)


I don t know why you are using boost c++ library . Probably the key of the answer is inside the implementation of sort in the mono library .

I dont know which version mono is , and which algorithm it uses for sort . Nevertheless , the documentation from microsoft on dotnet c# gives :


from [ http://msdn.microsoft.com/en-us/library/b0zbh7b6(v=vs.110).aspx ]

This method uses the Array.Sort method, which applies the introspective sort as follows:
* If the partition size is fewer than 16 elements, it uses an insertion sort algorithm.
* If the number of partitions exceeds 2 * LogN, where N is the range of the input array, it uses a Heapsort algorithm.
* Otherwise, it uses a Quicksort algorithm.
This implementation performs an unstable sort; that is, if two elements are equal, their order might not be preserved. 
In contrast, a stable sort   preserves the order of elements that are equal.
On average, this method is an O(n log n) operation, where n is Count; in the worst case it is an O(n ^ 2) operation.


We should verify if the actual version of mono used by linden uses the same strategy as microsoft. But , at first view , it seems coherent with the actual observations


Other note : a bubble sort is O(n) if the list is already sorted . Some other algorithms too. And some other algorithms , not. But in LSL/mono , sortig an already sorted list don t make it faster

Miranda Umino 20:55, 17 April 2014 (PDT)

The sort works the same under LSO, so I believe we can rule out a native .net library call as the source of this behavior. --ObviousAltIsObvious Resident 00:29, 18 April 2014 (PDT)

I'm still researching, but I'll advance some information. I've reproduced the results with this script, which implements a very simple sort algorithm that is not listed in Wikipedia:

<lsl> list Sort(list input, integer stride) {

   integer i;
   integer j;
   for (i = 0; i < llGetListLength(input) - stride; i += stride)
   {
       for (j = i + stride; j < llGetListLength(input); j += stride)
       {
           if (llList2Integer(input, i) > llList2Integer(input, j))
           {
               // Swap records i & j
               input = llListReplaceList(
                           llListReplaceList(input, llList2List(input, j, j+stride-1), i, i+stride-1),
                           llList2List(input, i, i+stride-1), j, j+stride-1);
           }
       }
   }
   return input;

}

default {

   state_entry()
   {
       list MyList = [3,2,1,3,3,4,3,3,1,4,1,9,3,6];
       string L1 = llList2CSV(llListSort(MyList, 2, TRUE));
       string L2 = llList2CSV(Sort(MyList, 2));
       llOwnerSay("O: " + L1);
       llOwnerSay("N: " + L2);
       if (L1 != L2) llOwnerSay("Different"); else llOwnerSay("Equal");
   }

} // Output: // [09:50:20] Sort: O: 1, 3, 1, 4, 1, 9, 3, 3, 3, 2, 3, 4, 3, 6 // [09:50:20] Sort: N: 1, 3, 1, 4, 1, 9, 3, 3, 3, 2, 3, 4, 3, 6 // [09:50:20] Sort: Equal </lsl>

I realized it was a candidate while trying to understand how bubble sort would preserve the ordering as claimed by the OpenSim code. I couldn't, so I looked into the code in more detail, and I realized that the sorting algorithm used was not bubble sort, but the one implemented in the above script. It's a naïve algorithm similar to selection sort, but less optimal: it goes through each element, and does a sweep over the rest of the elements to see if there's one less than it. If so, it swaps them (selection sort just takes note of the index position of the least element, and swaps them at the end of the inner loop instead of as it goes). At the end of each sweep, the element in question will have the least of the remaining elements in the list. It's still O(n²), and worse in performance than selection sort (since it does more swaps), but it does indeed guarantee preservation of the order of the types as claimed, assuming that comparison of two elements of different types does not cause a swap. It's also not stable, by its nature, and as proved by the results above. --Pedro Oval 10:09, 19 April 2014 (PDT)

Update: Selection Sort fails to reproduce the results, which is expected. However, the above algorithm matched the results every time I've tried. I tested with this list generation code in place of the constant: <lsl>

       list MyList;
       while (llGetListLength(MyList) < 500)
       {
           MyList = MyList + (integer)llFrand(12) + llGetListLength(MyList);
       }

</lsl> It seems we have a winner. --Pedro Oval 10:51, 19 April 2014 (PDT)

Update 2: It gets better. Type feathering is NOT respected in case of a descending sort. OpenSim gives different results to SL in that case. It looks like type feathering is not a feature, but a side effect of the algorithm used. I've reproduced the results by always swapping when the types are different. The results I got are consistent with this: the comparison function returns TRUE if A and B are of the same type and A is greater than B, and FALSE otherwise, and the swap is done if (the comparison function returns TRUE) xor (ascending!=1), causing it to always swap elements of different types for descending sort, thus destroying the feathering. OpenSim respects the feathering for descending order. --Pedro Oval 14:38, 19 April 2014 (PDT)

One sure thing : llListSort is not O(N*N) . LLstsort is O(N * Log2(N) )

Here some results of your algorithm ( 2nd column = number of pairs ; 3rd column time in frames per iteration)

BubbleSort
NumberTest       N size  Average Time in Frames  
Test #1 with       2    0.020000
Test #2 with       3    0.050000
Test #3 with       4    0.090000
Test #4 with       6    0.200000
Test #5 with       8    0.200000
Test #6 with      12    0.650000
Test #7 with      16    0.970000
Test #8 with      24    1.850000
Test #9 with      32    2.300000
Test #10 with     48    5.590000
Test #11 with     64    7.540000
Test #12 with     96    18.270000
Test #13 with    128    28.950000
Test #14 with    160    53.440000
Test #15 with    192    71.920000
Test #16 with    224    90.050000
Test #17 with    256    112.140000

Time ( 2*128 ) = (2*2) * Time(128) => ALgorithm is O(N*N)

And with LListSort

LListSort	
Test #1 with       2    0.040000   
Test #2 with       3    0.070000
Test #3 with       4    0.040000
Test #4 with       6    0.050000
Test #5 with       8    0.070000
Test #6 with      12    0.150000
Test #7 with      16    0.080000
Test #8 with      24    0.210000
Test #9 with      32    0.320000
Test #10 with     48    0.340000
Test #11 with     64    0.430000
Test #12 with     96    0.280000
Test #13 with    128    0.670000
Test #14 with    160    0.650000
Test #15 with    192    0.760000
Test #16 with    224    0.660000
Test #17 with    256    0.840000

Time ( 2*128 ) = 2*log(2) * Time(128) => ALgorithm is O(N*LogN)

<lsl> list Sort(list input, integer stride) {

   integer i;
   integer j;

   for (i = 0; i < llGetListLength(input) - stride; i += stride)
   {
       for (j = i + stride; j < llGetListLength(input); j += stride)
       {
           if (llList2Integer(input, i) > llList2Integer(input, j))
           {
               // Swap records i & j
               input = llListReplaceList(
                           llListReplaceList(input, llList2List(input, j, j+stride-1), i, i+stride-1),
                           llList2List(input, i, i+stride-1), j, j+stride-1);
           }
       }
   }
   return input;

} default {

   state_entry() {
       list tests = [
           // Ascending lists
           2, TRUE,
           3, TRUE,
           4, TRUE,
           6, TRUE,
           8, TRUE, 
           12, TRUE,
           16, TRUE,
           24, TRUE,
           32, TRUE,
           48, TRUE,
           64, TRUE,
           96, TRUE,
           128, TRUE,
           160, TRUE,
           192, TRUE,
           224, TRUE,
           256, TRUE

/* 320, TRUE,

           384, TRUE,
           448, TRUE,
           512, TRUE 
           */
       ];
       integer repetitions = 100;
       integer test = 0; integer testsLength = tests != [];
        
       integer listSize; integer ascending; integer i; list t;
       float f; string s; list l = [];
       float nFrameStart;
       float nFrameEnd;
       float overhead;
       llOwnerSay("Starting tests…");
       for (; test < testsLength; test += 2) {
           listSize = llList2Integer(tests, test);
           ascending = llList2Integer(tests, test + 1);
            
           if (ascending) {
               i = 0;
               for (; i < listSize; ++i) 
                   l +=  [(integer)llFrand(3),llFrand(10000)] ;
           } else {
               i = listSize;
               while ((--i) >= 0) 
                   l +=  (integer)llFrand(10000);
           }
            
           i = 0;
           nFrameStart = (float)llGetEnv("frame_number");
           for (; i < repetitions; ++i) 
               t = [];
           nFrameEnd = (float)llGetEnv("frame_number");
           overhead = nFrameEnd - nFrameStart;
           i = 0;
           nFrameStart = (float)llGetEnv("frame_number");
           for (; i < repetitions; ++i) 
              // t = llListSort(l, 2, ascending);
              t = Sort(l,2);
           nFrameEnd = (float)llGetEnv("frame_number");      
           f =  nFrameEnd - nFrameStart - overhead;
           t = l = [];
           
           f /= (float)repetitions;
           s = "\tTest #" + (string)((test / 2) + 1) + " with\t" + (string)listSize + "\t" + (string)( f ); 
          
           llOwnerSay(s);
       }
       llOwnerSay("Tests complete.");
   }

} </lsl>

Miranda Umino 21:38, 19 April 2014 (PDT)