Talk:LlParseString2List

From Second Life Wiki
Revision as of 14:00, 21 February 2014 by Strife Onizuka (talk | contribs) (→‎Is there a limit?: yes)

Let's not have our examples confuse people

We give the example of "What Are You Looking At?" near the end of this document and include the "A" twice in the separator list. Think of how confusing this is to someone just learning this function and being unsure as to how the separators work. Joel Cloquet 01:54, 21 February 2014 (PST)

Is there a limit?

  • Folks,
  • Is there a limit to the number of items that this function will parse into a list? —The preceding unsigned comment was added by Bobby Fairweather
    • Yes, there is a limit, and it is based on the amount of free memory the script has. It really depends on how much memory the script is already using, etc. —The preceding unsigned comment was added by Strife Onizuka
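
To get a feel for the memory point above, a minimal sketch along these lines could report free memory before and after a parse. The input string is made up for illustration, and the figures from llGetFreeMemory are only a rough guide (they may not reflect the exact cost of the list), so treat this as an experiment rather than a measurement method.

<lsl>default
{
    state_entry()
    {
        // Hypothetical input, invented for this sketch.
        string source = "one,two,three,four,five";

        integer before = llGetFreeMemory(); // free bytes reported before parsing

        list parts = llParseString2List(source, [","], []);

        integer after = llGetFreeMemory(); // free bytes reported after parsing

        llOwnerSay((string)llGetListLength(parts) + " entries parsed; free memory reported as "
                   + (string)before + " bytes before and " + (string)after + " bytes after.");
    }
}</lsl>

Feeding it longer and longer strings should show where a particular script starts to run short.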

A replacement of the examples for llParseString2List and llParseStringKeepNulls.

I feel that the present examples are too confusing to read and don't really help teach what the functions do (I find the repeated <<>>><<<>><<> totally baffling). I tried to write an example myself but think it fails to show the functions at their best (it just sucks to be frank) but is easier to read and IMO clearer to understand what these functions do. Could we pull together and sort something out? -- Fred Gandt (talk|contribs) 01:36, 30 April 2010 (UTC)

  • Example script stage 2 -

<lsl>// IMPORTANT!!

// This code is here to aid an ongoing discussion and should not be used as an example of good code.

// IMPORTANT!!

default {

   state_entry()
   {
       // We have a string of text that we want to adapt.
       
       string source = "How much wood would a woodchuck chuck if a woodchuck could chuck wood?";
       
       list words = llParseString2List(source, ["much"], []);
       
       // This divides the string source into two elements at the point it finds "much" and removes "much"
       
       // This list is returned - ["How ", " wood would a woodchuck chuck if a woodchuck could chuck wood?"]
       
       // We can then feed the list to llDumpList2String to join the list elements using something new as glue.
       
       source = llDumpList2String(words, "many");
       
       // Our source is now "How many wood would a woodchuck chuck if a woodchuck could chuck wood?"
       
       // However, this can be simplified by feeding the return of llParseString2List directly into
       // llDumpList2String. Like so -
       
       source = llDumpList2String(llParseString2List(source, ["much"], []), "many");
       
       // That method makes the need for list words redundant.
       
       

       // We remove all instances of "chuck" and replace with "rez".
       source = llDumpList2String(llParseString2List(source, ["chuck"], []), "rez");
       // Our source is now "How many wood would a woodrez rez if a woodrez could rez wood?"

       // We remove all instances of "wood" and replace with "prims".
       source = llDumpList2String(llParseString2List(source, ["wood"], []), "prims");
       // Our source is now "How many prims would a primsrez rez if a primsrez could rez prims?"

       // We remove all instances of "primsrez" and replace with "primrez".
       source = llDumpList2String(llParseString2List(source, ["primsrez"], []), "primrez");
       // Our source is now "How many prims would a primrez rez if a primrez could rez prims?"

       llOwnerSay(source); // Say the result to the owner.
       
       

       // There is an issue though...

       source = "Peter Piper picked a peck of pickled pepper.";
       llOwnerSay(llDumpList2String(llParseString2List(source, ["P", "p"], []), "m"));
       
       //Returns "eter mimer micked a meck of mickled memer."

       // Notice that ALL the p's are gone and llDumpList2String adds "m" only where the split occurred.

       // We can then use llParseStringKeepNulls instead.

       llOwnerSay(llDumpList2String(llParseStringKeepNulls(source, ["P", "p"], []), "m"));
       
       //Returns "meter mimer micked a meck of mickled memmer."
   }

}</lsl>

I like the pepper/ memer example
http://en.wikipedia.org/wiki/Comma-separated_values#Example is an instance of people solving a puzzle like this before us ...
-- Ppaatt Lynagh 01:49, 30 April 2010 (UTC)
Agreed. I think the value of llParseStringKeepNulls is far simpler to express when an understanding of llParseString2List is already established. I am way too tired to tackle this further right now but I am inspired by your interest Ppaatt. I hope between us (the community) we can create a couple of examples that really show off these functions. For example - What exactly is the use of the second list param (spacers)? -- Fred Gandt (talk|contribs) 02:23, 30 April 2010 (UTC)
I don't agree with the proposed example. Sure it shows you one use of the function but it doesn't give the user an understanding of what llParseString2List is doing. While it is documented, the documentation doesn't actually talk about how it is performing what it is doing. From this example, you cannot really tell where the llDumpList2String functionality begins and the llParseString(KeepNulls|2List) functionality ends. It isn't plain to the user from this example that llParseString2List is returning a list or what the list looks like; it could be returning a black box for all the user knows. The way it is being used in this example is a simple text replace and it's not even a memory efficient way of doing text replace. On the efficiency front, I'm inclined to say this is a Bad example, sure it works great on short strings, but give it a string that is 1024 characters or longer and have it replace every other character and you may just hit the memory limits. Sure the alternative is slow but it doesn't use six times the memory. So to sum it up: It doesn't actually demonstrate the full use of this function (hello it doesn't even use spacers) and the example usage it provides is hazardous. I do however acknowledge that the current example is less than perfect and needs improvement. -- Strife (talk|contribs) 05:01, 30 April 2010 (UTC)
I don't want to be providing hazardous code in the examples as people will copy it. It will end up being used in products; in vendors. People will think that because it's in an example that it is good coding practice; that it has been vetted. This example has the potential to be a DoS attack vector: If this code is fed third party text the third party could crash the script. It would be pretty short sighted of us to provide a DoS attack vector in an example. If it ever became public that it was our fault for providing such an example it would damage the image of the LSL Portal. -- Strife (talk|contribs) 05:17, 30 April 2010 (UTC)
A number of years ago I tried to use this method for text replacement but the stack would occasionally collide with the heap and crash the script. It was, suffice it to say, annoying. I ended up going back to using my own text replacement function (an early version of it can be found here near the bottom of the page). -- Strife (talk|contribs) 05:26, 30 April 2010 (UTC)
Ah ok. I have changed the script above rather than post another. I added explanation of the list return and where llDumpList2String fits in. I did say it was just a start. I also mentioned that spacers needed explanation. I think rather than saying "this bad" it would help to say what would constitute good use of this function. If not for string replacement...what is a "good" use? -- Fred Gandt (talk|contribs) 11:11, 30 April 2010 (UTC)
<.< I may have been a bit harsh. I am sorry. -- Strife (talk|contribs) 12:41, 30 April 2010 (UTC)
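
For comparison with the llDumpList2String(llParseString2List(...)) replacement discussed above, here is a minimal sketch of a loop-based replacement that never builds a list. It is not Strife's function (that one is only linked from the comment above); the name ReplaceAll is made up for illustration, and whether it really saves memory in a given script would need measuring.

<lsl>// Hypothetical helper, written for this discussion only.
// Replaces every occurrence of find in src with replacement, one match at a time.
string ReplaceAll(string src, string find, string replacement)
{
    if (find == "")
        return src; // nothing to search for

    string out = "";
    integer index;
    // llSubStringIndex returns -1 when there is no match, and ~(-1) is 0 (false).
    while (~(index = llSubStringIndex(src, find)))
    {
        if (index) // copy the text before the match, if there is any
            out += llGetSubString(src, 0, index - 1);
        out += replacement;
        // Drop everything up to and including the match from the remaining text.
        src = llDeleteSubString(src, 0, index + llStringLength(find) - 1);
    }
    return out + src; // whatever is left contains no further matches
}

default
{
    state_entry()
    {
        // Only the lower case "p" is replaced here, so this should say
        // "Peter Pimer micked a meck of mickled memmer."
        llOwnerSay(ReplaceAll("Peter Piper picked a peck of pickled pepper.", "p", "m"));
    }
}</lsl>

Unlike the parse-and-dump approach, this keeps leading, trailing and adjacent matches without needing llParseStringKeepNulls, at the cost of one llSubStringIndex call per match.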

Here is how the function basically works: (llParseString2List is a bit more complicated)
<lsl>//rough pseudocode for llParseString* functions
//I've used some convenient fake functions to make the implementation easier to read.
//They are for convenience sake only, you would never implement it this way.
list llParseString2List(string str, list separators, list spacers) {

   list out = [];
   string buffer = str;
   while(buffer)
   {
       integer index = IndexOfFirstListItemInString(buffer, spacers + separators);
       if(index == -1)
           return out + buffer;
       else if(IsIndexInList(buffer, index, separators))
       {
           if(index)
               out += llDeleteSubString(buffer, index, -1);
           buffer = llDeleteSubString(buffer, 0, index + LengthOfListItemInStringAt(buffer, separators, index) - 1);
       }
       else //it's a spacer
       {
           if(index)
               out += llDeleteSubString(buffer, index, -1);
           out += ListItemInStringAt(buffer, spacers, index);
           buffer = llDeleteSubString(buffer, 0, index + LengthOfListItemInStringAt(buffer, spacers, index) - 1);
       }
   }
   return out;

}</lsl> For a complete LSL implementation, take a look at ParseString2List. It's very clever. -- Strife (talk|contribs) 12:41, 30 April 2010 (UTC)
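
As a rough illustration of the separator/spacer distinction in the pseudocode above, the sketch below parses a made-up settings string. The string and its contents are hypothetical; only llParseString2List, llDumpList2String and llOwnerSay are real LSL functions.

<lsl>default
{
    state_entry()
    {
        // Hypothetical input, invented for this sketch.
        string source = "x=1;y=2";

        // ";" is a separator: the string is split there and the ";" is discarded.
        // "=" is a spacer: the string is split there and the "=" is kept as its own entry.
        list parts = llParseString2List(source, [";"], ["="]);

        // parts should be ["x", "=", "1", "y", "=", "2"]
        llOwnerSay(llDumpList2String(parts, "|")); // should say "x|=|1|y|=|2"
    }
}</lsl>

Moving "=" into the separator list instead would drop it from the result, leaving ["x", "1", "y", "2"].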

Okie doke Strife. I'll take a proper look when I get back from walking the dog. I only woke up two cups of tea ago *needs tea to function*. -- Fred Gandt (talk|contribs) 13:03, 30 April 2010 (UTC)
OK. Here is a less string replacey way to demonstrate llParseString2List. Nice little Function came out of my efforts if nothing else. Still pushing for a more noob friendly example though. The more input the better.

<lsl>// IMPORTANT!!

// This code is part of an ongoing conversation and is not meant to be considered a good example (yet).

// IMPORTANT!!

string source = "[9:52] Speaker One: Does anyone know what llParseString2List is for? [9:53] Speaker Two: Something to do with cutting strings up and making lists from the parts. [9:53] Speaker Three: Yeah, you can remove bits you don't want too. [9:54] Speaker Four: There are two lists. One is for places to split the string and remove the marker. [9:54] Speaker Four: The other is to split the string and keep the marker. [9:54] Speaker Three: Mhmm. Separators and Spacers in that order. [9:54] Speaker One: So what would I want to do that for? [9:55] Speaker Four: Imagine you had a bunch of info that came to you as a string. [9:55] Speaker Four: You could use llParseString2List to help sort the info out.";

list GetUniqueListEntries(list src) {

   list output = [];
   do
   {
       string entry = llList2String(src, 0);
       src = llParseString2List(llDumpList2String(src, ","), [entry, ","], []);
       output += [entry];
   }
   while(llGetListLength(src));
   return output;

}

default // I have purposefully laid it out this way. It is not supposed to be completely perfect. It's an expressed idea.
{ // With the addition of notes to explain each stage and then maybe a version with the mass of needless variables removed...

       // ...to show how to configure this sort of process..... *waits for feedback*
   state_entry()
   {
       list separators = ["[", "]  ", "\n", ": "];
       
       list spacers = []; // Still dunno what to use spacers for. I know what they do but trying to think of a demo has me stumped.
       
       list LIST = llParseString2List(source, separators, spacers);
       
       list times = llList2ListStrided(LIST, 0, -1, 3);
       
       list names = llList2ListStrided((LIST = llDeleteSubList(LIST, 0, 0)), 0, -1, 3);
       
       list statements = llList2ListStrided((LIST = llDeleteSubList(LIST, 0, 0)), 0, -1, 3);
       
       string started = llList2String(times, 0);
       
       string ended = llList2String(times, -1);
       
       names = GetUniqueListEntries(names);
       
       llOwnerSay("A conversation between " + ((string)llGetListLength(names)) + " residents - " +
                  llDumpList2String(llList2List(names, 0, -2), ", ") +
                  " and " + llList2String(names, -1) +
                  ", between the times of " + started + " and " + ended +
                  " contained the following statements/questions -\n" +
                  llDumpList2String(statements, "\n"));
   }

}
// BTW. It works really well and the GetUniqueListEntries function is really fast. UPDATE: This statement is FALSE
// At least *coughs* it's great if nobody says anything that inspires a break in the string ;)</lsl>
So what does anyone (Strife? Ppaatt?) think? Getting there maybe? -- Fred Gandt (talk|contribs) 20:09, 30 April 2010 (UTC)

That looks really good (have to go to work, bbl). -- Strife (talk|contribs) 21:47, 30 April 2010 (UTC)
Thanks for that but...Under practical situations the GetUniqueListEntries fails badly and the rest of it kinda sucks too. I was working on adding to it but will start afresh I think. I am beginning to think that these functions are not exactly great. Although trying to make this example has forced me to get to grips with them I am actually glad now that I was put off by the current examples all those many months ago when I first looked. I really think you can achieve better results by alternative means (maybe not so quickly but with far more scope). I am continuing to battle on regardless. "Never give up! Never surrender!" -- Fred Gandt (talk|contribs) 01:51, 1 May 2010 (UTC)

<lsl>list GetUniqueListEntries(list src) // Mhmm.
{

   list output = [];
   integer index;
   do
   {
       string entry = llList2String(src, 0);
       if(entry != " ") // There is unfortunately a reason for this condition. I'll finish up after some sleep.
           output += [entry];
       while((index = llListFindList(src, [entry])) != -1)
           src = llDeleteSubList(src, index, index);
   }
   while(llGetListLength(src));
   return output;

}</lsl> -- Fred Gandt (talk|contribs) 03:56, 1 May 2010 (UTC)

Not familiar with GetUniqueListEntries. llParseString* functions are good but only for what they are intended for, splitting a string up into a list. They don't work so well when you try to use them for other tasks.

What is wrong with just doing this (and it isn't restricted to being used on just strings): <lsl> list GetUniqueListEntries(list src) {

   list output = [];
   integer index = (src != []);
   while(index--)
   {
       list item = llList2List(src, 0, 0);
       if(!~llListFindList(output, item))
           output += item;
       src = llDeleteSubList(src, 0, 0);
   }
   return output;

}

llDeleteSubList(GetUniqueListEntries(" " + src), 0, 0)

//OR
list GetUniqueListEntriesAndIgnore(list src, list ignore) {

   integer ignoring = ignore != [];
   integer counter = src != [];
   while(counter--)
   {
       list item = llList2List(src, 0, 0);
       if(!~llListFindList(ignore, item))
           ignore += item;
       src = llDeleteSubList(src, 0, 0);
   }
   if(ignoring)
       return llDeleteSubList(ignore, 0, ignoring - 1);
   return ignore;

}

GetUniqueListEntriesAndIgnore(src, [" "]) </lsl> -- Strife (talk|contribs) 05:05, 1 May 2010 (UTC)

Erm. Nothing. I mean, if speed and efficiency is what you want, your version is of course preferable. I wanted to make a slow cumbersome function that lagged and only worked on strings. I think I succeeded. It's funny how sometimes one cannot see the wood for the trees. I would have got it eventually. As for ParseString* I'll just carry on. -- Fred Gandt (talk|contribs) 13:00, 1 May 2010 (UTC)

My previous comment was posted before tests. I think I may slow down a bit and test things before commenting in future. Because -

I don't quite know, well I mean to say, I don't wanna rock the boat or anything. My version is consistently faster than yours by a massive margin. Around 2 to 5 times faster. I'll provide a full documentation once I finish the rest of the script. Oh, I have added the "any list entry type" feature. Although in my defense...this was supposed to be an example script for llParseString* rather than an attempt to make a unique list entries function. I wrote it to handle strings because the script was only ever going to feed string type list entries. Although wearing a crash helmet at ALL times would be advisable when riding a motorbike, when having a bath the advice does kinda suck. I write everything from scratch and to serve the specific purpose rather than considering some future event that has no bearing on the task at hand. Thankfully though, adding the "any list entry type" capability doesn't slow my function down. Awesome! -- Fred Gandt (talk|contribs) 14:19, 1 May 2010 (UTC)
Really? I must be losing my touch. ._. -- Strife (talk|contribs) 23:09, 1 May 2010 (UTC)
I imagine your reputation will remain intact Strife. I wouldn't worry too much. Maybe I just got lucky. Anyway...as promised a breakdown of tests and results (just in case you're interested). I have actually changed a few things since this morning and the results are more favorable for your code now (purely by chance).

<lsl>string source = "[9:52] Speaker One: Does anyone know what llParseString2List is for? [9:53] Speaker Two: Something to do with cutting strings up and making lists from the parts. [9:53] Speaker Three: Yeah, you can remove bits you don't want too. [9:54] Speaker Four: There are two lists. One is for places to split the string and remove the marker. [9:54] Speaker Four: The other is to split the string and keep the marker. [9:54] Speaker Three: Mhmm. Separators and Spacers in that order. [9:54] Speaker One: So what would I want to do that for? [9:55] Speaker Four: Imagine you had a bunch of info that came to you as a string. [9:55] Speaker Four: You could use llParseString2List to help sort the info out.";

list GetUniqueListEntries(list src) {

   integer index = 0;
   list output = [];
   list entry = [];
   do
   {
       output += (entry = llList2List(src, 0, 0));
       src = llDeleteSubList(src, 0, 0);
       while((index = llListFindList(src, entry)) != -1) /////////  Code 1
           src = llDeleteSubList(src, index, index);
   }
   while(llGetListLength(src));
   return output;

}

//list GetUniqueListEntries(list src)
//{
//    list output = [];
//    integer index = (src != []);
//    while(index--)
//    {
//        list item = llList2List(src, 0, 0); ///////////////////// Code 2
//        if(!~llListFindList(output, item))
//            output += item;
//        src = llDeleteSubList(src, 0, 0);
//    }
//    return output;
//}

default {

   state_entry()
   {
       llResetTime(); // Safety first.
       list LIST = llParseString2List(source, ["[", "]  ", "\n", ": "], []);
       list times = llList2ListStrided(LIST, 0, -1, 3);
       list names = GetUniqueListEntries(llList2ListStrided((LIST = llDeleteSubList(LIST, 0, 0)), 0, -1, 3));
       list words = GetUniqueListEntries(
                    llParseString2List(
                    llDumpList2String(
                    llList2ListStrided(
                    llDeleteSubList(LIST, 0, 0),
                    0, -1, 3),
                    " "),
                    [".", ",", "?", " "], []) // Changed this to cut out the spaces making a list 1/2 the length it was.
                    );
       llOwnerSay("A conversation between " +
                  ((string)llGetListLength(names)) +
                  " residents - " +
                  llDumpList2String(llList2List(names, 0, -2), ", ") +
                  " and " +
                  llList2String(names, -1) +
                  ", starting at " +
                  llList2String(times, 0) +
                  " and continuing until " +
                  llList2String(times, -1) +
                  " contained the following 10 words most prevalently -\n" + // I haven't added this bit yet. Might not.
                  llDumpList2String(words, ",") +
                  "\n\n" + ((string)llGetTime()));
   }

}

// When I first tested the two versions of GetUniqueListEntries, the list being fed them was twice the length.
// The feed list contained separator spaces as list entries.
// The average time of Code 2 was around 7 seconds (one run was nearly 12 seconds).
// It seems that a shorter list runs faster through Code 2 (duh). The time is halved as the list is halved.
// With Code 1 however, the time changed by only a small amount.

// Run 1 is compile.
// All subsequent runs are after manual (click button on script) reset.
// Conditions = Favorable (seriously, lag free region).

/////////////////////////////////

// Code 1 ///////////////////////

// 2.288881 run 1
// 3.120935 run 2
// 2.443657 run 3
// 1.209184 run 4
// 2.375422 run 5
// 2.310414 run 6
// 1.906114 run 7
// 1.053586 run 8
// 1.370456 run 9
// 1.479050 run 10

// Mean = 1.9557699 seconds

/////////////////////////////////

// Code 2 ///////////////////////

// 4.238862 run 1
// 3.883345 run 2
// 3.009184 run 3
// 3.968640 run 4
// 3.305787 run 5
// 2.356030 run 6
// 4.081953 run 7
// 3.142955 run 8
// 3.005459 run 9
// 3.343671 run 10

// Mean = 3.4335886 seconds

/////////////////////////////////

// Code 2 took 1.75*Code 1's time</lsl>As for the llParseString2List examples...I don't think this script is working out as being very useful for it. I just need to think of a premise to highlight (as you say) the things it does best, and cut out all the complex stuff (see above *rolls eyes*). So now it's time for me to drink tea and watch Dr. Who. -- Fred Gandt (talk|contribs) 01:39, 2 May 2010 (UTC)