Difference between revisions of "Separate Words"

From Second Life Wiki
(work around the llParseString2List limit of 8 spacers or separators)
 
m (<lsl> tag to <source>)
 
(17 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{LSL Header}}
{{LSL Header}}__NOTOC__
<div id="box">
{{#vardefine:p_src_desc|source string
}}{{#vardefine:p_separators_desc|separators to be discarded
}}{{#vardefine:p_spacers_desc|spacers to be kept}}
== Function: [[list]] separateWords([[string]] {{LSL Param|src}},[[list]] {{LSL Param|separators}},[[list]] {{LSL Param|spacers}}); ==
<div style="padding: 0.5em;">
Returns the words of the {{LSLP|src}} string by keeping the {{LSLP|spacers}}, discarding the {{LSLP|separators}}, and also getting the words in between.


== Separate Words ==
Parameters:
{|
{{LSL DefineRow|[[string]]|src|{{#var:p_src_desc}}}}
{{LSL DefineRow|[[list]]|separators|{{#var:p_separators_desc}}}}
{{LSL DefineRow|[[list]]|spacers|{{#var:p_spacers_desc}}}}
|}


LSL defines [[llParseString2List]] to divide a source string into words. That function sees the chars between separators or spacers, and each spacer, as a word (except that function never sees the empty string as a word).
This separateWords function works like [[llParseString2List]] but accepts many more spacers and separators. If you began by using llParseString2List and then your code grew to involve more than 8 spacers or separators, you might want to call separateWords, keepSpacers, and/or discardSeparators, in place of calling llParseString2List.


Trouble is, LSL doesn't let you choose more than 8 spacers or separators, if you call llParseString2List directly.
This separateWords function does not return the empty words that arguably exist between adjacent separators and spacers. The separateWords and llParseString2List functions do not return such words (''i.e.'', do not return "any null values generated"). The [[llParseStringKeepNulls]] function does return such words.


You can substitute a call to the separateWords function here in place of your call to llParseString2List whenever you have more than 8 separators or more than 8 spacers.
Preconditions to avoid confusion:
# Provide lists of strings as the separators and spacers, not mixed lists of strings and floats and such.
# Don't let any spacer contain or equal a separator, and don't let any separator contain or equal a spacer.
# Do list each spacer and separator only once.


The code follows:
Caveats:


<pre>
Same as llParseString2List, this function returns a list of strings, not a mixed list of strings and floats and such. Cast an entry of the list to the type you need, ''e.g.'', { list words = separateWords(...); integer value = (integer) llList2String(words, index); }. Remember that LSL cast to integer works better than llList2Integer, ''e.g.'', cast to integer understands hexadecimal integer literals such as "0x2A".
// Call llParseString2List for each of the sources.
</div></div>
// Return the results in order.
// cf. http://wiki.secondlife.com/wiki/Separate_Words


list applyLlParseString2List(list sources, list separators, list spacers)
<div id="box">
== Implementation ==
<div style="padding: 0.5em;">
<source lang="lsl2">
// http://wiki.secondlife.com/wiki/Separate_Words
 
// Keep the spacers, discard the separators, and get the words in between, within
// astonishing limits described at http://wiki.secondlife.com/wiki/llParseString2List
 
list keepSpacersDiscardSeparators(list sources, list separators, list spacers)
{
{
     list words = [];
     list words = [];
     integer index;
     integer index;
     integer lenSources = llGetListLength(sources);
     integer sourcing = llGetListLength(sources);
     for (index = 0; index < lenSources; ++index)
 
     for (index = 0; index < sourcing; ++index)
     {
     {
         string source = llList2String(sources, index);
         string source = llList2String(sources, index);
         words += llParseString2List(source, separators, spacers);
         words += llParseString2List(source, separators, spacers);
     }
     }
     return words;
     return words;
}
}


// Divide a source string into words.
// Keep the spacers and get the words in between.
// See the chars between separators or spacers, and each spacer, as a word.
// Never see the empty string as a word.
// cf. http://wiki.secondlife.com/wiki/Separate_Words


list separateWords(string chars, list separators, list spacers)
list keepSpacers(list sources, list spacers)
{
{
   
     list words = sources;
    // Begin with all chars in one word
   
     list words = [chars];
   
    // List the chars between spacers, and each spacer, as a word
   
     integer index;
     integer index;
     integer lenSpacers = llGetListLength(spacers);
     integer spacing = llGetListLength(spacers);
     for (index = 0; index < lenSpacers; index += 8)
     for (index = 0; index < spacing; index += 8)
     {
     {
         list some = llList2List(spacers, index, index + 8 - 1);
         list someSpacers = llList2List(spacers, index, index + 8 - 1);
         words = applyLlParseString2List(words, [], some);
         words = keepSpacersDiscardSeparators(words, [], someSpacers);
     }      
     }
      
     return words;
    // Discard the separators after letting the separators separate words
}
   
 
//  integer index;
// Discard the separators but get the words in between.
     integer lenSeparators = llGetListLength(separators);
 
     for (index = 0; index < lenSeparators; index += 8)
list discardSeparators(list sources, list separators)
{
    list words = sources;
    integer index;
     integer separating = llGetListLength(separators);
     for (index = 0; index < separating; index += 8)
     {
     {
         list some = llList2List(separators, index, index + 8 - 1);
         list someSeparators = llList2List(separators, index, index + 8 - 1);
         words = applyLlParseString2List(words, some, []);
         words = keepSpacersDiscardSeparators(words, someSeparators, []);
     }
     }
   
    // Succeed
       
     return words;
     return words;
}
}


// Demo
// Keep the spacers and discard the separators and get the words in between.
 
list separateWords(string src, list separators, list spacers)
{
    return discardSeparators(keepSpacers([src], spacers), separators);
}
</source>
</div></div>
 
<div id="box">
 
== Demo ==
<div style="padding: 0.5em;">
Asking to keep the spacers, discard the separators, and get the words between out of this '''{{LSL Param|src}}''':<br/>
<br/>
42 0.99 "00000000-0000-0000-0000-000000000000" [abc, def] "xyz\\"zyx ijk" <0, 1, 2, 3> // source literals OK<br/>
<br/>
'''says:'''<br/>
0: 42<br/>
1: 0.99<br/>
2: "<br/>
3: 00000000<br/>
4: -<br/>
5: 0000<br/>
6: -<br/>
7: 0000<br/>
8: -<br/>
9: 0000<br/>
10: -<br/>
11: 000000000000<br/>
12: "<br/>
13: [<br/>
14: abc<br/>
15: def<br/>
16: ]<br/>
17: "<br/>
18: xyz<br/>
19: \<br/>
20: \<br/>
21: "<br/>
22: zyx<br/>
23: ijk<br/>
24: "<br/>
25: <<br/>
26: 0<br/>
27: 1<br/>
28: 2<br/>
29: 3<br/>
30: ><br/>
31: /<br/>
32: /<br/>
33: source<br/>
34: literals<br/>
42 0.99 "00000000-0000-0000-0000-000000000000" [abc, def] "xyz\\"zyx ijk" <0, 1, 2, 3> // source literals
OK


string lf = "\n";
<source lang="lsl2">
string quote = "\"";
string escape = "\\";


list spacers = [quote, "(", ")", "<", ">", "[", "]", "/", "+", "-", "*", "%", escape];
// Demo keeping the spacers, discarding the separators, and getting the words in between.
 
string src()
{
    return "42 0.99 \"00000000-0000-0000-0000-000000000000\"" +
        " [abc, def] \"xyz\\\\\"zyx ijk\" <0, 1, 2, 3> // source literals";
}
 
string LF = "\n";
string DQUOTE = "\""; // double quote
string ESCAPE = "\\";
 
list spacers = [DQUOTE, "(", ")", "<", ">", "[", "]", "/", "+", "-", "*", "%", ESCAPE];


list separators()
list separators()
{
{
     string tab = llUnescapeURL("%09"); // != "\t"
     string TAB = llUnescapeURL("%09"); // != "\t"
     string cr = llUnescapeURL("%0D"); // != "\r"
     string CR = llUnescapeURL("%0D"); // != "\r"
     return [tab, lf, cr, " ", ",", ";"];
     return [TAB, LF, CR, " ", ",", ";"];
}
 
ownerSayStrings(list strings)
{
    integer stringing = llGetListLength(strings);
    integer index;
    for (index = 0; index < stringing; ++index)
    {
        llOwnerSay((string) index + ": " + llList2String(strings, index));
    }       
}
}


Line 85: Line 177:
     state_entry()
     state_entry()
     {
     {
       
         list words = separateWords(src(), separators(), spacers);
        string chars = "42 0.99 \"00000000-0000-0000-0000-000000000000\" [abc, def] \"xyz\\\\\"zyx\" <0, 1, 2, 3> // source literals";
         ownerSayStrings(words);
         list words = separateWords(chars, separators(), spacers);
         llOwnerSay(src());
 
        integer index;
         integer lenWords = llGetListLength(words);
         for (index = 0; index < lenWords; ++index)
        {
            llOwnerSay((string) index + ": " + llList2String(words, index));
        }
       
         llOwnerSay("OK");
         llOwnerSay("OK");
     }
     }
}
}
</pre>
</source>
</div></div>
 
<div id="box">
 
== See Also ==
<div style="padding: 0.5em">
'''Functions'''
* [[LlParseString2List]]
* [[llDumpList2String]]
* [[llCSV2List]]
* [[llList2CSV]]
* [[LlParseStringKeepNulls]]
 
'''Script Library'''
* [[ParseString2List]]
 
'''Implementation Differences'''
* [[LSLEditorBugs]]
</div></div>


[[Category:LSL Examples]]
{{LSLC|Examples|Separate Words}}

Latest revision as of 17:58, 24 January 2015

Function: list separateWords(string src,list separators,list spacers);

Returns the words of the src string by keeping the spacers, discarding the separators, and also getting the words in between.

Parameters:

• string src source string
• list separators separators to be discarded
• list spacers spacers to be kept

This separateWords function works like llParseString2List but accepts many more spacers and separators. If you began by using llParseString2List and then your code grew to involve more than 8 spacers or separators, you might want to call separateWords, keepSpacers, and/or discardSeparators, in place of calling llParseString2List.
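As a minimal sketch of calling one of the helpers on its own, the script below discards ten separators, more than a single llParseString2List call accepts. The two helper functions are copied from the Implementation section of this page so that the script stands alone; the punctuation list and the sample string are made up for illustration.

```lsl
// Sketch only: helpers copied from the Implementation section of this page.

list keepSpacersDiscardSeparators(list sources, list separators, list spacers)
{
    list words = [];
    integer index;
    integer sourcing = llGetListLength(sources);
    for (index = 0; index < sourcing; ++index)
    {
        string source = llList2String(sources, index);
        words += llParseString2List(source, separators, spacers);
    }
    return words;
}

list discardSeparators(list sources, list separators)
{
    list words = sources;
    integer index;
    integer separating = llGetListLength(separators);
    for (index = 0; index < separating; index += 8)
    {
        // Hand llParseString2List at most 8 separators per pass.
        list someSeparators = llList2List(separators, index, index + 8 - 1);
        words = keepSpacersDiscardSeparators(words, someSeparators, []);
    }
    return words;
}

default
{
    state_entry()
    {
        // Hypothetical input: ten separators and a made-up source string.
        list punctuation = [" ", ",", ";", ":", ".", "!", "?", "|", "_", "="];
        string src = "red,green;blue:cyan|magenta_yellow=black";
        list words = discardSeparators([src], punctuation);
        llOwnerSay((string) llGetListLength(words) + " words: " + llList2CSV(words));
    }
}
```

The first pass splits on the first eight separators and the second pass splits the remaining pieces on "_" and "=", so the script should report seven words.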

Like llParseString2List, this separateWords function does not return the empty words that arguably exist between adjacent separators and spacers (i.e., it does not return "any null values generated"). The llParseStringKeepNulls function does return such words.
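The difference can be seen with the two built-in functions alone. This is a small sketch, with a made-up source string that has two adjacent separators:

```lsl
default
{
    state_entry()
    {
        string src = "a,,b"; // made-up input with an empty word between the commas

        list dropped = llParseString2List(src, [","], []);  // ["a", "b"]
        list kept = llParseStringKeepNulls(src, [","], []); // ["a", "", "b"]

        llOwnerSay("llParseString2List: " + (string) llGetListLength(dropped) + " words");  // 2 words
        llOwnerSay("llParseStringKeepNulls: " + (string) llGetListLength(kept) + " words"); // 3 words
    }
}
```

separateWords is built on llParseString2List, so it follows the first behavior.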

Preconditions to avoid confusion:

  1. Provide lists of strings as the separators and spacers, not mixed lists of strings and floats and such.
  2. Don't let any spacer contain or equal a separator, and don't let any separator contain or equal a spacer.
  3. Do list each spacer and separator only once.

Caveats:

Like llParseString2List, this function returns a list of strings, not a mixed list of strings and floats and such. Cast an entry of the list to the type you need, e.g., { list words = separateWords(...); integer value = (integer) llList2String(words, index); }. Remember that casting a string to integer handles more input forms than llList2Integer, e.g., the cast understands hexadecimal integer literals such as "0x2A".
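A short sketch of the cast described above follows. It uses llParseString2List as a stand-in for separateWords so that the script stands alone, and the source string is made up. The script prints both conversions side by side so you can compare them in your own viewer.

```lsl
default
{
    state_entry()
    {
        // Stand-in for separateWords(...): every entry of the result is a string.
        list words = llParseString2List("0x2A 42", [" "], []);

        integer index;
        integer wording = llGetListLength(words);
        for (index = 0; index < wording; ++index)
        {
            string word = llList2String(words, index);
            integer viaCast = (integer) word;               // fetch as string, then cast
            integer viaList = llList2Integer(words, index); // convert directly from the list
            llOwnerSay(word + ": cast gives " + (string) viaCast
                + ", llList2Integer gives " + (string) viaList);
        }
    }
}
```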

Implementation

// http://wiki.secondlife.com/wiki/Separate_Words

// Keep the spacers, discard the separators, and get the words in between, within
// astonishing limits described at http://wiki.secondlife.com/wiki/llParseString2List

list keepSpacersDiscardSeparators(list sources, list separators, list spacers)
{
    list words = [];

    integer index;
    integer sourcing = llGetListLength(sources);

    for (index = 0; index < sourcing; ++index)
    {
        string source = llList2String(sources, index);
        words += llParseString2List(source, separators, spacers);

    }
    return words;
}

// Keep the spacers and get the words in between.

list keepSpacers(list sources, list spacers)
{
    list words = sources;
    integer index;
    integer spacing = llGetListLength(spacers);
    for (index = 0; index < spacing; index += 8)
    {
        list someSpacers = llList2List(spacers, index, index + 8 - 1);
        words = keepSpacersDiscardSeparators(words, [], someSpacers);
    }
    return words;
}

// Discard the separators but get the words in between.

list discardSeparators(list sources, list separators)
{
    list words = sources;
    integer index;
    integer separating = llGetListLength(separators);
    for (index = 0; index < separating; index += 8)
    {
        list someSeparators = llList2List(separators, index, index + 8 - 1);
        words = keepSpacersDiscardSeparators(words, someSeparators, []);
    }
    return words;
}

// Keep the spacers and discard the separators and get the words in between.

list separateWords(string src, list separators, list spacers)
{
    return discardSeparators(keepSpacers([src], spacers), separators);
}
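The loops in keepSpacers and discardSeparators step through their lists 8 entries at a time. As a rough illustration of that slicing, this sketch walks the 13 spacers used in the Demo and reports how many each pass would hand to llParseString2List:

```lsl
default
{
    state_entry()
    {
        // The same 13 spacers as the Demo on this page.
        list spacers = ["\"", "(", ")", "<", ">", "[", "]", "/", "+", "-", "*", "%", "\\"];

        integer index;
        integer spacing = llGetListLength(spacers);
        for (index = 0; index < spacing; index += 8)
        {
            // llList2List stops at the end of the list, so the last slice may be short.
            list someSpacers = llList2List(spacers, index, index + 8 - 1);
            llOwnerSay("pass starting at " + (string) index + ": "
                + (string) llGetListLength(someSpacers) + " spacers");
        }
        // Says: pass starting at 0: 8 spacers, then pass starting at 8: 5 spacers.
    }
}
```

Each pass re-parses every word produced by the pass before it, which is why the preconditions ask that no spacer or separator contain another.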

Demo

Asking to keep the spacers, discard the separators, and get the words between out of this src:

42 0.99 "00000000-0000-0000-0000-000000000000" [abc, def] "xyz\\"zyx ijk" <0, 1, 2, 3> // source literals

says:
0: 42
1: 0.99
2: "
3: 00000000
4: -
5: 0000
6: -
7: 0000
8: -
9: 0000
10: -
11: 000000000000
12: "
13: [
14: abc
15: def
16: ]
17: "
18: xyz
19: \
20: \
21: "
22: zyx
23: ijk
24: "
25: <
26: 0
27: 1
28: 2
29: 3
30: >
31: /
32: /
33: source
34: literals
42 0.99 "00000000-0000-0000-0000-000000000000" [abc, def] "xyz\\"zyx ijk" <0, 1, 2, 3> // source literals
OK

// Demo keeping the spacers, discarding the separators, and getting the words in between.

string src()
{
    return "42 0.99 \"00000000-0000-0000-0000-000000000000\"" +
        " [abc, def] \"xyz\\\\\"zyx ijk\" <0, 1, 2, 3> // source literals";
}

string LF = "\n";
string DQUOTE = "\""; // double quote
string ESCAPE = "\\";

list spacers = [DQUOTE, "(", ")", "<", ">", "[", "]", "/", "+", "-", "*", "%", ESCAPE];

list separators()
{
    string TAB = llUnescapeURL("%09"); // != "\t"
    string CR = llUnescapeURL("%0D"); // != "\r"
    return [TAB, LF, CR, " ", ",", ";"];
}

ownerSayStrings(list strings)
{
    integer stringing = llGetListLength(strings);
    integer index;
    for (index = 0; index < stringing; ++index)
    {
        llOwnerSay((string) index + ": " + llList2String(strings, index));
    }        
}

default
{
    state_entry()
    {
        list words = separateWords(src(), separators(), spacers);
        ownerSayStrings(words);
        llOwnerSay(src());
        llOwnerSay("OK");
    }
}