Difference between revisions of "Separate Words"

From Second Life Wiki
m
(factor out keepSpacers and discardSeparators, add caveats and preconditions, strike the dangling See Also link)
Line 4: Line 4:
}}{{#vardefine:p_separators_desc|separators to be discarded
}}{{#vardefine:p_separators_desc|separators to be discarded
}}{{#vardefine:p_spacers_desc|spacers to be kept}}
}}{{#vardefine:p_spacers_desc|spacers to be kept}}
== Function: [[list]] SeparateWords([[string]] {{LSL Param|src}},[[string]] {{LSL Param|separators}},[[string]] {{LSL Param|spacers}}); ==
== Function: [[list]] separateWords([[string]] {{LSL Param|src}},[[list]] {{LSL Param|separators}},[[list]] {{LSL Param|spacers}}); ==
<div style="padding: 0.5em;">
<div style="padding: 0.5em;">
Returns a list that is '''{{LSL Param|src}}''' broken into a list, discarding '''{{LSL Param|separators}}''', keeping '''{{LSL Param|spacers}}''', and discarding any null values generated. Same as the LSL function [[llParseString2List]], but not limited to 8 spacers or separators.
Returns the words of the '''{{LSL Param|src}}''' string by keeping the '''{{LSL Param|spacers}}''', discarding the '''{{LSL Param|separators}}''', and also getting the words in between.


Thus you may substitute a call to the [[llParseString2List]] function by a call to SeparateWords whenever you have more than 8 separators or more than 8 spacers.
Parameters:
{|
{|
{{LSL DefineRow|[[string]]|src|{{#var:p_src_desc}}}}
{{LSL DefineRow|[[string]]|src|{{#var:p_src_desc}}}}
Line 15: Line 15:
|}
|}


'''{{LSL Param|separators}}''' and '''{{LSL Param|spacers}}''' must be lists of strings.
This separateWords function works like [[llParseString2List]] but accepts many more spacers and separators. If you began by using llParseString2List and then your code grew to involve more than 8 spacers or separators, you might want to call separateWords, keepSpacers, and/or discardSeparators, in place of calling llParseString2List.


To avoid contradiction, every string of the '''{{LSL Param|spacers}}''' list of strings to keep must not exist in the '''{{LSL Param|separators}}''' list of strings to discard.
This separateWords function does not return the empty words that arguably exist between adjacent separators and spacers. The separateWords and llParseString2List functions do not return such words (''i.e.'', do not return "any null values generated"). The [[llParseStringKeepNulls]] function does return such words.
 
Preconditions to avoid confusion:
# Provide lists of strings as the separators and spacers, not mixed lists of strings and floats and such.
# Don't let any spacer contain or equal a separator, and don't let any separator contain or equal a spacer.
# Do list each spacer and separator only once.
 
Caveats:
 
Same as llParseString2List, this function returns a list of strings, not a mixed list of strings and floats and such. Cast an entry of the list to the type you need, ''e.g.'', { list words = separateWords(...); integer value = (integer) llList2String(words, index); }. Remember that LSL cast to integer works better than llList2Integer, ''e.g.'', cast to integer understands hexadecimal integer literals such as 0x2A.
</div></div>
</div></div>


<div id="box">
<div id="box">
== See Also ==
== Implementation ==
<div style="padding: 0.5em;">
<div style="padding: 0.5em;">
See also the parseString2List function among the useful snippets of the [[llParseString2List]] article.
<pre>
</div>
// http://wiki.secondlife.com/wiki/Separate_Words
</div>


<div id="box">
// Keep the spacers, discard the separators, and get the words in between, within
== Specification ==
// astonishing limits described at http://wiki.secondlife.com/wiki/llParseString2List
<div style="padding: 0.5em;">
 
<pre>
list keepSpacersDiscardSeparators(list sources, list separators, list spacers)
list applyLlParseString2List(list sources, list separators, list spacers)
{
{
     list words = [];
     list words = [];
     integer lenSources = llGetListLength(sources);
 
    integer i = 0;
    integer index;
     for (; i < lenSources; ++i)
     integer sourcing = llGetListLength(sources);
 
     for (index = 0; index < sourcing; ++index)
     {
     {
         string source = llList2String(sources, i);
         string source = llList2String(sources, index);
         words += llParseString2List(source, separators, spacers);
         words += llParseString2List(source, separators, spacers);
     }
     }
     return words;
     return words;
}
}


// Divide a source string into words
// Keep the spacers and get the words in between.
// See the chars between separators or spacers, and each spacer, as a word
// Never see the empty string as a word


list SeparateWords(string src, list separators, list spacers)
list keepSpacers(list sources, list spacers)
{
{
   
     list words = sources;
    // Begin with all chars in one word
     integer index;
   
     integer spacing = llGetListLength(spacers);
     list words = (list)src;
     for (index = 0; index < spacing; index += 8)
      
    // List the chars between spacers, and each spacer, as a word
   
     integer lenSpacers = llGetListLength(spacers);
     integer i = 0;
    for (; i < lenSpacers; i += 8)
     {
     {
         list some = llList2List(spacers, i, i + 7);
         list someSpacers = llList2List(spacers, index, index + 8 - 1);
         words = applyLlParseString2List(words, [], some);
         words = keepSpacersDiscardSeparators(words, [], someSpacers);
     }      
     }
      
     return words;
    // Discard the separators after letting the separators separate words
}
   
 
     integer lenSeparators = llGetListLength(separators);
// Discard the separators but get the words in between.
     for (i = 0; i < lenSeparators; i += 8)
 
list discardSeparators(list sources, list separators)
{
    list words = sources;
    integer index;
     integer separating = llGetListLength(separators);
     for (index = 0; index < separating; index += 8)
     {
     {
         list some = llList2List(separators, i, i + 7);
         list someSeparators = llList2List(separators, index, index + 8 - 1);
         words = applyLlParseString2List(words, some, []);
         words = keepSpacersDiscardSeparators(words, someSeparators, []);
     }
     }
   
    // Succeed
       
     return words;
     return words;
}
// Keep the spacers and discard the separators and get the words in between.
list separateWords(string src, list separators, list spacers)
{
    return discardSeparators(keepSpacers([src], spacers), separators);
}
}
</pre>
</pre>
Line 83: Line 95:
<div id="box">
<div id="box">


== Example ==
== Demo ==
<div style="padding: 0.5em;">
<div style="padding: 0.5em;">
Example to separate this '''{{LSL Param|src}}''': <br/>
Asking to keep the spacers, discard the separators, and get the words between out of this '''{{LSL Param|src}}''':<br/>
''42 0.99 \"00000000-0000-0000-0000-000000000000\" [abc, def] \"xyz\\\\\"zyx\" <0, 1, 2, 3> // source literals''<br/>
<br/>
''42 0.99 \"00000000-0000-0000-0000-000000000000\" [abc, def] \"xyz\\\\\"zyx ijk\" <0, 1, 2, 3> // source literals''<br/>
<br/>
<br/>
'''Says:'''<br/>
'''says:'''<br/>
0: 42<br/>
0: 42<br/>
1: 0.99<br/>
1: 0.99<br/>
Line 112: Line 125:
21: "<br/>
21: "<br/>
22: zyx<br/>
22: zyx<br/>
23: "<br/>
23: ijk<br/>
24: <<br/>
24: "<br/>
25: 0<br/>
25: <<br/>
26: 1<br/>
26: 0<br/>
27: 2<br/>
27: 1<br/>
28: 3<br/>
28: 2<br/>
29: ><br/>
29: 3<br/>
30: /<br/>
30: ><br/>
31: /<br/>
31: /<br/>
32: source<br/>
32: /<br/>
33: literals<br/>
33: source<br/>
34: literals<br/>
OK
OK


<pre>
<pre>
string lf = "\n";
// Demo keeping the spacers, discarding the separators, and getting the words in between.
string quote = "\"";
 
string escape = "\\";
string LF = "\n";
string DQUOTE = "\""; // double quote
string ESCAPE = "\\";


list spacers = [quote, "(", ")", "<", ">", "[", "]", "/", "+", "-", "*", "%", escape];
list spacers = [DQUOTE, "(", ")", "<", ">", "[", "]", "/", "+", "-", "*", "%", ESCAPE];


list separators()
list separators()
{
{
     string tab = llUnescapeURL("%09"); // != "\t"
     string TAB = llUnescapeURL("%09"); // != "\t"
     string cr = llUnescapeURL("%0D"); // != "\r"
     string CR = llUnescapeURL("%0D"); // != "\r"
     return [tab, lf, cr, " ", ",", ";"];
     return [TAB, LF, CR, " ", ",", ";"];
}
 
ownerSayStrings(list strings)
{
    integer stringing = llGetListLength(strings);
    integer index;
    for (index = 0; index < stringing; ++index)
    {
        llOwnerSay((string) index + ": " + llList2String(strings, index));
    }       
}
}


Line 142: Line 168:
{
{
     state_entry()
     state_entry()
     {
     {      
       
         string chars = "42 0.99 \"00000000-0000-0000-0000-000000000000\"  
         string chars = "42 0.99 \"00000000-0000-0000-0000-000000000000\"  
        [abc, def] \"xyz\\\\\"zyx\" <0, 1, 2, 3> // source literals";
            [abc, def] \"xyz\\\\\"zyx ijk\" <0, 1, 2, 3> // source literals";
         list words = SeparateWords(chars, separators(), spacers);
         list words = separateWords(chars, separators(), spacers);
 
         ownerSayStrings(words);
        integer lenWords = llGetListLength(words);
        integer i = 0;
        for (; i < lenWords; ++i)
         {
            llOwnerSay((string) i + ": " + llList2String(words, i));
        }
       
         llOwnerSay("OK");
         llOwnerSay("OK");
     }
     }
Line 166: Line 184:
<div style="padding: 0.5em">
<div style="padding: 0.5em">
'''Functions'''
'''Functions'''
* [[LlParseString2List|llParseString2List]]
* [[LlParseString2List]]
* [[llDumpList2String]]
* [[llCSV2List]]
* [[llList2CSV]]
* [[LlParseStringKeepNulls]]
 
'''Implementation Differences'''
'''Implementation Differences'''
* [[LSLEditorBugs]]
* [[LSLEditorBugs]]

Revision as of 17:58, 27 September 2007

Function: list separateWords(string src,list separators,list spacers);

Returns a list of the words in the src string: each spacer is kept as a word of its own, each separator is discarded, and the text in between is returned as words.

Parameters:

• string src source string
• list separators separators to be discarded
• list spacers spacers to be kept

This separateWords function works like llParseString2List but is not limited to 8 spacers and 8 separators, because it passes them to llParseString2List in batches of at most 8. If you began by using llParseString2List and then your code grew to involve more than 8 spacers or separators, you might want to call separateWords, keepSpacers, and/or discardSeparators in place of calling llParseString2List.

Like llParseString2List, this separateWords function does not return the empty words that arguably exist between adjacent separators and spacers (i.e., it does not return "any null values generated"). The llParseStringKeepNulls function does return such words.
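
The difference is easy to see with the two built-in functions alone. The following is a minimal sketch, not part of the original article; the source string "a,,b" is an arbitrary example chosen so that one empty word sits between two adjacent separators.

```lsl
// Sketch: compare how the two built-in parsers treat the empty word
// between two adjacent separators.
default
{
    state_entry()
    {
        string src = "a,,b"; // arbitrary example input

        // Drops the empty word between the commas: ["a", "b"]
        list compact = llParseString2List(src, [","], []);

        // Keeps the empty word as an empty string entry: ["a", "", "b"]
        list sparse = llParseStringKeepNulls(src, [","], []);

        llOwnerSay((string) llGetListLength(compact)); // says 2
        llOwnerSay((string) llGetListLength(sparse));  // says 3
    }
}
```

Because separateWords is built on llParseString2List, it behaves like the first call.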

Preconditions to avoid confusion:

  1. Provide lists of strings as the separators and spacers, not mixed lists of strings and floats and such.
  2. Don't let any spacer contain or equal a separator, and don't let any separator contain or equal a spacer.
  3. List each spacer and separator only once.
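
These preconditions can be checked before parsing. The helpers below are a rough sketch and are not part of the original script: the names allStrings, hasDuplicates, overlaps, and preconditionsHold are hypothetical, and the checks assume that "contain" means plain substring containment.

```lsl
// Hypothetical helpers, named here for illustration only.

// TRUE if every entry of the list is a string (precondition 1).
integer allStrings(list entries)
{
    integer count = llGetListLength(entries);
    integer index;
    for (index = 0; index < count; ++index)
    {
        if (llGetListEntryType(entries, index) != TYPE_STRING)
        {
            return FALSE;
        }
    }
    return TRUE;
}

// TRUE if any entry appears more than once (precondition 3).
integer hasDuplicates(list entries)
{
    integer count = llGetListLength(entries);
    integer index;
    for (index = 1; index < count; ++index)
    {
        list earlier = llList2List(entries, 0, index - 1);
        list current = llList2List(entries, index, index);
        if (llListFindList(earlier, current) != -1)
        {
            return TRUE;
        }
    }
    return FALSE;
}

// TRUE if any entry of these contains or equals an entry of those.
integer overlaps(list these, list those)
{
    integer i;
    integer j;
    for (i = 0; i < llGetListLength(these); ++i)
    {
        for (j = 0; j < llGetListLength(those); ++j)
        {
            if (llSubStringIndex(llList2String(these, i), llList2String(those, j)) != -1)
            {
                return TRUE;
            }
        }
    }
    return FALSE;
}

// TRUE if all three preconditions hold for the given lists.
integer preconditionsHold(list separators, list spacers)
{
    if (!allStrings(separators) || !allStrings(spacers)) return FALSE;
    if (hasDuplicates(separators) || hasDuplicates(spacers)) return FALSE;
    // Precondition 2, checked in both directions.
    if (overlaps(spacers, separators) || overlaps(separators, spacers)) return FALSE;
    return TRUE;
}
```

A script might call preconditionsHold(separators, spacers) once at startup and report a problem with llOwnerSay rather than checking on every parse.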

Caveats:

Like llParseString2List, this function returns a list of strings, not a mixed list of strings and floats and such. Cast an entry of the list to the type you need, e.g., { list words = separateWords(...); integer value = (integer) llList2String(words, index); }. Remember that an LSL cast to integer works better than llList2Integer here, e.g., the cast to integer understands hexadecimal integer literals such as 0x2A.
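
A minimal sketch of that cast follows. It is not part of the original article; the input string "0x2A 7" is an arbitrary example, and llParseString2List stands in for separateWords so that the snippet is self-contained.

```lsl
// Sketch: every parsed entry is a string, so cast it to the type you need.
default
{
    state_entry()
    {
        // Arbitrary example input: a hexadecimal and a decimal literal.
        list words = llParseString2List("0x2A 7", [" "], []);

        // Fetch each entry as a string, then cast the string to integer.
        integer first = (integer) llList2String(words, 0);
        integer second = (integer) llList2String(words, 1);

        llOwnerSay((string) (first + second));
    }
}
```

The same pattern applies to other types, e.g., (float) llList2String(words, index).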

Implementation

// http://wiki.secondlife.com/wiki/Separate_Words

// Keep the spacers, discard the separators, and get the words in between, within
// astonishing limits described at http://wiki.secondlife.com/wiki/llParseString2List

list keepSpacersDiscardSeparators(list sources, list separators, list spacers)
{
    list words = [];

    integer index;
    integer sourcing = llGetListLength(sources);

    for (index = 0; index < sourcing; ++index)
    {
        string source = llList2String(sources, index);
        words += llParseString2List(source, separators, spacers);

    }
    return words;
}

// Keep the spacers and get the words in between.

list keepSpacers(list sources, list spacers)
{
    list words = sources;
    integer index;
    integer spacing = llGetListLength(spacers);
    for (index = 0; index < spacing; index += 8)
    {
        list someSpacers = llList2List(spacers, index, index + 8 - 1);
        words = keepSpacersDiscardSeparators(words, [], someSpacers);
    }
    return words;
}

// Discard the separators but get the words in between.

list discardSeparators(list sources, list separators)
{
    list words = sources;
    integer index;
    integer separating = llGetListLength(separators);
    for (index = 0; index < separating; index += 8)
    {
        list someSeparators = llList2List(separators, index, index + 8 - 1);
        words = keepSpacersDiscardSeparators(words, someSeparators, []);
    }
    return words;
}

// Keep the spacers and discard the separators and get the words in between.

list separateWords(string src, list separators, list spacers)
{
    return discardSeparators(keepSpacers([src], spacers), separators);
}

Demo

Asking separateWords to keep the spacers, discard the separators, and return the words in between for this src:

42 0.99 \"00000000-0000-0000-0000-000000000000\" [abc, def] \"xyz\\\\\"zyx ijk\" <0, 1, 2, 3> // source literals

says:
0: 42
1: 0.99
2: "
3: 00000000
4: -
5: 0000
6: -
7: 0000
8: -
9: 0000
10: -
11: 000000000000
12: "
13: [
14: abc
15: def
16: ]
17: "
18: xyz
19: \
20: \
21: "
22: zyx
23: ijk
24: "
25: <
26: 0
27: 1
28: 2
29: 3
30: >
31: /
32: /
33: source
34: literals
OK

// Demo keeping the spacers, discarding the separators, and getting the words in between.

string LF = "\n";
string DQUOTE = "\""; // double quote
string ESCAPE = "\\";

list spacers = [DQUOTE, "(", ")", "<", ">", "[", "]", "/", "+", "-", "*", "%", ESCAPE];

list separators()
{
    string TAB = llUnescapeURL("%09"); // a real tab, != "\t", which LSL turns into spaces
    string CR = llUnescapeURL("%0D"); // a carriage return, != "\r", which is not an LSL escape
    return [TAB, LF, CR, " ", ",", ";"];
}

ownerSayStrings(list strings)
{
    integer stringing = llGetListLength(strings);
    integer index;
    for (index = 0; index < stringing; ++index)
    {
        llOwnerSay((string) index + ": " + llList2String(strings, index));
    }        
}

default
{
    state_entry()
    {        
        string chars = "42 0.99 \"00000000-0000-0000-0000-000000000000\" 
            [abc, def] \"xyz\\\\\"zyx ijk\" <0, 1, 2, 3> // source literals";
        list words = separateWords(chars, separators(), spacers);
        ownerSayStrings(words);
        llOwnerSay("OK");
    }
}