Difference between revisions of "SplitText"

Revision as of 17:08, 23 February 2009


SplitText(text,separator)

Returns a list of substrings from «text» by splitting «text» each time «separator» occurs.

If «separator» is the empty string, "", the text is split into individual characters.

Declaration

SplitText(text,separator:text atom)

Library

Text functions

Examples

SplitText('Al#Bob#Carl','#') → ['Al','Bob','Carl']
SplitText('Al,Bob,Carl',',') → ['Al','Bob','Carl']
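
For readers who want to experiment with this behaviour outside Analytica, the following is a rough Python model of the basic split, including the empty-separator case described above. The name split_text is hypothetical and the code is only an illustration of the documented results, not Analytica's implementation.

```python
def split_text(text, separator):
    """Rough Python model of the documented SplitText behaviour."""
    if separator == "":
        # Documented special case: an empty separator yields single characters.
        return list(text)
    # Otherwise split at every occurrence of the separator.
    return text.split(separator)


print(split_text("Al#Bob#Carl", "#"))  # ['Al', 'Bob', 'Carl']
print(split_text("abc", ""))           # ['a', 'b', 'c']
```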

Case Sensitivity

When «separator» contains letters, matching is case-sensitive by default.

In Analytica 4.2 or later, you can specify the optional parameter caseInsensitive:true to match case-insensitively:

SplitText('abAcadAe','a') → ['','bAc','dAe']
SplitText('abAcadAe','a',caseInsensitive:true) → ['','b','c','d','e']
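
The two results above can be reproduced with a small Python model, shown here only as a sketch. The function name split_text_ci and its case_insensitive argument are hypothetical stand-ins for the Analytica parameter, and the separator is assumed to be non-empty.

```python
import re


def split_text_ci(text, separator, case_insensitive=False):
    """Split on a literal separator, optionally ignoring letter case."""
    # re.escape makes the separator match literally rather than as a pattern.
    flags = re.IGNORECASE if case_insensitive else 0
    return re.split(re.escape(separator), text, flags=flags)


print(split_text_ci("abAcadAe", "a"))
# ['', 'bAc', 'dAe']
print(split_text_ci("abAcadAe", "a", case_insensitive=True))
# ['', 'b', 'c', 'd', 'e']
```

In both cases the leading empty text appears because the input starts with the separator.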

Regular Expressions

New to Analytica 4.2

The «separator» parameter is interpreted as a regular expression when the optional re:true parameter is specified.

The following example splits on any word containing the letter h:

SplitText('Now is the time for all good men to come to the aid of their country', '\s*\w*h\w*\s*', re:1)
  → ['Now is', 'time for all good men to come to', 'aid of', 'country']

The above pattern matches zero or more whitespace characters '\s*', followed by zero or more word characters '\w*' (letters, digits or underscore), followed by the letter h, followed by zero or more word characters, followed by zero or more whitespace characters.

Using a subpattern, you can require the «separator» pattern to occur within a larger context, without including the larger context within the split point. For example, the following splits at decimal points, but only when they have a numeric digit on each side:

SplitText('17.5 19 1. .5 0.1111', '\d(\.)\d', re:1, subpattern:1 ) → ['17','5 19 1. .5 0', '1111']
SplitText('17.5 19 1. .5 0.1111', '(\d(?<a>\.)\d)|(?<a>\s+)', re:1, subpattern:'a' ) → ['17','5','19','1.','.5','0','1111']
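
The idea of splitting only at a subpattern can be modelled in Python as below. This is a sketch under stated assumptions: split_on_group is a hypothetical helper, it uses Python's re module rather than Analytica's regular expression engine, and it covers the numbered-subpattern case only (Python does not accept the repeated group name used in the second example above).

```python
import re


def split_on_group(text, pattern, group=0):
    """Split text at the span of one capture group within each match.

    group=0 splits at the whole match; group=1 splits only at the first
    parenthesised subpattern, leaving the surrounding context in place.
    """
    pieces, pos = [], 0
    for match in re.finditer(pattern, text):
        start, end = match.span(group)
        if start < 0:
            continue  # the group did not take part in this match
        pieces.append(text[pos:start])
        pos = end
    pieces.append(text[pos:])
    return pieces


print(split_on_group("17.5 19 1. .5 0.1111", r"\d(\.)\d", group=1))
# ['17', '5 19 1. .5 0', '1111']
```

Only the decimal points with a digit on each side are removed; the digits themselves stay in the neighbouring pieces because they lie outside the chosen group.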

Splitting Arrays

In general, «text» and «separator» cannot be arrays: the result of each split is an unindexed list, so array-valued parameters would produce more than one unindexed list. If «text» or «separator» might be array-valued, you therefore need to place the result along another index.

In Analytica 4.2 or later, you can do this by passing the optional «resultIndex» parameter. If the result index has more elements than there are split items, the remaining entries along it are set to «null». If it has fewer elements, the last entry along the result index contains the unsplit remainder of «text».

Index I := 1..3
Index J := 1..7
SplitText('One two three four five',' ',resultIndex:I) →
I → 1 2 3
'One' 'two' 'three four five'
SplitText('One two three four five',' ',resultIndex:J) →
J → 1 2 3 4 5 6 7
'One' 'two' 'three' 'four' 'five' «null» «null»
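
The padding and remainder rules for a fixed-length result index can be modelled in Python as follows. This is a sketch only: split_to_index is a hypothetical name, the length argument stands in for the size of the result index, and None stands in for «null».

```python
def split_to_index(text, separator, length):
    """Model of splitting onto a result index with `length` elements."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # At most length - 1 splits, so the last slot keeps the unsplit remainder.
    parts = text.split(separator, length - 1)
    # Pad with None when there are fewer items than index elements.
    return parts + [None] * (length - len(parts))


print(split_to_index("One two three four five", " ", 3))
# ['One', 'two', 'three four five']
print(split_to_index("One two three four five", " ", 7))
# ['One', 'two', 'three', 'four', 'five', None, None]
```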

In Analytica 4.1 or earlier, the optional «resultIndex» is not available. If you know that every text value in A splits into the same number of items, and index I has that same number of elements, you could use, for example:

Var s[] := A do Array(I,SplitText(s,' '))

When the number of split items is not guaranteed to match the length of I, the Array function issues a warning, which can be suppressed with IgnoreWarnings:

Var s[] := A do IgnoreWarnings(Array(I,SplitText(s,' ')))

When I has more elements than the result of SplitText, the final elements in the result are padded with «null». When I has fewer elements, then only the first Size(I) split items are retained, e.g.:

Index I := 1..3
Array(I,SplitText('One two three four five',' ')) →
I → 1 2 3
'One' 'two' 'three'
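
For comparison, the truncating behaviour of this Array-based workaround can be modelled in Python as below. Again this is only a sketch: split_truncated is a hypothetical name, length stands in for Size(I), and None stands in for «null».

```python
def split_truncated(text, separator, length):
    """Model of Array(I, SplitText(...)): keep the first `length` items."""
    # Split fully, then drop any items beyond the length of the index.
    parts = text.split(separator)[:length]
    # Pad with None when the index is longer than the list of items.
    return parts + [None] * (length - len(parts))


print(split_truncated("One two three four five", " ", 3))
# ['One', 'two', 'three']
```

Unlike the «resultIndex» form, the items beyond the third are discarded rather than kept together in the last element.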


See Also
