SplitText
Latest revision as of 21:24, 4 August 2016


SplitText(text, separator, resultIndex, re)

Splits «text» into a list of substrings at each occurrence of «separator».

SplitText('Bob, Mary, Alice', ', ') → ['Bob', 'Mary', 'Alice']

If «separator» is the empty text, "", it splits «text» into an array of individual characters.

SplitText('AbcdE', '') → ['A', 'b', 'c', 'd', 'E']
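For readers more familiar with general-purpose languages, the two behaviors above can be sketched in Python. This is only an analogue under stated assumptions: `split_text` is a hypothetical helper name, not part of Analytica, and it handles a single text value rather than arrays.

```python
def split_text(text, separator):
    """Hypothetical Python analogue of SplitText for a single text value."""
    if separator == "":
        # Python's str.split rejects an empty separator,
        # so split into individual characters explicitly.
        return list(text)
    return text.split(separator)


print(split_text("Bob, Mary, Alice", ", "))  # ['Bob', 'Mary', 'Alice']
print(split_text("AbcdE", ""))               # ['A', 'b', 'c', 'd', 'E']
```

Note that the separator is matched exactly, so splitting 'Bob, Mary, Alice' on ',' alone would leave a leading space on the second and third items.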

If «text» or «separator» is an array, you need to specify «resultIndex». See "Splitting Arrays using «resultIndex»" below for details.

Declaration

SplitText(text, separator: text atom)

Library

Text functions

Examples

SplitText('Al#Bob#Carl', '#') → ['Al', 'Bob', 'Carl']
SplitText('Al, Bob, Carl', ', ') → ['Al', 'Bob', 'Carl']

Case Sensitivity

When «separator» contains letters, the comparison is case-sensitive by default. To match without regard to case, specify the optional parameter caseInsensitive: true.

SplitText('abAcadAe', 'a') → ['', 'bAc', 'dAe']
SplitText('abAcadAe', 'a', caseInsensitive: true) → ['', 'b', 'c', 'd', 'e']
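The difference between the two modes can be illustrated with a small Python sketch. It is an analogue only: `split_text_ci` and its `case_insensitive` argument are hypothetical names, and the separator is assumed to be non-empty.

```python
import re


def split_text_ci(text, separator, case_insensitive=False):
    """Hypothetical analogue of SplitText with an optional case-insensitive match."""
    flags = re.IGNORECASE if case_insensitive else 0
    # re.escape ensures the separator is matched literally, not as a pattern.
    return re.split(re.escape(separator), text, flags=flags)


print(split_text_ci("abAcadAe", "a"))
# ['', 'bAc', 'dAe']
print(split_text_ci("abAcadAe", "a", case_insensitive=True))
# ['', 'b', 'c', 'd', 'e']
```

In both cases the result starts with an empty text, because the input begins with the separator.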

Regular Expressions

The «separator» parameter is interpreted as a regular expression when the optional re: true parameter is specified.

The following example splits on any word containing the letter h:

SplitText('Now is the time for all good men to come to the aid of their country', '\s*\w*h\w*\s*', re: 1)
→ ['Now is', 'time for all good men to come to', 'aid of', 'country']

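The above pattern matches zero or more whitespace characters '\s*', followed by zero or more word characters '\w*', followed by the letter h, followed by zero or more word characters, followed by zero or more whitespace characters. Each word containing an h is therefore removed together with the whitespace around it.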
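The same pattern can be tried outside Analytica. The following sketch uses Python's `re.split`, assuming that Python's regular expression dialect treats `\s` and `\w` the same way for this input:

```python
import re

sentence = "Now is the time for all good men to come to the aid of their country"

# Each match is a whole word containing 'h' plus any surrounding whitespace.
parts = re.split(r"\s*\w*h\w*\s*", sentence)

print(parts)
# ['Now is', 'time for all good men to come to', 'aid of', 'country']
```

The words "the", "the" and "their" are the only ones containing an h, so the sentence is cut at exactly three places.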

Using a subpattern, you can require the «separator» pattern to occur within a larger context, without including the larger context within the split point. For example, the following splits at decimal points, but only when they have a numeric digit on each side:

SplitText('17.5 19 1. .5 0.1111', '\d(\.)\d', re: 1, subpattern: 1) → ['17', '5 19 1. .5 0', '1111']
SplitText('17.5 19 1. .5 0.1111', '(\d(?<a>\.)\d)|(?<a>\s+)', re: 1, subpattern: 'a') → ['17', '5', '19', '1.', '.5', '0', '1111']
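One way to picture the subpattern behavior is the Python sketch below: the whole pattern must match, but only the span of the chosen group is cut out. `split_on_group` is a hypothetical helper, not an Analytica function. Only the numbered-subpattern example is reproduced, because Python's `re` module does not accept the same group name twice in one pattern.

```python
import re


def split_on_group(text, pattern, group=1):
    """Split text at the span of one capture group within each full match."""
    parts = []
    start = 0
    for match in re.finditer(pattern, text):
        group_start, group_end = match.span(group)
        # Keep everything up to the group; the rest of the match stays in the output.
        parts.append(text[start:group_start])
        start = group_end
    parts.append(text[start:])
    return parts


print(split_on_group("17.5 19 1. .5 0.1111", r"\d(\.)\d"))
# ['17', '5 19 1. .5 0', '1111']
```

The digits on either side of each decimal point are part of the match but not part of the group, so they remain in the resulting items.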

Splitting Arrays using «resultIndex»

If «text» or «separator» is an array, you need to specify «resultIndex». Otherwise, SplitText gives an error that you are trying to combine two (or more) arrays with implicit indexes.

If the result index is longer than the number of items in the result, the remaining entries along the result index are set to Null. If the result index is shorter than the number of split items, the last item along the result index contains the remainder of the unsplit «text».

Index I := 1..3
Index J := 1..7

SplitText('One two three four five', ' ', resultIndex: I) →

I ▶   1       2       3
      'One'   'two'   'three four five'

SplitText('One two three four five', ' ', resultIndex: J) →

J ▶   1       2       3         4        5        6        7
      'One'   'two'   'three'   'four'   'five'   «null»   «null»
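The padding and remainder rules shown in these two tables can be sketched in Python as follows. This is an illustration for a single text value only: `split_to_index` is a hypothetical name, `index_length` stands in for the size of the result index and is assumed to be at least 1, and `None` stands in for Null.

```python
def split_to_index(text, separator, index_length):
    """Hypothetical analogue of SplitText(..., resultIndex: I) for one text value."""
    # Split at most index_length - 1 times, so the last slot keeps the remainder.
    parts = text.split(separator, index_length - 1)
    # Pad with None when the index is longer than the number of split items.
    return parts + [None] * (index_length - len(parts))


print(split_to_index("One two three four five", " ", 3))
# ['One', 'two', 'three four five']
print(split_to_index("One two three four five", " ", 7))
# ['One', 'two', 'three', 'four', 'five', None, None]
```

Because every result has exactly `index_length` entries, applying the function to each element of a list of texts yields a rectangular table, which is the reason a result index is needed when «text» is an array.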

In Analytica 4.1 or earlier, the optional «resultIndex» is not available. If you know that all text values in textArray will split into the same number of elements, and index I has this same number of elements, you could use e.g.:

Var s[] := textArray do Array(I, SplitText(s, ' '))

When I is not guaranteed to have exactly the same number of items as the split result, the Array function will issue a warning, which can be suppressed with IgnoreWarnings:

Var s[] := textArray do IgnoreWarnings(Array(I, SplitText(s, ' ')))

When I has more elements than the result of SplitText, the final elements in the result are padded with Null. When I has fewer elements, then only the first Size(I) split items are retained, e.g.:

Index I := 1..3
Array(I, SplitText('One two three four five',' ')) →

I ▶   1       2       3
      'One'   'two'   'three'
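For comparison with the «resultIndex» behavior, the truncating behavior of the Array workaround can be sketched in Python. Again this is only an analogue for a single text value: `split_truncated` is a hypothetical name and `None` stands in for Null.

```python
def split_truncated(text, separator, index_length):
    """Hypothetical analogue of Array(I, SplitText(text, separator))."""
    parts = text.split(separator)
    # Pad with None when the index is longer than the split result...
    padded = parts + [None] * max(0, index_length - len(parts))
    # ...and drop any items beyond the length of the index.
    return padded[:index_length]


print(split_truncated("One two three four five", " ", 3))
# ['One', 'two', 'three']
print(split_truncated("One two", " ", 3))
# ['One', 'two', None]
```

Unlike the «resultIndex» form, the surplus items ('four' and 'five') are discarded rather than kept in the last entry.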

See Also

* JoinText
* SelectText
* Regular Expressions
* FindInText
* Text functions
* Implicit index
