SplitText

Revision as of 20:04, 3 March 2010 by Max


SplitText(text,separator,resultIndex)

Splits «text» into a list of substrings at each occurrence of «separator».

SplitText('Bob,Mary,Alice', ',') -> ['Bob', 'Mary', 'Alice']

If «separator» is the empty text, "", it splits «text» into an array of individual characters.

SplitText('AbcdE', '') -> ['A','b','c','d','E']

If «resultIndex» is omitted, the result has an implicit (Null) index. If specified, the result is indexed by «resultIndex».

Index K := [1,2,3]
SplitText('Bob,Mary,Alice', ',', K) -> Array(K, ['Bob', 'Mary', 'Alice'])

If «resultIndex» has more elements than the split text, the extra elements of the result are padded out with Null.

Index L := [1,2,3,4]
SplitText('Bob,Mary,Alice', ',', L) -> Array(L, ['Bob', 'Mary', 'Alice', NULL])

If «resultIndex» has fewer elements than the split text, the remaining text is placed unsplit in the last element.

Index M := [1,2]
SplitText('Bob,Mary,Alice', ',', M) -> Array(M, ['Bob', 'Mary,Alice'])
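
For readers who want to check the padding and merging rules outside Analytica, the following is a minimal Python sketch of the behavior described above. The helper name `split_text` and its `result_len` parameter (standing in for the length of «resultIndex») are hypothetical. The sketch models only what this page states and is not Analytica's implementation.

```python
def split_text(text, separator, result_len=None):
    """Model of the SplitText rules described above (hypothetical helper).

    result_len stands in for the number of elements in «resultIndex»
    and is assumed to be at least 1 when given.
    """
    if separator == "":
        # Empty separator: one item per character.
        parts = list(text)
    else:
        parts = text.split(separator)
    if result_len is None:
        return parts
    if len(parts) > result_len:
        # Fewer slots than items: keep the remainder unsplit in the last slot.
        head = parts[:result_len - 1]
        tail = separator.join(parts[result_len - 1:])
        parts = head + [tail]
    # More slots than items: pad with None, standing in for Null.
    return parts + [None] * (result_len - len(parts))


print(split_text('Bob,Mary,Alice', ',', 4))  # ['Bob', 'Mary', 'Alice', None]
print(split_text('Bob,Mary,Alice', ',', 2))  # ['Bob', 'Mary,Alice']
```
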

Declaration

SplitText(text,separator:text atom)

Library

Text functions

Examples

SplitText('Al#Bob#Carl','#') → ['Al','Bob','Carl']
SplitText('Al,Bob,Carl',',') → ['Al','Bob','Carl']

Case Sensitivity

When «separator» contains letters, the comparison is done in a case-sensitive fashion.

In Analytica 4.2 or later, you can specify the optional parameter caseInsensitive:true to match in a case-insensitive fashion:

SplitText('abAcadAe','a') → ['','bAc','dAe']
SplitText('abAcadAe','a',caseInsensitive:true) → ['','b','c','d','e']
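
The difference between the two modes can be reproduced with a short Python sketch. The function name `split_text_ci` is hypothetical, and the sketch assumes a non-empty separator. It illustrates the matching rule only and says nothing about how Analytica implements it.

```python
import re


def split_text_ci(text, separator, case_insensitive=False):
    """Split on a literal separator, optionally ignoring letter case."""
    flags = re.IGNORECASE if case_insensitive else 0
    # re.escape keeps the separator literal, so it is not read as a regex.
    return re.split(re.escape(separator), text, flags=flags)


print(split_text_ci('abAcadAe', 'a'))
# ['', 'bAc', 'dAe']
print(split_text_ci('abAcadAe', 'a', case_insensitive=True))
# ['', 'b', 'c', 'd', 'e']
```

Because the text starts with the separator, the first item in both results is the empty text.
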

Regular Expressions

New to Analytica 4.2

The «separator» parameter is interpreted as a regular expression when the optional re:true parameter is specified.

The following example splits on any word containing the letter h:

SplitText('Now is the time for all good men to come to the aid of their country', '\s*\w*h\w*\s*', re:1)
  → ['Now is', 'time for all good men to come to', 'aid of', 'country']

The pattern above matches zero or more whitespace characters '\s*', followed by zero or more word characters '\w*', followed by the letter h, followed by zero or more word characters, followed by zero or more whitespace characters.
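
The same pattern can be tried in Python's `re` module, which gives the same pieces for this input. This is only a cross-check of the pattern under Python's regex dialect. Analytica's regular expression engine may differ in other details.

```python
import re

sentence = 'Now is the time for all good men to come to the aid of their country'

# A whole word containing "h", together with any whitespace around it,
# acts as the separator.
pieces = re.split(r'\s*\w*h\w*\s*', sentence)

print(pieces)
# ['Now is', 'time for all good men to come to', 'aid of', 'country']
```
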

Using a subpattern, you can require the «separator» pattern to occur within a larger context, without including the larger context within the split point. For example, the following splits at decimal points, but only when they have a numeric digit on each side:

SplitText('17.5 19 1. .5 0.1111', '\d(\.)\d', re:1, subpattern:1 ) → ['17','5 19 1. .5 0', '1111']
SplitText('17.5 19 1. .5 0.1111', '(\d(?<a>\.)\d)|(?<a>\s+)', re:1, subpattern:'a' ) → ['17','5','19','1.','.5','0','1111']
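
The idea of splitting only at a subpattern can be sketched in Python by cutting at the span of one capture group rather than at the whole match. The helper `split_at_group` is hypothetical. It reproduces the first example above. The second example is not modelled because Python's `re` module rejects a pattern that defines the same group name twice.

```python
import re


def split_at_group(text, pattern, group=1):
    """Split text at the span of one capture group of each match."""
    pieces = []
    start = 0
    for match in re.finditer(pattern, text):
        g_start, g_end = match.span(group)
        if g_start == -1:
            # The group took no part in this match, so there is no cut.
            continue
        pieces.append(text[start:g_start])
        start = g_end
    pieces.append(text[start:])
    return pieces


print(split_at_group('17.5 19 1. .5 0.1111', r'\d(\.)\d'))
# ['17', '5 19 1. .5 0', '1111']
```

The digits on either side of the decimal point are part of the match but stay in the output, because only the group's span is removed.
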

Splitting Arrays

In general, «text» and «separator» cannot be arrays: the result of each split is an unindexed list, so splitting an array would produce more than one unindexed list in a single result. Therefore, if «text» or «separator» might be array-valued, you will need to index the result using another index.

In Analytica 4.2 or later, you can accomplish this by specifying the result index using the optional parameter «resultIndex». If the result index is longer than the number of items in the result, the remaining entries along the result index are set to «null». If the result index is shorter than the number of split items, then the last item along result index will contain the remainder of the unsplit «text».

Index I := 1..3
Index J := 1..7
SplitText('One two three four five',' ',resultIndex:I) →
  I →  1      2      3
       'One'  'two'  'three four five'
SplitText('One two three four five',' ',resultIndex:J) →
  J →  1      2      3        4       5       6       7
       'One'  'two'  'three'  'four'  'five'  «null»  «null»
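
To see why a fixed result index makes array-valued «text» workable, consider this Python sketch, in which every row is forced to the same length. The names `split_to_index`, `index_i` and `texts` are hypothetical, and the sample values are made up for illustration. The index is assumed to have at least one element.

```python
def split_to_index(text, separator, index):
    """Split text into exactly len(index) items (hypothetical helper)."""
    n = len(index)
    # maxsplit = n - 1 leaves any remainder unsplit in the last item.
    parts = text.split(separator, n - 1)
    # Pad short results with None, standing in for «null».
    return parts + [None] * (n - len(parts))


index_i = [1, 2, 3]
texts = ['One two three four five', 'a b', 'x y z']

# Every row has len(index_i) entries, so the rows line up as a table.
table = [split_to_index(t, ' ', index_i) for t in texts]
for row in table:
    print(row)
```
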

In Analytica 4.1 or earlier, the optional «resultIndex» is not available. If you know that every text value in textArray splits into the same number of elements, and index I has that same number of elements, you could use, for example:

Var s[] := textArray do Array(I,SplitText(s,' '))

When I is not guaranteed to have exactly the same number of items, the Array function will issue a warning, which can be ignored:

Var s[] := textArray do IgnoreWarnings(Array(I,SplitText(s,' ')))

When I has more elements than the result of SplitText, the final elements in the result are padded with «null». When I has fewer elements, then only the first Size(I) split items are retained, e.g.:

Index I := 1..3
Array(I,SplitText('One two three four five',' ')) →
  I →  1      2      3
       'One'  'two'  'three'
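
Note the contrast with «resultIndex»: here the surplus items are dropped rather than kept unsplit in the last element. A minimal Python sketch of this truncate-or-pad rule follows. The helper name `fit_to_index` is hypothetical, and `None` stands in for «null».

```python
def fit_to_index(items, n):
    """Keep the first n items, padding with None when there are fewer."""
    kept = items[:n]
    return kept + [None] * (n - len(kept))


words = 'One two three four five'.split(' ')

print(fit_to_index(words, 3))
# ['One', 'two', 'three']
print(fit_to_index(words, 7))
# ['One', 'two', 'three', 'four', 'five', None, None]
```
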
