Text functions

Revision as of 23:52, 29 December 2015 by Bbecane (talk | contribs)

These functions work with text values (sometimes known as strings), available in the built-in Text library.

Asc(t)

Returns the ASCII code (a number between 0 and 255) of the first character in text value «t». This is occasionally useful, for example to understand the alphabetic ordering of text values.

See also Asc(t).

Chr(n)

Returns the character corresponding to the numeric ASCII code «n» (a number between 0 and 255). Chr() and Asc() are inverses of each other, for example:

Chr(65) → 'A', Asc(Chr(65)) → 65
Asc('A') → 65, Chr(Asc('A')) → 'A'

Chr() is useful for creating characters that cannot easily be typed, such as Tab, which is Chr(9) and carriage return (CR), which is Chr(13). For example, if you read in a text file, x, you can use SplitText(x, Chr(13)) to generate an array of lines from a multiline text file.

TextLength(t)

Returns the number of characters in text «t».

See also TextLength().

:TextLength('supercalifragilisticexpialidocious') → 34

SelectText(t, m, n)

Returns text containing the «m»'th through the «n»'th character of text «t» (where the first character is «m»=1). If «n» is omitted it returns characters from the «m»'th through the end of «t».

See also SelectText().

SelectText('One or two', 1, 3) → 'One'
SelectText('One or two', 8) → 'two'

FindInText(substr, text, start, case Insensitive, re, return, subpattern, repeat, repeatSubpattern, repeatIndex)

Returns the position of the first occurrence of the text «substr» within «text», as the number of characters to the first character of text. If «substr» does not occur in text, it returns 0. FindInText() is case-sensitive unless the optional parameter «caseInsensitive» is true. For example:

Variable People := 'Amy, Betty, Carla'
FindInText('Amy', People) → 1
FindInText('amy', People) → 0
FindInText('amy', People, caseInsensitive: true) → 1
FindInText('Betty', People) → 6
FindInText('Fred', People) → 0

The optional third parameter, «start», specifies the position to start searching at, for example, if you want to find a second occurrence of «substr» after you have found the first one.

FindInText('i','Supercalifragilisticexpialidocious') → 9
FindInText('i','Supercalifragilisticexpialidocious', 10) → 14

Setting the optional parameter, «re», to True causes «substr» to be interpreted as a Perl-compatible regular expression. The optional «return» parameter alters what is returned by FindInText(), according to the possible values:

  • ‘P’ (or ‘Position’): The position of the matching text or subpattern (default)
  • ‘L’ (or ‘Length’): The length of the matching text or subpattern.
  • S’ (or ‘Subpattern’): The text matched the regular expression or subpattern.
  • ‘#’ (or ‘#Subpatterns’): The number of subpatterns in the regular expression.

Parentheses within a regular expression denote subpatterns, numbered in a depth first fashion. Subpatterns can also be named using the regular expression format “(?<name>...)”. You can the match information for any subpattern by specifying the subpattern number or subpattern name in the optional «subpattern» parameter.

FindInText('d.*T', 'FindInText', re:1) → 4
FindInText('d.*T', 'FindInText', re: 1, return:['L', 'S']) → [4, 'dInT']
FindInText('(\d\d\d)-(\d\d\d\d)', '650-212-1212', re:1, return:’S’, subpattern:[0, 1, 2]) → ['212-1212', ’212’, ’1212’]
FindInText('a*(?<bcd>b+(c+)(d*))','zyabdaabbcccfd', re:1, subpattern:['bcd', 0, 1, 2, 3]) → [8, 6, 8, 10, 13]

Normally FindInText() returns information on the first match, but by using any of the three optional repeat parameters can be used to find all matches in text. When multiple matches are returned, the result will be an array and an index is required to index the matches found. When repeat: true is specified, the function creates a local index named .Repeat with elements 1..n to index the result for the n matches found. Alternatively, you can specify your own preexisting index in the «repeatIndex» parameter. If your index contains n elements, then only the first n matches are returned. Finally, you can specify a regular expression subpattern, either numbered or named, in the «repeatSubpattern» parameter. In that case, a local index is created using the matching text for that subpattern as the index labels. When a named (textual) subpattern is specified, the subpattern name is used as a local index name.

The following examples parses XML text, returning an array of ages with a local index named «.Name», where the labels of the local index are the names of each person:

FindInText('<person.*?name = "(?<name>(.*?))".*?>.*?' & '<age>(?<age>.*?)</age>.*?' & '</person>', xmlText, re: true, return: 'S', repeatSubpattern: 'name', subpattern: 'age')
Tip
Consult the Analytica Wiki for more detailed information on using regular expressions. The Wiki contains additional information on regular expression syntax, and far more detail on the more advanced regular expression matching options. In the Analytica Wiki, see pages on Regular Expressions and FindInText.

TextTrim(t, leftOnly, rightOnly, trimChars)

Removes leading and trailing spaces from the text. To remove characters other than spaces, specify the characters to remove in the optional «trimChars» parameter.

See also TextTrim().

TextTrim(' Hello World ') → 'Hello World'
TextTrim(' Hello World ', leftOnly: True) → 'Hello World'
TextTrim(' Hello World ', rightOnly: True) → ' Hello World'
TextTrim(' [One, Two, Three] ', trimChars: ' []') → 'One, Two, Three'

TextReplace(text, pattern, subst, all, caseInsensitive, re)

If all is omitted or False, it returns text with the first occurrence of «pattern» replaced by «subst». If «all» is True, it returns «text» with all occurrences of text «pattern» replaced by «subst». «Pattern» is matched in a case-sensitive fashion unless «caseInsensitive» is True.

See also TextReplace().

TextReplace('StringReplace, StringLength', 'String', 'Text') → 'TextReplace, StringLength'
TextReplace('StringReplace, StringLength', 'String', 'Text', True) → 'TextReplace, TextLength'

When the optional «re» parameter is True, «pattern» is treated as a Perl-compatible regular expression. In this mode, the character sequence \0 in «subst» is replaced by the matching text, and \1, \2, ..., \9 are replaced by the subtext matched by the corresponding numbered subpattern in the regular expression. The character sequence <name> in «subst» is replaced by the subtext matched to the indicated named subpattern.

TextReplace('Hello world', '\w+', '«\0»', all:True, re: True) → '«Hello» «world»'
TextReplace('Hello world', '(.{1, 7}).*', '\1…', re: True) → 'Hello w…'
TextReplace(text: 'swap first and last', pattern: '(?<first>\w+)(?<mid>.*)(?<last>\b\w+)', subst:'<last><mid><first>', re: True) → 'last first and swap’
TextReplace('swap first and last', '(\w)(\w*)(\w)', '\3\2\1', re: 1, all: 1 ) → 'pwas tirsf dna tasl'

Joining Text: a & b

The & operator joins (concatenates) two text values to form a single text value, for example:

'What is the' & ' number' & '?' → 'What is the number?'

If one or both operands are numbers, it converts them to text using the number format of the variable whose definition contains this function call (or the default suffix format if none is set), for example:

'The number is ' & 10^8 → 'The number is 100M'

This is also useful for converting (or “coercing”) numbers to text.

JoinText(a, i, separator, finalSeparator, default, textForNull)

Returns the elements of array «a» joined together into a single text value over index «i». If elements of «a» are numeric, JoinText() first converts them to text using the number format settings for the variable whose definition contains this function call. For example:

I := ['A', 'B', 'C']
JoinText(I, I) → 'ABC'
A := Array(I, ['VW', 'Honda', 'BMW'])
B := Array(I, ['VW', Null, 'BMW'])
JoinText(A, I) → 'VWHondaBMW'

If the optional parameter «separator» is specified, it is inserted as a separator between successive elements, for example:

JoinText(A, I, ', ') → 'VW, Honda, BMW'

The optional parameter «finalSeparator», if present, specifies a different separator between the second-to-last and last elements of «a».

JoinText(A, I, '; ', '; and') → 'VW; Honda; and BMW'

Null values in «a» are ignored unless the optional parameter «textForNull» is specified.

JoinText(B, I, ', ') → 'one, two'
JoinText(B, I, ', ', textForNull: ) → 'one, , two'
JoinText(B, I, ', ' , textForNull: 'NULL') → 'one, NULL, two'

The optional «default» parameter is returned when all values are ignored, or «a» has a zero length.

JoinText([Null, Null, Null], default: Null) → «null»

SplitText(text, separator, caseInsensitive, re)

Returns a list of text values formed by splitting the elements of text value text at each occurrence of separator «separator». For example:

SplitText('VW, Honda, BMW', ', ') → ['VW', 'Honda', 'BMW']

SplitText() is the inverse of JoinText(), if you use the same separators. For example:

Var x := SplitText('Humpty Dumpty sat on a wall.', ' ') → ['Humpty', 'Dumpty', 'sat', 'on', 'a', 'wall.']
JoinText(x, , ' ') → 'Humpty Dumpty sat on a wall.'

When «separator» contains letters, setting «caseInsensitive» to True matches in a lower/uppercase-insensitive manner. When the «re» parameter is True, separator is interpreted as a Perl-compatible regular expression.

Variable s := 'Yes, Virginia. There is a Santa Claus!'
SplitText(s, '[\s, \.!]+', re: 1) → ['Yes', 'Virginia', 'There', 'is', 'a', 'Santa', 'Claus', ]
SplitText(TextTrim(s, trimChars: ' , .!'), '[\s, \.!]+', re:1) → ['Yes', 'Virginia', 'There', 'is', 'a', 'Santa', 'Claus']
Tip
With SplitText(), «text» must be a single text value, not an array. Otherwise, it might generate an array of arrays of different length. See Functions expecting atomic parameters on what to do if you want apply it to an array.

TextLowerCase(t)

TextLowerCase() returns the text «t» with all letters as lowercase. For example:

TextLowerCase('What does XML mean?') → 'what does xml mean?'

TextUpperCase(t)

TextUpperCase() returns the text «t» with all letters as uppercase. For example:

TextUpperCase('What does XML mean?') → 'WHAT DOES XML MEAN?'

TextSentenceCase(Text, preserveUC)

TextSentenceCase() returns the text «t» with the first character (if a letter) as uppercase, and any other letters as lowercase. For example:

TextSentenceCase('mary ann FRED Maylene') → 'Mary ann fred maylene'
TextSentenceCase(SplitText('mary ann FRED Maylene', ' ')) → ['Mary', 'Ann', 'Fred', 'Maylene']
TextSentenceCase('they are Fred and Maylene', true) → 'They are Fred and Maylene'

NumberToText(x, format)

NumberToText() converts a number to text using the format specified in the second parameter. The possible formats are: 'Suffix', 'Exponential', 'Fixed Point', 'Integer', 'Percent', 'Date', 'Boolean', or 'Hexadecimal'. For conciseness, the first letter of each may be used:

NumberToText(3.45M, ['S', 'E', 'F', 'I', 'H']) → ['3.45M', '3.45.e+006', '3450000', '3450000', '0x34a490']
NumberToText(0.0012, ['S', 'E', 'F', 'P']) → ['1.2m', '1.2e-003', '0', 0.12%']

Several optional parameters control aspects of each format. The parameter digits specifies the precision for Suffix and Exponential formats, and the digits to the right of the decimal for Fixed Point and Percent formats.

NumberToText(Pi, 'Suffix', digits: 5) → '3.1416'
NumberToText(Pi, 'Fixed Point', digits: 5) → '3.14159'
NumberToText(Pi, 'Percent', digits: 5) → '314.15927%'

You can specify «showZeros» as true to force the inclusion of trailing zeros.

NumberToText(1/4, 'Percent', digits: 2, showZeros: True) → '25.00%'

To include thousands separators, pass True to the «thousandsSeparators» parameter.

NumberToText(7^12,'I', thousandsSeparators: True) → '13,841, 287, 201'

The «currency» parameter specifies a template that includes the currency symbol and its placement relative to the number and the minus sign.

NumberToText(-372, 'F', currency: ['-£#', 'US$-#', '#£-', '($#)']) → ['-£372', 'US$-372', '372£-', '($372)']

When converting date-time numbers, the «dateFormat» parameter provides the date format template.

NumberToText(Today(), dateFormat: 'yyyy MMMM dd (wwww)') → '2013 July 01 (Monday)'

There are places, such as when exporting data, where you want to make sure that the numbers are printed at full precision. When the «fullPrecision» parameter is true, extra digits are included when necessary to ensure that the full precision of the number is output.

NumberToText(Pi, fullPrecision: True) → '3.141592653589793'

ParseNumber(text, badVal)

Parses a text value into a number. Dates are not parsed by ParseNumber() (use ParseDate() for dates). The result is independent of the number format setting. Values that are already numeric are returned. The optional «badVal» parameter specifies the value returned when text is unparseable, which defaults to Null. The usage ParseNumber(x, x) can be used when «x» is an array and you want to pass unparseable entries through.

ParseNumber('12.43K') → 12.43K
ParseNumber('hello') → «null»
ParseNumber(14.3) → 14.3
Var x := ['3, 214', 14, 'foo'] Do ParseNumber(x, x) → [ 3214, 14, 'foo']

See Also

Comments


You are not allowed to post comments.