Regular Expressions

Regular expressions are a concise and powerful, but cryptic, formalism for identifying patterns of text to match. They can be quite useful for parsing text files that have minor variability in their formats. They play a prominent role in several programming languages, most notably Perl and Python.

Starting with release 4.2, Analytica provides Perl-compatible regular expression processing within several of its built-in text functions, notably FindInText, SplitText, and TextReplace. Each of these functions takes a pattern, which is interpreted as a regular expression when you also specify the optional parameter re:True (written re:1 in the examples here). For example:

{To find the position of a seven-letter word:}

FindInText("\b\w{7}\b","Now is the time for all good men to come to the aid of their country",re:1) → 62

{To split on any word in which some letter occurs more than once:}

SplitText("When in the course of human events, it becomes necessary for ...","[^\w]*\b\w*(\w)\w*\1\w*\b[^\w]*",re:1) →

["When in the course of human", "it", "", "for ..."]
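
Because the patterns are Perl-compatible, the two examples above can be checked outside Analytica. The following is a minimal sketch in Python, using only its standard re module, and it is not Analytica code. The helper split_on_matches is a hypothetical name introduced here for illustration. It assumes that splitting discards the matched text entirely, as the result above shows. Python's own re.split would also return the captured letter, so it is not used.

```python
import re

# Example 1: position of the first seven-letter word.
text = "Now is the time for all good men to come to the aid of their country"
match = re.search(r"\b\w{7}\b", text)
# Python offsets are 0-based; add 1 to get the 1-based position shown above.
position = match.start() + 1

# Example 2: split on any word in which some letter occurs more than once.
# (\w) captures one letter and \1 requires that same letter again later in
# the word; the [^\w]* at each end absorbs adjacent spaces and punctuation.
pattern = r"[^\w]*\b\w*(\w)\w*\1\w*\b[^\w]*"
sentence = "When in the course of human events, it becomes necessary for ..."


def split_on_matches(pattern, s):
    """Hypothetical helper: return the pieces of s between pattern matches."""
    pieces = []
    last = 0
    for m in re.finditer(pattern, s):
        pieces.append(s[last:m.start()])
        last = m.end()
    pieces.append(s[last:])
    return pieces


pieces = split_on_matches(pattern, sentence)

print(position)
print(pieces)
```

The empty string in the split result arises because "becomes" and "necessary" are adjacent: the separator that matches "becomes" also absorbs the following space, so the next separator begins immediately and nothing lies between the two.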