FindInText


FindInText(substr, text, start, caseInsensitive)

Returns the position of the first occurrence of «substr» in «text», counting from 1 for the first character. If «substr» does not occur in «text», it returns 0. The optional third parameter, «start», specifies the position at which to begin searching. Set the optional fourth parameter, «caseInsensitive», to True to treat uppercase and lowercase variants of the same character as a match.

Library

Text Functions

Examples

FindInText("is", "Now is not the time") → 5
FindInText("i", "Now is not the time") → 5
FindInText("i", "Now is not the time", 6) → 17
FindInText("now", "Now is not the time") → 0
FindInText("no", "Now is not the time") → 8
FindInText("no", "Now is not the time", caseInsensitive: True) → 1
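
For readers who want to check the position arithmetic outside Analytica, the basic behavior above can be approximated in Python. This is only a sketch of the documented semantics, not Analytica code: the function name find_in_text and its keyword names are hypothetical, and lowercasing both strings is an assumed stand-in for «caseInsensitive».

```python
def find_in_text(substr, text, start=1, case_insensitive=False):
    """Return the 1-based position of the first match at or after start, or 0."""
    if case_insensitive:
        # Assumption: plain lowercasing is a good enough model of case folding.
        substr, text = substr.lower(), text.lower()
    # str.find is 0-based and returns -1 when nothing is found,
    # so adding 1 yields a 1-based position, or 0 for "not found".
    return text.find(substr, max(start - 1, 0)) + 1


sentence = "Now is not the time"
print(find_in_text("is", sentence))                         # 5
print(find_in_text("i", sentence, start=6))                 # 17
print(find_in_text("now", sentence))                        # 0
print(find_in_text("no", sentence, case_insensitive=True))  # 1
```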

Optional parameters

Re

When the optional parameter «re» is set to True, «substr» is interpreted as a regular expression. For example, the following finds the position of the first vowel:

FindInText("[aeiou]", "Now is not the time", re: True, caseInsensitive: True) → 2

Return

When matching a pattern, you may want information other than just the starting position. The optional parameter «return» specifies what information about the match is desired. «Return» can be specified as any of the following (or an array of any of these):

  • 'P' (or 'Position'): The position in the subject «text» where the matched pattern was found, or zero if not found.
  • 'L' (or 'Length'): The length of the match in the subject «text».
  • 'S' (or 'SubPattern'): The subtext matched by the pattern.
  • '#' (or '#SubPatterns'): The number of subpatterns in the regular expression.

Examples:

FindInText("[aeiou]", "Now is the time for", re: True, return: 'S') → 'o'
FindInText("\w{4}", "Now is the time for", re: True, return: 'S') → 'time'
FindInText("\w{4}\w*", "We the people, in order to...", re: True, return: ['S', 'L', 'P']) → ['people', 6, 8]
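
The position, length, and matched-text items described above correspond closely to what a regular expression match object exposes in other languages. The following Python sketch is an analog, not Analytica code: find_pattern is a hypothetical helper, and the values returned when nothing matches (0, 0, None) are an assumption made for illustration.

```python
import re


def find_pattern(pattern, text, case_insensitive=False):
    """Return (position, length, matched_text) for the first regex match.

    The position is 1-based. (0, 0, None) is returned when there is no match;
    that convention is an assumption of this sketch.
    """
    flags = re.IGNORECASE if case_insensitive else 0
    m = re.search(pattern, text, flags)
    if m is None:
        return (0, 0, None)
    return (m.start() + 1, m.end() - m.start(), m.group(0))


print(find_pattern(r"\w{4}\w*", "We the people, in order to..."))
# (8, 6, 'people')
print(find_pattern("[aeiou]", "Now is not the time", case_insensitive=True))
# (2, 1, 'o')
```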

Subpattern

A regular expression may contain subpatterns. These are delimited within the regular expression by parentheses and are numbered in a depth-first fashion. They can also be named using the syntax (?<name>...). You can return information on a specific subpattern (or an array of subpatterns) by specifying it in the optional «subpattern» parameter:

FindInText("to (\w+)","We the people, in order to form a more...", re: True, subpattern: 1) → 28
FindInText("to (\w+)","We the people, in order to form a more...", re: True, subpattern: 1, return: 'S') → 'form'
FindInText("to (?<verb>\w+)","We the people, in order to form a more...", re: True, subpattern: 'verb') → 28

For more details, see Regular Expressions.
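
As a cross-check on the subpattern examples, the same lookups can be sketched with Python's re module. This is an analog rather than Analytica code; note that Python spells named groups (?P<name>...) instead of (?<name>...), and that its positions are 0-based, so 1 is added to match the numbering used on this page.

```python
import re

text = "We the people, in order to form a more..."

# Numbered subpattern: group 1 is the word following "to ".
m = re.search(r"to (\w+)", text)
print(m.start(1) + 1)  # 28
print(m.group(1))      # form

# Named subpattern, using Python's (?P<name>...) spelling.
named = re.search(r"to (?P<verb>\w+)", text)
print(named.start("verb") + 1)  # 28
```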

RepeatIndex

Suppose you have a data file and have created a regular expression that matches any single "record" in it. Using some optional parameters, you can instruct FindInText to retrieve all matches (and thus all records) within the text. The result is an array with an index corresponding to the repeated match: the first element of that index corresponds to the first match in the text, the second element to the second match, and so on. Any of the other FindInText options can be combined with this.

You can create the index prior to calling FindInText, in which case the length of your index determines the maximum number of records that will be matched. The index is provided using the optional «repeatIndex» parameter as in this example:

Index SpacePos := 1..5;
FindInText(' ', 'one two three a b c d e f', repeatIndex: SpacePos) →
[Image: Findintext result3.jpg]

If the number of matches is fewer than the number of elements in the index, the remaining cells are padded with «null» values. Notice that this works with plain text or regular expression patterns.

Repeat

If you can't identify an upper limit on the number of matches in advance, then FindInText can determine the required length and create a local index for you. To indicate that you want all repeated matches (but don't have an index), specify the optional parameter «repeat» as true:

FindInText(' ', 'one two three a b c d e f', repeat: true) →
[Image: Findintext result2.jpg]

In this case, FindInText names the local index Repeat and sets its labels to 1..N, where N is the number of matches found.
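
The two repeat modes described above, a fixed-length index padded with null and an open-ended list of all matches, can be modeled with a short Python sketch. It is an analog, not Analytica code: find_all and max_matches are hypothetical names, None stands in for «null», and scanning for non-overlapping matches is an assumption.

```python
def find_all(substr, text, max_matches=None):
    """Return the 1-based positions of successive matches of substr in text.

    With max_matches set, the list has exactly that length and is padded with
    None, mimicking a pre-built index. Without it, every match is returned.
    """
    if not substr:
        raise ValueError("substr must be non-empty")
    positions = []
    pos = text.find(substr)
    while pos != -1 and (max_matches is None or len(positions) < max_matches):
        positions.append(pos + 1)
        # Assumption: resume after the end of the match (no overlaps).
        pos = text.find(substr, pos + len(substr))
    if max_matches is not None:
        positions += [None] * (max_matches - len(positions))
    return positions


words = "one two three a b c d e f"
print(find_all(" ", words, max_matches=5))  # [4, 8, 14, 16, 18]
print(find_all(" ", words))                 # [4, 8, 14, 16, 18, 20, 22, 24]
print(find_all(" ", "a b", max_matches=3))  # [2, None, None]
```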

RepeatSubpattern

FindInText can optionally take the labels of the repeat index from a subpattern within the regular expression. The optional parameter «repeatSubpattern» specifies which subpattern to use, identified either by its subpattern number or by its subpattern name.

Var txt := "Age: 34, Weight:153 lb, Height: 56in";
Var pat := "(\w+):\s*(\d+)\s*(\w+)?";
Index parts := ['amt', 'unit'];
FindInText(pat, txt, re: True, repeatSubpattern: 1, return: "S", subpattern: 1 + @parts) →
[Image: Findintext result1.jpg]

When «repeatSubpattern» is a named subpattern, the name of the subpattern is used as the local index name:

Var txt := "Age: 34, Weight:153 lb, Height: 56in";
Var pat := "(?<att>\w+):\s*(?<amt>\d+)\s*(?<unit>\w+)?";
Index parts := ['amt', 'unit'];
FindInText(pat, txt, re: True, repeatSubpattern: "att", return: "S", subpattern: parts) →
[Image: Findintext result4.jpg]
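
The idea of labeling each repeated match by the text of one subpattern resembles building a dictionary keyed by a capture group. The Python sketch below is an analog under stated assumptions, not Analytica code: the pattern uses Python's (?P<name>...) spelling and allows optional whitespace after the colon so that all three records in the sample text match, the nested dictionary stands in for a two-dimensional result, and None stands in for a missing unit.

```python
import re

txt = "Age: 34, Weight:153 lb, Height: 56in"
pat = r"(?P<att>\w+):\s*(?P<amt>\d+)\s*(?P<unit>\w+)?"

# Key each record by the text matched by the "att" subpattern.
records = {
    m.group("att"): {"amt": m.group("amt"), "unit": m.group("unit")}
    for m in re.finditer(pat, txt)
}

for att, fields in records.items():
    print(att, fields)
# Age {'amt': '34', 'unit': None}
# Weight {'amt': '153', 'unit': 'lb'}
# Height {'amt': '56', 'unit': 'in'}
```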
