Set Functions

Analytica represents a set as a reference to a list or 1-D array. It offers all the usual set operations, including SetUnion, SetIntersection, SetDifference, and SetContains.

Introducing sets

In mathematics, a set is an unordered collection of unique elements. Analytica represents a set as a reference to a list or 1-D array, for example:

Local A_list := ['a', 'b', 'c', 'd'];

Local A_set := \ A_list;

The backslash, \, in front of A_list returns a Reference to the list. All the Set Functions described below work on Sets represented like this. Using a Reference to a list allows Set functions to fully array abstract. If they operated directly on lists rather than references to lists, they wouldn't work properly when operating on multiple lists with different lengths or different indexes.

Literal Sets

To create a set from a literal list, precede the list with a backslash, \, to create a reference, and enclose the list in parentheses:

\([1, 2, 3])

If you omit the parentheses, it gives an error due to a syntactic ambiguity:

\[1, 2, 3] { **** Does not work **** }

Although a list is ordered and may contain duplicates, the set operations ignore both the order and any duplicates, so any Set result contains only unique elements. The set operations also ignore Null elements, unless you specify the optional parameter keepNulls: True.
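The treatment of duplicates and Null elements can be illustrated outside Analytica. The following is a minimal Python sketch, not Analytica code: the helper name as_set_elements is hypothetical, None stands in for Null, and keeping elements in first-appearance order is an assumption chosen to match the examples on this page.

```python
def as_set_elements(values, keep_nulls=False):
    """Return the unique elements of `values` in first-appearance order.

    None models Analytica's Null and is skipped unless keep_nulls is True.
    """
    seen = []
    for v in values:
        if v is None and not keep_nulls:
            continue  # Null elements are ignored by default
        if v not in seen:
            seen.append(v)  # keep only the first occurrence
    return seen


print(as_set_elements([3, 1, 3, None, 2, 1]))                   # [3, 1, 2]
print(as_set_elements([3, 1, 3, None, 2, 1], keep_nulls=True))  # [3, 1, None, 2]
```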

The Empty Set

You can specify the Empty set (that contains nothing) simply as:

\([]) { The empty set }

The expression Null is not the same as the empty set. Set functions such as SetIntersection and SetUnion ignore Null elements, just as array functions such as Sum and Max ignore them. The intersection of a set S with Null is therefore the set S, not the empty set.
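The difference between Null and the empty set can be sketched in Python. This is a model of the behavior described above, not Analytica code: set_intersection is a hypothetical helper, None stands in for Null, and returning an empty list when no real sets remain is an arbitrary choice for the model.

```python
def set_intersection(sets):
    """Intersect a list of lists; a None entry models Null and is skipped."""
    real = [s for s in sets if s is not None]
    if not real:
        return []  # arbitrary choice for this model; not taken from the source
    result = []
    for v in real[0]:
        # keep non-Null values that appear in every remaining set, once each
        if v is not None and v not in result and all(v in s for s in real[1:]):
            result.append(v)
    return result


S = ['a', 'b', 'c']
print(set_intersection([S, None]))  # ['a', 'b', 'c']  (Null is ignored)
print(set_intersection([S, []]))    # []               (the empty set wins)
```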

Convert a Set to a List

You can obtain a list from a Set using the dereference operator, #.

# A_set → ['a', 'b', 'c', 'd']

This operation does not array-abstract: You can apply it to a single set, but not to an array of sets.

Summary of Set functions

Most of the set functions take a list or array of one or more sets as their primary parameter.

SetUnion(sets): Returns a set that is the union of the sets -- that is, all elements that occur in one or more of the sets.

SetIntersection(sets): Returns a set that is the intersection of the sets -- that is, all elements that occur in every one of the sets.

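SetDifference(originalSet, remove, ...): Returns a set of the elements of «originalSet» that do not appear in any of the «remove» sets.

SetContains(s, element): Returns true if «element» is contained in the set «s».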

Function SetIntersection

SetIntersection(sets, I, resultIndex, keepNulls)

Returns the set of elements in common to all the sets specified in the first parameter, «sets».

The first parameter is a list or array of sets. If this is or might be a multi-dimensional array, you should specify the optional second parameter, «I» as the index to operate over.

Consider the following array:

A :=

	J ▶
I ▼	3	6	9	2	5	8
	7	4	1	8	5	2
	4	8	2	6	0	3

To intersect the rows of A -- i.e. find the elements common to all rows -- treat each J-vector as a set (\[J]A) and operate over the I index:

#SetIntersection(\[J]A, I) → [2, 8]

Optionally, you can map the result onto a pre-existing index. When you provide a «resultIndex», it returns an array over that index containing the resulting elements. When «resultIndex» is not provided, the result is a set (a reference to a list). If «resultIndex» is too short to accommodate all elements in the result, it includes only the first Size(resultIndex) elements of the result. If «resultIndex» is longer than needed, it pads the final cells with Null.

SetIntersection(\[J]A, I, resultIndex: J) →

J →
2	8	«null»	«null»	«null»	«null»
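The row intersection and the truncate-or-pad rule for «resultIndex» can be modeled in a few lines of Python. This is a sketch under stated assumptions, not Analytica code: set_intersection and fit_to_index are hypothetical helpers, None stands in for Null, and the result is assumed to follow the element order of the first row.

```python
def set_intersection(sets):
    """Elements common to every list, in the order of the first list."""
    first, rest = sets[0], sets[1:]
    result = []
    for v in first:
        if v is not None and v not in result and all(v in s for s in rest):
            result.append(v)
    return result


def fit_to_index(elements, index_size):
    """Truncate to index_size, or pad the tail with None (modeling Null)."""
    kept = elements[:index_size]
    return kept + [None] * (index_size - len(kept))


# The three rows of the array A shown above
rows = [
    [3, 6, 9, 2, 5, 8],
    [7, 4, 1, 8, 5, 2],
    [4, 8, 2, 6, 0, 3],
]

common = set_intersection(rows)
print(common)                   # [2, 8]
print(fit_to_index(common, 6))  # [2, 8, None, None, None, None]
print(fit_to_index(common, 1))  # [2]
```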

To find the set of elements that two indexes have in common, use:

#SetIntersection([\I, \J])

Function SetUnion

SetUnion(sets, I, resultIndex, keepNulls)

Returns the set of all unique non-Null elements occurring in any of the sets passed in the first parameter, «sets».

The first parameter is an array of sets. To find the union of the elements in a set of lists or indexes, use:

#SetUnion([\L1, \L2, \L3, \L4])

Since the result is a set, i.e., a reference to a list, the dereference operator, #, is applied to the result. The dereferenced result can then be used to define a new index.

To find the union of all unique elements occurring along the rows of a 2-dimensional array indexed by I and J, use:

#SetUnion(\[J]A, I)

This turns each row (each row being a slice along I indexed by J) into a set, resulting in a 1-D array of sets indexed by I, and then applies the union operation along the I dimension.

To include Null values in the result, specify the optional parameter «keepNulls» as true.

Local L1 := [1, 3, Null, 5];

Local L2 := [1, 2, 4, 5];

SetUnion( [\L1, \L2] ) → \[1, 3, 5, 2, 4]

SetUnion( [\L1, \L2], keepNulls: true) → \[1, 3, Null, 5, 2, 4]
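The effect of «keepNulls» on the union can be reproduced with a short Python model. This is an illustrative sketch rather than Analytica code: set_union is a hypothetical helper, None stands in for Null, and first-appearance ordering is assumed because it matches the results shown above.

```python
def set_union(sets, keep_nulls=False):
    """Unique elements across all lists, in first-appearance order."""
    result = []
    for s in sets:
        for v in s:
            if v is None and not keep_nulls:
                continue  # drop Null unless asked to keep it
            if v not in result:
                result.append(v)
    return result


L1 = [1, 3, None, 5]
L2 = [1, 2, 4, 5]
print(set_union([L1, L2]))                   # [1, 3, 5, 2, 4]
print(set_union([L1, L2], keep_nulls=True))  # [1, 3, None, 5, 2, 4]
```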

If you specify the optional «resultIndex», the result is returned as an array (rather than as a reference) along the indicated index. The number of elements in the result is truncated to the index's length, or padded with Null values if the index is longer than necessary.

Function SetDifference

SetDifference(originalSet, remove, remove2, remove3, ..., resultIndex, keepNull)

Returns the set of unique non-Null elements in «originalSet» that do not appear in any of the other sets, «remove», «remove2», and so on. The result is a set (a reference to a list) unless «resultIndex» is specified. If «resultIndex» is specified, the result is an array indexed by «resultIndex», truncated to the length of «resultIndex» or padded with Null values.

SetDifference(\Sequence(1, 10), \Sequence(2, 10, 2), \Sequence(3, 10, 3)) → \[1, 5, 7]

You can also remove individual elements:

SetDifference(\Sequence(1, 4), 3) → \[1, 2, 4]

Null values are not included in the result unless «keepNull» is specified as true.
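Both SetDifference examples above can be checked with a small Python model. It is a sketch, not Analytica code: set_difference is a hypothetical helper, None stands in for Null, Python ranges stand in for Sequence (whose end point is inclusive, hence the upper bound of 11), and a non-list argument is treated as a single element to remove.

```python
def set_difference(original, *remove, keep_null=False):
    """Unique elements of `original` that appear in none of the `remove` arguments."""
    removed = []
    for r in remove:
        # each argument is either a collection of elements or one element
        removed.extend(r if isinstance(r, (list, range)) else [r])
    result = []
    for v in original:
        if v is None and not keep_null:
            continue
        if v not in removed and v not in result:
            result.append(v)
    return result


# Sequence(1, 10) minus the multiples of 2 and the multiples of 3
print(set_difference(range(1, 11), range(2, 11, 2), range(3, 11, 3)))  # [1, 5, 7]
# Removing a single element
print(set_difference(range(1, 5), 3))                                  # [1, 2, 4]
```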

Usage

When provided with only a single set, SetUnion(s), SetIntersection(s) and SetDifference(s) have the effect of removing duplicate (and Null) values. Hence, when A is an array without any Null values, these are all equivalent:

A[I = Unique(A, I)]

#SetUnion([\A])

#SetIntersection([\A])

#SetDifference(\A)

When I is an index, SetContains(\I, x) and @[I = x] > 0 are essentially equivalent. They are exactly equivalent when IndexValue(I) does not contain any handles, or when I is a MetaOnly index (or MetaIndex).

Function SetContains

SetContains(s, element)

Returns true if «element» is contained in the set «s». Unlike the other Set functions, the result is not a Set. It is a simple truth value, or an array of truth values if the second parameter, «element», is an array. For example:

SetContains(\(1 .. 10), [9, 10, 11]) → [1, 1, 0]

Note that the second parameter [9, 10, 11] is a list of potential set elements, not itself a set.
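The element-wise behavior over a list of candidates can be modeled in Python. This is an illustrative sketch, not Analytica code: set_contains is a hypothetical helper, and truth values are written as 1 and 0 to match the output shown above.

```python
def set_contains(s, element):
    """Return 1/0 for one element, or a list of 1/0 for a list of elements."""
    if isinstance(element, list):
        # a list of candidates is tested element by element
        return [int(e in s) for e in element]
    return int(element in s)


one_to_ten = list(range(1, 11))
print(set_contains(one_to_ten, [9, 10, 11]))  # [1, 1, 0]
print(set_contains(one_to_ten, 4))            # 1
```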

Function SetsAreEqual

SetsAreEqual(sets, I, ignoreNull)

Returns true if all the sets in the first parameter have exactly the same elements, ignoring duplicates and ordering (and ignoring Null values, unless the optional parameter «ignoreNull» is specified as False).

Local L1 := [1, 1, 1, 2, 3];

Local L2 := [3, 2, 2, 1];

Local L3 := [2, 3, 1, Null];

SetsAreEqual([\L1, \L2, \L3]) → 1

In this example, all three sets are treated as the set {1, 2, 3}. But:

SetsAreEqual([\L1, \L2, \L3], ignoreNull: False) → 0

With «ignoreNull» set to False, the set \L3 includes the Null value, and so is not identical to sets \L1 and \L2.
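The two results above can be reproduced with a Python model of the comparison. This is a sketch, not Analytica code: sets_are_equal is a hypothetical helper, None stands in for Null, and Python's built-in set type supplies the "ignore duplicates and ordering" behavior.

```python
def sets_are_equal(sets, ignore_null=True):
    """True if every list has the same elements, ignoring order and duplicates."""
    def normalize(values):
        # drop None (Null) only when ignore_null is set
        return {v for v in values if not (ignore_null and v is None)}

    normalized = [normalize(s) for s in sets]
    return all(n == normalized[0] for n in normalized)


L1 = [1, 1, 1, 2, 3]
L2 = [3, 2, 2, 1]
L3 = [2, 3, 1, None]
print(int(sets_are_equal([L1, L2, L3])))                     # 1
print(int(sets_are_equal([L1, L2, L3], ignore_null=False)))  # 0
```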

If Table T is indexed by Row and Col, this expression tests if each Row contains the same items (ignoring ordering or repeated items):

SetsAreEqual(\[Col]T, Row)

The first parameter specifies that each Col-vector (i.e., each row) is taken as a set. The index parameter, Row, specifies that the comparison takes place along the Row index of T.

History

The set functions were introduced in Analytica 4.3.