# Sets - collections of unique elements


In mathematics, a set is a collection of unique elements. A set can itself be treated as a self-contained entity, so that you can, for example, build an array of sets. In such applications, you do not want the elements of the set to be treated as an array dimension or to interact with array abstraction; instead, you want the set to be treated as a single atomic entity to which operations such as set intersection and set union can be applied. A simple list representation is therefore unsuitable, because the list itself would act as an implicit dimension and hence would interact with array abstraction.

Representation of a set: A natural way to represent a set in Analytica is as a reference to a list of elements. Reference operations are covered in a more general context in References and Data Structures, but in the context of sets, you can simply view the reference operator, \L, as an operator that takes a list of elements, «L», and returns a set, and the dereference operator, #S, as an operator that returns the list of elements in a set.

Literal sets: The following syntax is used to write an explicit set of literal elements:

\(['a','b','c','d'])

You cannot omit the round parentheses and write \['a','b','c','d'], because square brackets following a reference operator are used to specify the indexes consumed by the reference.
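
As a minimal sketch of the round trip between a list and a set, the following uses only the \ and # operators described above. The local names L and S are illustrative, and the /* ... */ comments are annotations rather than part of the result:

```
Var L := ['a', 'b', 'c', 'd'];
Var S := \L;    /* S is one atomic value: a reference to the list L */
#S              /* dereferencing gives back the list ['a', 'b', 'c', 'd'] */
```

Because S is a single value rather than a list, it can sit in one cell of an array without adding a dimension.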

Creating sets from an array: Given a 2-D array, A, indexed by I and J, the expression \[I]A returns a 1-D array of sets indexed by J. For each element of J, the vector indexed by I becomes a set. At this point, duplicate elements have not been removed, so it isn’t a strict set yet; however, it is now ready to be treated as a set by the set functions described in this section.

Variable Array_s :=

| Index_a ▼ / Index_b ▶ | 1 | 2 | 3 |
|---|---|---|---|
| a | 7 | -3 | 1 |
| b | -4 | -1 | 6 |
| c | 5 | 0 | -2 |

\[Index_a]Array_s →

| Index_b ▶ | 1 | 2 | 3 |
|---|---|---|---|
| | \([7, -4, 5]) | \([-3, -1, 0]) | \([1, 6, -2]) |

Display of sets: In a result table, each set displays as «ref».

Double clicking on «ref» displays the elements of the set in a new result window.

Null values in sets: Set functions, like most other array functions, ignore Null values by default. This makes it possible to pad an array with Null values to hold the elements of a set when the number of elements is less than the length of the array. However, it also means that a set cannot contain the Null value by default. When you want to include Null as an actual element, and not just as a value to be ignored, the various set functions provide an optional boolean parameter to indicate this intent.
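
A short sketch of this default, using SetUnion and its «keepNull» parameter, both documented later in this section. The sets A and B are made up for illustration, and the position of Null within the second result is not something this sketch asserts:

```
Var A := \([1, Null, 2]);
Var B := \([2, 3]);
#SetUnion([A, B])                   /* Null is ignored: the elements are 1, 2 and 3 */
#SetUnion([A, B], keepNull: true)   /* Null is kept as an element alongside 1, 2 and 3 */
```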

## SetContains(set, element)

Returns true (1) when «element» is an element of «set», and false (0) otherwise.

Example:

SetContains(\(['a', 'b', 'c', 'd']), 5) → 0
SetContains(\Sequence(1, 100, 7), 85 ) → 1
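
A set can also be stored in a local variable before it is tested. The following sketch assumes nothing beyond the operators shown above; the name evens is illustrative:

```
Var evens := \Sequence(0, 10, 2);   /* the set of 0, 2, 4, 6, 8, 10 */
SetContains(evens, 4) → 1
SetContains(evens, 5) → 0
```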

## SetDifference(originalSet, remove1, remove2, ..., resultIndex, keepNull)

Returns the set that results when the elements in the sets «remove1», «remove2», ..., as well as any duplicate elements, are removed from «originalSet». When «resultIndex» is unspecified or false, the result is a reference to a list of elements. When an index is specified in the «resultIndex» parameter, the result is a 1-D array indexed by «resultIndex». When «keepNull» is unspecified or false, any Null values in «originalSet» are ignored (and won’t be in the result). When «keepNull» is true, Null is treated as a legitimate element. See also SetDifference().

Example:

Var S := \(0 .. 10);
Var S2 := \Sequence(0, 10, 2);
Var S3 := \Sequence(0, 10, 3);
#SetDifference(S, S2, S3) → [1, 5, 7]
Index I := 1 .. 5;
SetDifference(S, S2, S3, resultIndex: I) →

| .I ▶ | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| | 1 | 5 | 7 | «null» | «null» |

Set difference can be used to remove duplicates from a list:

Var L := [Null, 'a', 'b', 'c', 'd', 'b', 'c', 'd'];
#SetDifference(\L) → ['a', 'b', 'c', 'd']
#SetDifference(\L, keepNull: true) → [«null», 'a', 'b', 'c', 'd']

## SetIntersection(sets, i, resultIndex, keepNull)

Returns the set intersection of «sets». The «sets» parameter should be an array of sets indexed by «i», or «i» can be omitted and «sets» specified as a list of sets. When «resultIndex» is omitted, the result is a set (a reference to a list) containing the elements common to all «sets». Null values are ignored (and not included in the result) unless «keepNull» is specified as true, in which case Null is treated like any other element. See also SetIntersection().

Example:

Var S1 := \(['a', 'b', 'c', null,'d']);
Var S2 := \(['b', 'c', null, 'e']);
SetIntersection([S1, S2]) → \(['b', 'c'])
SetIntersection([S1, S2], keepNull: true) → \(['b', 'c', «null»])
#SetIntersection([\('a' .. 'p'),\('k' .. 'z')]) → ['k', 'l', 'm', 'n', 'o', 'p']

The following example finds all numbers under 10,000 that are divisible by each of the first five prime numbers:

Index n := [2, 3, 5, 7,11];
Var sets := (for j := n do \Sequence(j, 10K, j));
#SetIntersection(sets, n) → [2310, 4620, 6930, 9240]

## SetsAreEqual(sets, i, ignoreNull)

Tests whether all the sets in «sets» have the same elements. The parameter «sets» should be an array of sets indexed by «i», or «i» can be omitted and «sets» specified as a list of sets. Null values are ignored unless «ignoreNull» is specified as false. The presence of duplicates does not affect the equality test. See also SetsAreEqual().

Example:

Var L1 := ['a', 'b', 'c', null];
Var L2 := ['b', 'c', 'a'];
Var L3 := ['c', 'b', 'a', 'b'];
SetsAreEqual([\L1, \L2, \L3]) → 1
SetsAreEqual([\L1, \L2, \L3], ignoreNull: false) → 0
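
As a sketch of the indexed form, the following reuses Array_s, Index_a and Index_b from the "Creating sets from an array" example earlier in this section, and assumes «i» can be passed positionally as in the signature above. The three sets along Index_b hold different elements, so the test is false:

```
/* \[Index_a]Array_s is an array of three sets indexed by Index_b */
SetsAreEqual(\[Index_a]Array_s, Index_b) → 0
```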

## SetUnion(sets, i, resultIndex, keepNull)

Returns a collection of all the elements that appear in any of the sets in «sets». The parameter «sets» is an array of sets (references to lists or to 1-D arrays) indexed by «i», or it may be a list of sets when «i» is omitted. When «resultIndex» is omitted, the result is a reference to the list of elements. When an index is provided to the «resultIndex» parameter, the result is a 1-D array indexed by «resultIndex». Null values are ignored (and not included in the result) unless «keepNull» is specified as true. See also SetUnion().

Example:

#SetUnion([\('a' .. 'd'), \('c' .. 'f')]) → ['a', 'b', 'c', 'd', 'e', 'f']
Index m := Sequence(1-Jan-2011, 1-May-2011, dateUnit: 'M');
Index d := [0, 14];
#SetUnion(SetUnion(\[d](m + d), d), m) →
[1-Jan-2011, 15-Jan-2011, 1-Feb-2011, 15-Feb-2011, 1-Mar-2011, 15-Mar-2011, 1-Apr-2011, 15-Apr-2011, 1-May-2011, 15-May-2011]
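
The «resultIndex» form can be sketched as follows. The index K is hypothetical and deliberately longer than the six-element union; by analogy with the SetDifference example above, the unused trailing cells would be expected to hold «null», but that padding is an assumption here rather than something stated for SetUnion:

```
Index K := 1 .. 8;
/* the result is a 1-D array indexed by K rather than a reference */
SetUnion([\('a' .. 'd'), \('c' .. 'f')], resultIndex: K)
```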