Revision as of 01:39, 20 May 2015

Sets

In mathematics, a set is a collection of unique elements. Analytica represents a set as a reference to a list or 1-D array, for example:

Var A_list := ['a','b','c','d'];
Var A_set := \ A_list;

The backslash, "\", in front of A_list returns a Reference to the list. The Set Functions described below work on Sets represented like this.

This representation means a set is an atomic element (just a Reference), which allows Set functions to fully array abstract. They wouldn't be able to array abstract if sets were simply lists, for example if operating on multiple lists with different lengths or different indexes.

The following demonstrates this representation:

Suppose you have a 4-dimensional array, A, indexed by In1, In2, In3 and In4. The expression \[In4]A returns a 3-D array of sets, each set being one of the vectors indexed by In4. As seen, when using the reference operator, \ you can specify in brackets which index becomes the set dimension.
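
As a rough analogy outside Analytica, the following Python sketch models why wrapping each vector in a reference makes it atomic. It is not Analytica code: the `Ref` class and the `refs_along_last_axis` helper are hypothetical names invented for this illustration, and nested Python lists stand in for an indexed array.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    # Hypothetical stand-in for an Analytica reference: one opaque cell.
    items: tuple


def refs_along_last_axis(table):
    # Models the expression \[J]A for a 2-D table: each row (a J-vector)
    # is wrapped in a Ref, leaving a 1-D list with one cell per row.
    return [Ref(tuple(row)) for row in table]


# Rows of different lengths cannot form a rectangular array,
# but a 1-D list of Refs is always well formed.
ragged = [[3, 6, 9], [7, 4], [4, 8, 2, 6]]
sets = refs_along_last_axis(ragged)
print(len(sets))      # 3: one atomic cell per row
print(sets[1].items)  # (7, 4): the wrapped elements stay accessible
```

Because each cell is a single value, an operation written for one set can be applied cell by cell, which is the idea behind array abstraction over sets.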

Literal Sets

To create a set from a literal list, enclose the list in parentheses and precede it with a backslash, "\":

\([1,2,3])

This does not work if you omit the parentheses, because the backslash operator treats the brackets that follow it as an index specification rather than as a list:

\[1,2,3]    { **** Does not work **** }

A more obscure way to create a set from a literal list is to specify a Null index after the backslash "\":

\[Null][1,2,3]

The Empty Set

You can specify the Empty set (that contains nothing) simply as:

\([])      { The empty set }

The expression Null does not have the same meaning as the empty set. Just as the array functions Sum, Max, etc. ignore Null elements, so do the set functions SetIntersection, SetUnion, etc. The intersection of a set S with Null is therefore the set S, not the empty set.
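
The difference between Null and the empty set can be modelled with a small Python sketch. This is an illustration of the rule stated above, not Analytica's implementation: `set_intersection` is a hypothetical helper, Python's `None` stands in for Null, and plain lists stand in for sets. The text does not say what happens when every entry is Null, so the sketch simply raises an error in that case.

```python
def set_intersection(sets):
    # Skip None entries, the model's stand-in for Null.
    present = [s for s in sets if s is not None]
    if not present:
        # Assumption: this case is not covered by the text.
        raise ValueError("no non-Null sets to intersect")
    first, rest = present[0], present[1:]
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return [x for x in dict.fromkeys(first) if all(x in s for s in rest)]


print(set_intersection([[1, 2, 3], None]))  # [1, 2, 3]: Null is ignored
print(set_intersection([[1, 2, 3], []]))    # []: the empty set removes everything
```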

Converting a Set to a List

You can obtain a list from a Set using the dereference operator, #.

  # A_set  → ['a','b','c','d']

This operation does not array-abstract: You can apply it to a single set, but not to an array of sets.

Function SetContains

Function SetContains(s, element)

Returns true if element is contained in the set s.

SetContains(\(1 .. 10), [9, 10, 11] ) → [1, 1, 0]

Note that [9, 10, 11] is a list of potential set elements (not itself a set).
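
The element-wise behavior in this example can be sketched in Python. This is only a model under assumptions: `set_contains` is a hypothetical helper, a Python list plays the role of the list of candidate elements, and 1 and 0 stand for true and false as in the result shown above.

```python
def set_contains(s, element):
    # A list of candidates is tested element by element,
    # which mimics the membership test being applied to each cell.
    if isinstance(element, list):
        return [set_contains(s, e) for e in element]
    return 1 if element in s else 0


one_to_ten = range(1, 11)  # models the set \(1 .. 10)
print(set_contains(one_to_ten, [9, 10, 11]))  # [1, 1, 0]
print(set_contains(one_to_ten, 4))            # 1
```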

Function SetsAreEqual

Function SetsAreEqual(sets, I, ignoreNull)

Returns true if all the sets passed in the first parameter have exactly the same elements, ignoring duplicates and ordering (and ignoring Null values, unless the optional parameter «ignoreNull» is specified as False).

Var L1 := [1, 1, 1, 2, 3];
Var L2 := [3, 2, 2, 1];
Var L3 := [2, 3, 1, Null];
SetsAreEqual([\L1, \L2, \L3]) → 1

In this example, all three sets are treated as the set {1, 2, 3}. But:

SetsAreEqual([\L1, \L2, \L3], IgnoreNull: False) → 0

With IgnoreNull set to False, the set \L3 includes the Null value, and so is not identical to sets \L1 and \L2.

If table T is indexed by Row and Col, this expression tests whether every row contains the same set of items (ignoring ordering and repeated items):

SetsAreEqual(\[Col]T, Row)

The first parameter specifies that each Col-vector (i.e., each row) is taken as a set. The index parameter, Row, specifies that the comparison takes place along the Row index of T.
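
The comparison rules described in this section can be modelled in a few lines of Python. This is a sketch of the documented behavior, not Analytica's code: `sets_are_equal` and its `ignore_null` argument are hypothetical names, and `None` stands in for Null.

```python
def sets_are_equal(sets, ignore_null=True):
    def normalize(items):
        # A frozenset discards duplicates and ordering;
        # None is dropped only when ignore_null is true.
        return frozenset(x for x in items if not (ignore_null and x is None))

    normalized = [normalize(s) for s in sets]
    return int(all(n == normalized[0] for n in normalized))


l1 = [1, 1, 1, 2, 3]
l2 = [3, 2, 2, 1]
l3 = [2, 3, 1, None]
print(sets_are_equal([l1, l2, l3]))                     # 1
print(sets_are_equal([l1, l2, l3], ignore_null=False))  # 0

# Row-wise use: each row of a table is one set, compared across rows.
table = [[1, 2, 3], [3, 1, 2], [2, 2, 3, 1]]
print(sets_are_equal(table))  # 1
```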

Function SetIntersection

Function SetIntersection(sets, I, resultIndex, keepNulls)

Returns the set of elements in common to all the sets specified in the first parameter, «sets».

The first parameter is a list or array of sets. When this is or might be a multi-dimensional array, the optional second parameter, «I» specifies the index to operate over.

Consider the following array:

A :=	J →
I ↓	3	6	9	2	5	8
	7	4	1	8	5	2
	4	8	2	6	0	3

To intersect the rows of A, finding the elements common to all rows, treat each J-vector as a set (\[J]A) and operate over the I index as follows:

#SetIntersection( \[J]A, I ) → [2,8]

Optionally, you can map the result onto a pre-existing index. When a «resultIndex» is provided, an array is returned on that index containing the resulting elements. When «resultIndex» is not provided, the result is a set (a reference to a list). When «resultIndex» is too short to accommodate all elements in the result, only the first Size(resultIndex) elements of the result are returned. When it is longer than necessary, the final cells are padded with Null.

SetIntersection( \[J]A, I, resultIndex: J )

→

J →
2	8	«null»	«null»	«null»	«null»
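
The truncate-or-pad rule for «resultIndex» can be illustrated with a short Python sketch. The helper name `fit_to_index` is hypothetical, a Python list stands in for the result array, and `None` stands in for Null.

```python
def fit_to_index(elements, index_size):
    # Keep at most index_size elements, then pad the tail with None.
    kept = list(elements)[:index_size]
    return kept + [None] * (index_size - len(kept))


print(fit_to_index([2, 8], 6))  # [2, 8, None, None, None, None]
print(fit_to_index([2, 8], 1))  # [2]
```

The first call mirrors the example above, where the two common elements are mapped onto the six-element index J.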

To find the set of elements that two indexes have in common, use:

#SetIntersection( [\I,\J] )

Function SetUnion

Function SetUnion(sets, I, resultIndex, keepNulls)

Returns the set of all unique non-Null elements occurring in any of the sets passed in the first parameter, «sets».

The first parameter is an array of sets. To find the union of the elements in several lists or indexes, use:

#SetUnion( [\L1,\L2,\L3,\L4] )

Since the result is a set, i.e., a reference to a list, the dereference operator, # is applied to the result. That de-referenced result can then be used to define a new index.

To find the union of all unique elements occurring along the rows of a 2-dimensional array indexed by I and J, use:

#SetUnion( \[J]A, I )

This turns each row (each row being a slice along I indexed by J) into a set, resulting in a 1-D array of sets indexed by I, and then applies the union operation along the I dimension.

To include Null values in the result, specify the optional parameter «keepNulls» as true.

Var L1 := [1,3,Null,5];
Var L2 := [1,2,4,5];
SetUnion( [\L1,\L2] ) → \[1,3,5,2,4]
SetUnion( [\L1,\L2], keepNulls:true ) → \[1,3,Null,5,2,4]

If you specify the optional «resultIndex», the result is returned as an array (rather than as a reference) along the indicated index. The number of elements in the result is truncated to the index's length, or padded with Null values if the index is longer than necessary.
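
The effect of «keepNulls» can be modelled in Python as follows. This is a sketch under assumptions: `set_union` is a hypothetical helper, `None` stands in for Null, and the first-occurrence ordering is chosen only because it reproduces the results shown in the example above; the text does not state an ordering guarantee.

```python
def set_union(sets, keep_nulls=False):
    seen = {}  # dict keys keep first-seen order and drop duplicates
    for s in sets:
        for x in s:
            if x is None and not keep_nulls:
                continue
            seen.setdefault(x, None)
    return list(seen)


l1 = [1, 3, None, 5]
l2 = [1, 2, 4, 5]
print(set_union([l1, l2]))                   # [1, 3, 5, 2, 4]
print(set_union([l1, l2], keep_nulls=True))  # [1, 3, None, 5, 2, 4]
```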

Function SetDifference

Function SetDifference( originalSet, remove, remove2, remove3, ..., resultIndex, keepNull)

Returns the unique set of non-Null elements in «originalSet» that do not appear in any of the other sets, «remove».... The result is a set (reference to a list) unless «resultIndex» is specified. If «resultIndex» is specified, then an array indexed by the result index is returned, truncated to the length of «resultIndex» or padded with Null values.

SetDifference(\Sequence(1,10),\Sequence(2,10,2),\Sequence(3,10,3)) → \[1,5,7]

You can also remove individual elements:

SetDifference(\Sequence(1,4),3) → \[1,2,4]

Null values are not included in the result unless «keepNull» is specified as true.
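
The two examples above, removing whole sets and removing an individual element, can be reproduced with a small Python model. The helper `set_difference` and its `keep_null` argument are hypothetical names, `None` stands in for Null, and Python ranges stand in for Sequence.

```python
def set_difference(original, *remove, keep_null=False):
    excluded = set()
    for r in remove:
        # Each «remove» argument may be a collection or a single element.
        if isinstance(r, (list, tuple, range, set, frozenset)):
            excluded.update(r)
        else:
            excluded.add(r)
    result = []
    for x in original:
        if x is None and not keep_null:
            continue
        if x not in excluded and x not in result:
            result.append(x)
    return result


print(set_difference(range(1, 11), range(2, 11, 2), range(3, 11, 3)))  # [1, 5, 7]
print(set_difference(range(1, 5), 3))                                  # [1, 2, 4]
```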

Miscellaneous Usage Notes

When provided with only a single set, SetUnion(s), SetIntersection(s) and SetDifference(s) have the effect of removing duplicate (and Null) values. Hence, when A is an array without any Null values, these are all equivalent:

A[I=Unique(A,I)]

#SetUnion([\A])

#SetIntersection([\A])

#SetDifference(\A)

When using an index, I, SetContains(\I,x) and @[I=x]>0 are essentially equivalent. They are exactly equivalent when IndexValue(I) does not contain any handles, or when I is a meta-only index (or MetaIndex).
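
The near-equivalence of the membership test and the position test can be sketched in Python. This is a model only: `position` and `contains` are hypothetical helpers, and the sketch assumes that the position operator yields a 1-based position and 0 when the value is not found, which is what the comparison `>0` above suggests. Handles and meta-only indexes are outside the scope of this model.

```python
def position(index, x):
    # Assumed model of @[I=x]: 1-based position, or 0 if x is absent.
    return index.index(x) + 1 if x in index else 0


def contains(index, x):
    # Model of SetContains(\I, x).
    return 1 if x in index else 0


I = ['a', 'b', 'c', 'd']
for x in ['a', 'd', 'z']:
    print(x, contains(I, x), int(position(I, x) > 0))  # the two tests agree
```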

History

These set functions were introduced in Analytica 4.3.

@@ Line 1: / Line 1: @@
 [[Category:Set Functions]]
-''[[New to Analytica 4.3]]''
 = Sets =
-In mathematics, a ''set'' is a collection of non-repeated elements.  The functions described on this page operate on sets that are represented in Analytica as a [[Using References|reference]] to list or 1-D array.  With this representation, a set is seen as an atomic element by Analytica's array abstraction, thus allowing all these functions to fully array abstract even when the collections comprising the sets are of different lengths or have different indexes.
+In mathematics, a ''set'' is a collection of unique elements.  Analytica represents a set as a [[Using References|reference]] to a list or 1-D array, for example:
+ [[Var]] A_list := ['a','b','c','d'];
+ [[Var]] A_set := \ A_list;
+The backslash, "\", in front of ''A_list'' returns a [[Using References|Reference]] to the list.  The Set Functions described below work on Sets represented like this.
+This representation means a set is an atomic element (just a Reference), which allows Set functions to fully array abstract. They wouldn't be able to array abstract if sets were simply lists, for example if operating on multiple lists with different lengths or different indexes.
 The following demonstrates this representation:
- [[Var]] A_list := ['a','b','c','d'];
- [[Var]] A_set := [[Using References|\]]A_list;
- ...
-The backslash in front of ''A_list'' turns the list into a ''set'' in the manner expected by functions here.
 Suppose you have a 4-dimensional array, A, indexed by In1, In2, In3 and In4.  The expression <code>\[In4]A</code> returns a 3-D array of sets, each set being one of the vectors indexed by In4.  As seen, when using the [[Using References|reference operator, \]] you can specify in brackets which index becomes the ''set dimension''.
@@ Line 18: / Line 18: @@
 == Literal Sets ==
-To create a set from a literal list, you must either specify the ''Null dimension'' to the [[Using References|reference operator]], or you must surround the brackets by parentheses.  You cannot simply place a backslash in front of a literal list, since the [[Using References|backslash operator]] sees brackets and assumes that the brackets are specifying the indexes to swallow.  Here are two examples of how to express a literal set:
+To create a set from a literal list, you must enclose it in parentheses preceded by backslash "\":
- [[Using References|\]][ [[Null]] ][1,2,3]
+either specify the ''Null dimension'' to the [[Using References|reference operator]], or you must surround the brackets by parentheses:
   [[Using References|\]]([1,2,3])
-but
+It does not work if you omit the parentheses due to a syntactic ambiguity:
   [[Using References|\]][1,2,3]    { **** Does not work **** }
+A more obscure approach to create a Set from a literal list is to specify a Null index after the backslash "\":
+ \[Null][1,2,3]
 == The Empty Set ==
-When using the set functions described below, [[Null]] does not have the same meaning as the empty set.  Just as other array functions (e.g., [[Sum]], [[Max]], etc.) ignore [[Null]] elements, so do functions '''SetIntersection''', '''SetUnion''', etc.  The intersection of a set with [[Null]] is therefore the set itself, not the empty set.  The empty set is specified as:
+You can specify the Empty set (that contains nothing) simply as:
+  \([])      { The empty set }
-  \([])      { The empty set }
+The expression [[Null]] does not have the same meaning as the empty set.  Just as array functions [[Sum]], [[Max]], etc. ignore [[Null]] elements, so do set functions '''SetIntersection''', '''SetUnion''', etc.  Thus, the intersection of a set S with [[Null]] is therefore the set S, not the empty set.
 == Converting a Set to a List ==
-The [[Using References|dereference operator, #]] is used to convert a set back into a list.  This operation does not array-abstract, so you can apply it to a single set, but not to an array of sets.
+You can obtain a list from a Set using the [[Using References|dereference operator, #]].
+   # A_set  &rarr; ['a','b','c','d']
+This operation does not array-abstract: You can apply it to a single set, but not to an array of sets.
 = Function SetContains =
-  Function SetContains( set, element )
+  Function SetContains(s, element)
-Returns true if element is contained in the set.
+Returns true if element is contained in the set s.
-  ''SetContains''( \[[Sequence]](7,1000,7), [770,775,777] ) &rarr; [1,0,1]
+  ''SetContains''(\(1 .. 10), [9, 10, 11] ) &rarr; [1, 1, 0]
+Note that [9, 10, 11] is a list of potential set elements (not itself a set).
 = Function SetsAreEqual =
-  Function SetsAreEqual( sets'', I, ignoreNull'' )
+  Function SetsAreEqual(sets'', I, ignoreNull'')
-Returns true when all the sets passed into the first parameter have exactly the same elements, without regard to duplicates or ordering, and ignoring [[Null]] values (unless «ignoreNull» is explicitly specified to be false).
+Returns true if the first parameter is a list of sets that have exactly the same elements, ignoring duplicates or ordering (and ignoring [[Null]] values, unless optional parameter «ignoreNull» is specified as False).
 <code>
-:Var L1 := [1,1,1,2,3];
+:Var L1 := [1, 1, 1, 2, 3];
-:Var L2 := [3,2,2,1];
+:Var L2 := [3, 2, 2, 1];
-:Var L3 := [2,3,1,Null];
+:Var L3 := [2, 3, 1, Null];
-:'''SetsAreEqual'''( [\L1,\L2,\L3] ) &rarr; 1
+:'''SetsAreEqual'''([\L1, \L2, \L3]) &rarr; 1
 </code>
-In this example, all three sets are treated as the set {1,2,3}.  But:
+In this example, all three sets are treated as the set {1, 2, 3}.  But:
 <code>
-:'''SetsAreEqual'''( [\L1,\L2,\L3], ignoreNull:false ) &rarr; 0
+:'''SetsAreEqual'''([\L1, \L2, \L3], IgnoreNull: False) &rarr; 0
 </code>
+With IgnoreNull set to False, the set \L3 includes the [[Null]] value, and so is not identical to sets \L1 and \L2.
-With the optional parameter, the set \L3 is then understood to include the [[Null]] value.
+If Table T is indexed by Row and Col, this expression tests if each Row contains the same items (ignoring ordering or repeated items):
-The following tests whether every row of a table contains the same set of items (ignoring ordering), where T is indexed by Row and Col:
 <code>
-:SetsAreEqual( \[Col]T, Row )
+:SetsAreEqual(\[Col]T, Row)
 </code>
-The first parameter specifies that each ''Col''-vector (i.e., each row) is taken as a set.  The index parameter, ''Row'', specifie that the comparison takes place along the ''Row'' index of ''T''.
+The first parameter specifies that each ''Col''-vector (i.e., each row) is taken as a set.  The index parameter, ''Row'', specifies that the comparison takes place along the ''Row'' index of ''T''.
 = Function SetIntersection =
-  Function SetIntersection( sets'', I, resultIndex, keepNulls'' )
+  Function SetIntersection(sets'', I, resultIndex, keepNulls'')
 Returns the set of elements in common to all the sets specified in the first parameter, «sets».
-The first parameter is a list or array of sets.  When this is or might be a multi-dimensional array, then the second parameter, «I» specifies the index to operate over.
+The first parameter is a list or array of sets.  When this is or might be a multi-dimensional array, the optional second parameter, «I» specifies the index to operate over.
 Consider the following array:
@@ Line 120: / Line 128: @@
 = Function SetUnion =
-  Function SetUnion( sets'', I, resultIndex, keepNulls'' )
+  Function SetUnion(sets'', I, resultIndex, keepNulls'')
 Returns the set of all unique non-[[Null]] elements occurring in any of the sets passed in the first parameter, «sets».
@@ Line 179: / Line 187: @@
 * When using an index, ''I'', [[SetContains]](\I,x) and [[Index_Position_Operator::@|@[I=x]>0]] are essentially equivalent.  They are exactly equivalent when the [[IndexValue]](I) does not contain any handles, or is a meta-only index (or [[MetaIndex]]).
-= See Also =
+== See Also ==
 * [[Using References]]
@@ Line 185: / Line 193: @@
 * [[Unique]]
 * [[Sort]], [[SortIndex]]
+== History ==
+These set functions were ''[[New to Analytica 4.3]]''

Difference between revisions of "Set Functions"