Difference between revisions of "ParseJSON"

Latest revision as of 23:48, 6 September 2017


New to Analytica 5.0

This function requires the Analytica Enterprise edition or higher (e.g., Analytica Optimizer, ADE or CubePlan).

ParseJSON( json, objectSchema..., flags )

Parses JSON text in «json» to generate corresponding Analytica data and arrays. Usually, you will obtain the «json» text from a call to ReadTextFile or ReadFromUrl. The «objectSchema» is optional: without one, the function creates local indexes for the result; with one, it maps the structure of the data onto indexes in your model.

JavaScript Object Notation (JSON) is a widely used lightweight data-interchange format. It is easy for humans and machines to read and write.

Parameters

  • «json»: A JSON-formatted text to parse.
  • «objectSchema»: A schema describing the JSON class structure and mapping it into Analytica index(es). See Parsing with a schema. If given no «objectSchema», see Parsing without a schema below.
  • «flags»: (optional) A bit field of flags that control various aspects of parsing. Bit settings are
    • 1 = During schema-free parsing, create local indexes .Dim1, .Dim2, ... for arrays. Without this, each level of an array is returned as a reference to a list.

Parsing without a schema

The top-level item in a JSON text document should be a JSON object, for example:

Variable json1 
Definition: 
    '{ "title" : "1984",
       "author" : "George Orwell",
       "year" : 1949,
       "pages" : 336,
       "paperback" : true
     }'

Here is the result of parsing that JSON text without providing a schema:

ParseJSON(json1)
ParseJSON1.png

The result has a local index named .Member identifying the fields of the object. The value of 'title' is text because "1984" is enclosed in quotes. The value of 'year' is a number because 1949 is not enclosed in quotes:

Variable parse1a := ParseJSON(json1)
TypeOf(parse1a[.Member='title']) → 'Text'
TypeOf(parse1a[.Member='year']) → 'Number'
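
The quoted-versus-unquoted rule comes from JSON itself, so any JSON parser shows the same distinction. The following is a minimal sketch using Python's standard json module as an analogy only; it is not Analytica code, and the type names it reports are Python's ('str', 'int'), where Analytica reports 'Text' and 'Number'.

```python
import json

# Same JSON text as json1 above.
json1 = '''{ "title" : "1984",
  "author" : "George Orwell",
  "year" : 1949,
  "pages" : 336,
  "paperback" : true }'''

book = json.loads(json1)

# Record the Python type name of each member's value.
kinds = {name: type(value).__name__ for name, value in book.items()}
print(kinds)
```

Note that "1984" stays text even though it looks numeric, because the quotes determine the type.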

JSON with nested objects and no schema

With a JSON document containing nested objects, it creates local indexes at each level. To prevent the indexes from combining into a rectangular array, it places each member object in a reference.

Variable json2 
Definition: 
    '{ "title" : "1984",
       "author" : { "first" : "George", "last" : "Orwell" },
       "year" : 1949,
       "pages" : 336,
       "paperback" : true
     }'
Variable parse2 := ParseJSON(json2)
parse2
ParseJSON2.png
#parse2[.Member='author']
ParseJSON2b.png

Reading JSON arrays

With no schema, ParseJSON does not map array data to any existing indexes in your model. There are two ways to read arrays, depending on whether you set the «flags» parameter. By default, if you don't specify «flags», it returns arrays as lists, and returns any nested array data as references to lists (to prevent the implicit dimensions from combining with other indexes).

Variable json3 := '{ "data": [ [ 1,2], [3,4], [5,6] ] }'
Variable parse3 := ParseJSON(json3)
#parse3[.Member='data']
Parse3.png
#Slice(#parse3[.Member='data'],3)
ParseJSON3b.png

If you set «flags» to 1, ParseJSON creates local indexes, named .Dim1, .Dim2, etc., for each nesting level in «json», and produces a multi-dimensional array without nesting.

Variable parse3b := ParseJSON(json3, flags:1)
#parse3b[.Member='data']
ParseJSON3a.png
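
To see how many dimensions, and of what length, a nested JSON array would need, you can measure each nesting level. This is a rough Python sketch, not Analytica code; the helper name nesting_lengths is hypothetical, and it assumes the array is rectangular (every sub-array at a given level has the same length).

```python
import json

def nesting_lengths(value):
    """Return the length of each nesting level, outermost first.

    Assumes a rectangular array: only the first element at each
    level is inspected.
    """
    lengths = []
    while isinstance(value, list) and value:
        lengths.append(len(value))
        value = value[0]
    return lengths

json3 = '{ "data": [ [ 1,2], [3,4], [5,6] ] }'
data = json.loads(json3)["data"]
dims = nesting_lengths(data)
print(dims)  # [3, 2]
```

For json3 there are two nesting levels, which corresponds to the two local indexes .Dim1 and .Dim2 described above.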

Parsing with a schema

A schema describes the data structure of a JavaScript object, and the indexes in your model that you want to map it to.

JSON object schemas

The class structure for a JavaScript object is described by a 1-D array, where the index contains the member ("field") names, and the cell values may include a nested structure. For example, consider this json2 data again:

    '{ "title" : "1984",
       "author" : { "first" : "George", "last" : "Orwell" },
       "year" : 1949,
       "pages" : 336,
       "paperback" : true
     }'

This JSON text has a top-level object (Book) and a nested object (PersonName). We can encode an «objectSchema» for each as follows:

Index Book := ['title', 'author', 'year', 'pages', 'paperback']
Index PersonName := ['first', 'last']
Variable BookSchema := Table(Book)('atom',Handle(PersonName),'atom','atom','atom')
Variable NameSchema := Array(PersonName,'atom')
Variable parse4 := ParseJSON(json2, BookSchema, NameSchema )
parse4
ParseJSON withSchema1.png
#parse4[Book='author']
ParseJSON withSchema2.png

The result uses the indexes, Book and PersonName, rather than the local indexes that you get if you don't provide a schema parameter.

When specifying a schema with multiple objects, you must give the top-level «objectSchema» first.

The labels in the index must match the JSON object's member names exactly, including case. The order of your index labels does not need to match the order of the members in the JSON. You can include extra labels ("field names") in your index, but the index must contain a label matching every member that appears in the «json» data, or ParseJSON gives an error.
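
The matching rule can be checked before parsing. The following is a minimal Python sketch of that rule, not Analytica code; the helper name missing_labels is hypothetical, and it only inspects the top-level object.

```python
import json

def missing_labels(json_text, index_labels):
    """List the top-level member names that have no matching index label.

    The comparison is exact and case-sensitive. Extra labels in
    index_labels are allowed and label order is ignored.
    """
    members = json.loads(json_text).keys()
    allowed = set(index_labels)
    return [name for name in members if name not in allowed]

json_text = '{ "title": "1984", "year": 1949 }'

# Different order plus an extra label: every member is covered.
ok = missing_labels(json_text, ["year", "title", "pages"])

# 'Title' differs in case from 'title', so 'title' is not covered.
bad = missing_labels(json_text, ["Title", "year"])

print(ok, bad)
```

An empty result means the index covers the data; any name returned is a member that would have no label to land on.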

Member schema options

The following options can be used in a cell of an «objectSchema», each describing what is expected for the value of the corresponding member.

  • 'atom': The text 'atom' specifies that the data for that member shall not be an object or an array. It can be text (surrounded by double quotes), a number, or one of the keywords null, true or false.
  • Null: Any valid «json» is allowed, and the json appearing for that element is parsed without a schema.
  • Handle(index): A handle to an index specifies that a JSON-object is expected, with member names that match the elements of «index». If a schema for that index appears in «objectSchema», then that schema guides the parsing. The result for this member will be a reference to a 1-D array indexed by «index».
  • \ListOfHandles(I,J,K): A reference to a list of handles to indexes specifies that a JSON-array is expected here, and the indexes specify the indexes for the result, and the index order. The first index (i.e., «I») corresponds to the outermost index in the JSON array. When 2 or more indexes are listed, the final index can be either an array index or an object index. An object index is used when the «json» contains an array of objects.
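
When drafting a schema for unfamiliar data, it can help to see which kind of value each member holds. This is an illustrative Python sketch, not Analytica code; the helper name schema_kind is hypothetical. It reports 'atom' for scalar values, 'object' where a Handle(index) entry would fit, and 'array' where a \ListOfHandles(...) entry would fit. A Null entry accepts any of these.

```python
import json

def schema_kind(value):
    """Classify a parsed JSON value as 'object', 'array', or 'atom'."""
    if isinstance(value, dict):
        return "object"   # a JSON object
    if isinstance(value, list):
        return "array"    # a JSON array
    return "atom"         # text, number, true, false, or null

json2 = '''{ "title" : "1984",
  "author" : { "first" : "George", "last" : "Orwell" },
  "year" : 1949,
  "pages" : 336,
  "paperback" : true }'''

kinds = {name: schema_kind(v) for name, v in json.loads(json2).items()}
print(kinds)
```

For json2, only 'author' is an object, which is why BookSchema above holds Handle(PersonName) in that cell and 'atom' elsewhere.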

Reading arrays with schema

The JSON standard expects the outermost item to be an object, so the schema always starts with the first «objectSchema». When a member contains an array, its schema should be a reference to a list of index handles containing the indexes for the result.

Variable json3 := '{ "data" : [ [ 1,2], [3,4], [5,6] ] }'
Index J := [1, 2, 3]
Index K := ['k1', 'k2']
Index D := ['data']
Variable D_Schema := Table(D)( \ListOfHandles( J,K ) )
ParseJSON( json3, D_Schema )
ParseJSON withSchema3a.png
#ParseJSON( json3, D_Schema )[D='data']
ParseJSON withSchema3b.png

In this example, the JSON contains an array of books, so each item in the JSON-array is a book object.

Variable json5
Definition:
'{ "bibliography" :
    [ { "title" : "1984",
        "author" : { "first" : "George", "last" : "Orwell" },
        "year" : 1949,
        "pages" : 336,
        "paperback" : true
       },
       { "title" : "The Time Machine",
          "author" : { "first" : "H. G.", "last" : "Wells" },
          "year" : 1895,
          "pages" : 118,
          "paperback" : true
       }
    ]
 }'

The schema BiblioSchema has a reference to a list of handles, indicating that an array result is expected, and specifying the indexes for the result. The last index listed is an object index with its own «objectSchema» (i.e., Book, for which BookSchema is provided), thus encoding that an array of book objects is expected.

Index Biblio := ['bibliography']
Variable BiblioSchema := Table(Biblio)(\ListOfHandles(Book_Num, Book))
Index Book_Num := 1..2
Variable parse5 := ParseJSON( json5, BiblioSchema, BookSchema, NameSchema )
#parse5[Biblio='bibliography']
ParseJSON withSchema5.png

In this example, we had to know the number of books in advance. Your schema index(es) must be long enough to accommodate the data, or any data that doesn't fit is lost. The index can be longer than needed. For example,

Index Book_Num := 1..1K

has space for one thousand books, even though only two appear in the «json». The excess slices along Book_Num contain Null. If you can't be sure that an index will be long enough, you should use schema-free parsing for that member -- i.e. put a Null in that member's schema.
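
The effect of index length described above can be pictured as fitting a list into a fixed number of slots. This is a minimal Python sketch of that description, not Analytica code; the helper name fit_to_index is hypothetical, and Python's None stands in for Null.

```python
def fit_to_index(items, index_length):
    """Fit a list into a fixed number of slots.

    Items beyond index_length are dropped, and unused slots are
    filled with None (standing in for Null).
    """
    kept = items[:index_length]
    return kept + [None] * (index_length - len(kept))

titles = ["1984", "The Time Machine"]

longer = fit_to_index(titles, 4)   # index longer than the data
shorter = fit_to_index(titles, 1)  # index too short: second book is lost

print(longer)
print(shorter)
```

The too-short case loses data without any padding to signal it, which is why schema-free parsing is the safer choice when the array length is unknown.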

See Also
