Difference between revisions of "ParseJSON"

Revision as of 19:05, 5 September 2017


New to Analytica 5.0

This function requires the Analytica Enterprise edition or higher (e.g., Analytica Optimizer, ADE or CubePlan).

ParseJSON( json, objectSchema..., flags )

Parses JSON text in «json» to generate corresponding Analytica data and arrays. Usually, you will obtain the «json» text from a call to ReadTextFile or ReadFromUrl. It works without an «objectSchema», but you can specify an «objectSchema» to help map the structure of the data onto indexes in your model.

JavaScript Object Notation (JSON) is a widely used lightweight data-interchange format. It is easy for humans and machines to read and write.

Parameters

  • «json»: A JSON-formatted text to parse.
  • «objectSchema»: A schema describing the JSON class structure and mapping it into Analytica index(es). See Parsing with a schema. If given no «objectSchema», see Parsing without a schema below.
  • «flags»: (optional) A bit field of flags that control various aspects of parsing. Bit settings are
    • 1 = During schema-free parsing, create local indexes .Dim1, .Dim2, ... for arrays. Without this, each level of an array is returned as a reference to a list.

Parsing without a schema

The top-level item in a JSON text document should be a JSON object, for example:

Variable json1 
Definition: 
    '{ "title" : "1984",
       "author" : "George Orwell",
       "year" : 1949,
       "pages" : 336,
       "paperback" : true
     }'

Here is the result of parsing that JSON text without providing a schema:

ParseJSON(json1)
ParseJSON1.png

The result has a local index named .Member identifying the fields of the object. The value of 'title' is text because it was enclosed in quotes ("1984"). The value of 'year' is a number because it was not enclosed in quotes:

Variable parse1a := ParseJSON(json1)
TypeOf(parse1a[.Member='title']) → 'Text'
TypeOf(parse1a[.Member='year']) → 'Number'
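
The quoting rule comes from JSON itself, so it can be checked outside Analytica. The following is a minimal sketch using Python's standard json module as an analogy only; it is not Analytica code, and the variable names are illustrative.

```python
import json

# Same text as json1 above.
json1 = '''{ "title" : "1984",
       "author" : "George Orwell",
       "year" : 1949,
       "pages" : 336,
       "paperback" : true
     }'''

parsed = json.loads(json1)

# "1984" is quoted, so it parses as text; 1949 is unquoted, so it parses as a number.
title_type = type(parsed["title"]).__name__
year_type = type(parsed["year"]).__name__
print(title_type, year_type)
```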

JSON with nested objects and no schema

With a JSON document containing nested objects, ParseJSON creates local indexes at each level. To prevent the indexes from combining into a rectangular array, it places each member object in a reference.

Variable json2 
Definition: 
    '{ "title" : "1984",
       "author" : { "first" : "George", "last" : "Orwell" },
       "year" : 1949,
       "pages" : 336,
       "paperback" : true
     }'
Variable parse2 := ParseJSON(json2)
parse2
ParseJSON2.png
#parse2[.Member='author']
ParseJSON2b.png

Reading JSON arrays

With no schema, ParseJSON does not map array data to existing indexes that you might have. There are two ways to read array data, depending on whether you set the «flags» parameter. By default, if you don't specify «flags», it returns arrays as lists, and any nested array data as references to lists (to avoid having the implicit dimensions combine with other indexes).

Variable json3 := '{ "data": [ [ 1,2], [3,4], [5,6] ] }'
Variable parse3 := ParseJSON(json3)
#parse3[.Member='data']
Parse3.png
#Slice(#parse3[.Member='data'],3)
ParseJSON3b.png

If you set «flags» to 1, ParseJSON creates local indexes, named .Dim1, .Dim2, etc., for each nesting level in «json», and produces a multi-dimensional array without nesting.

Variable parse3b := ParseJSON(json3, flags:1)
#parse3b[.Member='data']
ParseJSON3a.png
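
To see how many local indexes this produces and how long each one is, it can help to measure the nesting of the array data. The sketch below is a Python analogy, not Analytica code; nesting_lengths is a hypothetical helper, and it assumes the data is rectangular (every inner array at a level has the same length).

```python
import json

def nesting_lengths(value):
    """Return the length of each nesting level, outermost first.

    Assumes rectangular data, so only the first element at each level is inspected.
    """
    lengths = []
    while isinstance(value, list) and value:
        lengths.append(len(value))
        value = value[0]
    return lengths

data = json.loads('{ "data": [ [ 1,2], [3,4], [5,6] ] }')["data"]

# The first entry corresponds to .Dim1 (outermost level), the second to .Dim2.
dims = nesting_lengths(data)
print(dims)
```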

Parsing with a schema

A schema describes the data structure of a JavaScript object, and specifies the index in your model that encodes the object.

JSON object schemas

The class structure for a JavaScript object is described as a 1-D array, where the index contains the member names, and the cell values describe the nested structure. For example, consider the earlier json2 data:

    '{ "title" : "1984",
       "author" : { "first" : "George", "last" : "Orwell" },
       "year" : 1949,
       "pages" : 336,
       "paperback" : true
     }'

This has a top-level JavaScript object (Book) and a nested JavaScript object (PersonName). We can encode these as two object schemas, as follows.

Index Book := ['title', 'author', 'year', 'pages', 'paperback']
Index PersonName := ['first', 'last']
Variable BookSchema := Table(Book)('atom',Handle(PersonName),'atom','atom','atom')
Variable NameSchema := Array(PersonName,'atom')
Variable parse4 := ParseJSON(json2, BookSchema, NameSchema )
parse4
ParseJSON withSchema1.png
#parse4[Book='author']
ParseJSON withSchema2.png

Notice that the existing indexes (Book and PersonName) are used, rather than local indexes created by the function.

When specifying multiple object schemas, the first «objectSchema» listed must be the top-level object.

The labels in the index must match the JSON object's member names exactly (case-sensitive), but the ordering of your index labels does not need to match the order in which the member values appear in the JSON. You can include extra member names in your index, but every member that appears in the «json» data must appear in your index, or an error results.
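
The matching rule can be checked before parsing. The sketch below is a Python analogy, not Analytica code; unknown_members is a hypothetical helper, and the 'isbn' label is an invented extra label added only to show that unused labels are harmless.

```python
import json

def unknown_members(index_labels, json_text):
    """Return the member names in the JSON object that are missing from the index labels."""
    obj = json.loads(json_text)
    labels = set(index_labels)  # a set, because label order does not matter
    # Python string comparison is case-sensitive, mirroring the rule above.
    return [name for name in obj if name not in labels]

# 'isbn' is an extra label that the JSON never uses; that is allowed.
book_labels = ['title', 'author', 'year', 'pages', 'paperback', 'isbn']

ok = unknown_members(book_labels, '{"year": 1949, "title": "1984"}')
bad = unknown_members(book_labels, '{"Title": "1984"}')
print(ok, bad)
```

An empty result means every member in the JSON has a matching label. A non-empty result lists the members that would cause the error described above.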

Member schema options

The following options can be used in a cell of an «objectSchema», each describing what is expected for the value of the corresponding member.

  • 'atom': The text 'atom' specifies that the data for that member shall not be an object or an array. It can be text (surrounded by double quotes), a number, or one of the keywords null, true or false.
  • Null: Any valid «json» is allowed, and the json appearing for that element is parsed without a schema.
  • Handle(index): A handle to an index specifies that a JSON-object is expected, with member names that match the elements of «index». If a schema for that index appears in «objectSchema», then that schema guides the parsing. The result for this member will be a reference to a 1-D array indexed by «index».
  • \ListOfHandles(I,J,K): A reference to a list of handles to indexes specifies that a JSON-array is expected here, and the indexes specify the indexes for the result, and the index order. The first index (i.e., «I») corresponds to the outermost index in the JSON array. When 2 or more indexes are listed, the final index can be either an array index or an object index. An object index is used when the «json» contains an array of objects.

Reading arrays with schema

The JSON standard expects the outermost item to be an object, so parsing with a schema always starts with the first «objectSchema». When a member contains an array, the member schema should be a reference to a list of index handles. The indexes in that list specify the indexes for the resulting array.

Variable json3 := '{ "data" : [ [ 1,2], [3,4], [5,6] ] }'
Index J := [1, 2, 3]
Index K := ['k1', 'k2']
Index D := ['data']
Variable D_Schema := Table(D)( \ListOfHandles( J,K ) )
ParseJSON( json3, D_Schema )
ParseJSON withSchema3a.png
#ParseJSON( json3, D_Schema )[D='data']
ParseJSON withSchema3b.png

In the next example, the JSON contains an array of books, so that each item in the JSON-array is an object.

Variable json5
Definition:
'{ "bibliography" :
    [ { "title" : "1984",
        "author" : { "first" : "George", "last" : "Orwell" },
        "year" : 1949,
        "pages" : 336,
        "paperback" : true
       },
       { "title" : "The Time Machine",
          "author" : { "first" : "H. G.", "last" : "Wells" },
          "year" : 1895,
          "pages" : 118,
          "paperback" : true
       }
    ]
 }'

The schema BiblioSchema has a reference to a list of handles, indicating that an array result is expected, and specifying the indexes for the result. The last index listed is an object index with an «objectSchema» (i.e., Book, for which BookSchema is provided), thus encoding that an array of book objects is expected.

Index Biblio := ['bibliography']
Index Book_Num := 1..2
Variable BiblioSchema := Table(Biblio)(\ListOfHandles(Book_Num, Book))
Variable parse5 := ParseJSON( json5, BiblioSchema, BookSchema, NameSchema )
#parse5[Biblio='bibliography']
ParseJSON withSchema5.png

In this example, we had to know the number of books in advance. This is a limitation: the indexes that appear in the schema must be long enough to accommodate the data. It is acceptable to specify more entries than actually exist in the data, for example:

Index Book_Num := 1..1K

which has space for one thousand books, even though only two appear in the «json». In this case, the excess slices along Book_Num contain Null. When you cannot guarantee that an index is long enough, you will need to use schema-free parsing for that member (put a Null in that member's schema).
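
The padding behavior can be pictured as follows. This is a Python analogy, not Analytica code: fit_to_index is a hypothetical helper, None stands in for Null, and raising an error when the index is too short is an assumption made for illustration.

```python
import json

def fit_to_index(items, index_length):
    """Pad a parsed JSON array with None up to the length of the target index."""
    if len(items) > index_length:
        # Assumption for illustration: data longer than the index is treated as an error.
        raise ValueError("index is too short for the data")
    return items + [None] * (index_length - len(items))

books = json.loads('[{"title": "1984"}, {"title": "The Time Machine"}]')

# An index of length 5 leaves three excess slices, each filled with None.
padded = fit_to_index(books, 5)
print(padded)
```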
