Examples of Handling JSON Data with esProc

Blog 1251 0

JSON format multilevel semi-structured data is commonly seen in internet applications. Java provides just the class library for parsing JSON data, but to perform in-depth calculations, complex hardcoding is required.

esProc supports set-operations, order-related calculations and dynamic script execution, so it can be used as the class library for Java to make JSON data manipulation much easier. The Java application can access the esProc result through JDBC. For details, see How to Use esProc as the Class Library for Java.

Following is the typical JSON data manipulation cases, and their esProc solutions.

Grouping and aggregating JSON data

order.JSON contains the ordering records. The requirement is to calculate the sales amount that each client contributes per month in a specified time period. Below is a selection of the source data:

[{OrderID:”26″,Client:”TAS”,SellerId:”1″,Amount:2142,OrderDate:”05-08-2009 00:00:00″},{OrderID:”33″,Client:”DSGC”,SellerId:”1″,Amount:613,OrderDate:”14-08-2009 00:00:00″},

{OrderID:”84″,Client:”GC”,SellerId:”1″,Amount:89,OrderDate:”16-10-2009 00:00:00″},

{OrderID:”133″,Client:”HU”,SellerId:”1″,Amount:1420,OrderDate:”12-12-2010 00:00:00″},

{OrderID:”32″,Client:”JFS”,SellerId:”3″,Amount:468,OrderDate:”13-08-2009 00:00:00″}…

The code:

  A
1 =file(“D:\\order.JSON”).read().import@j()
2 =A1.select(OrderDate>=argBegin && OrderDate<=argEnd)
3 =A2.groups(month(OrderDate):Month,Client;sum(Amount):subtotal)

Import the JSON file as a two-dimensional table and perform a conditional query and then the grouping and aggregation. argBegin and argEnd are parameters. Here’s the result:

esProc_text_json_1

Handling JSON format HTTP flows

testServlet returns JSON format strings of employee information. Now query the data according to the given condition and return result to Java main program in JSON format.

  A
1 =httpfile(“http://localhost:6080/myweb/servlet/testServlet?table=employee&type=json”)
2 =A1.read()
3 =A2.import@j()
4 =A3.select(${where})
5 =export@j(A4)

Read in the http file object and parse the JSON format strings to generate a two-dimensional table; and then filter the table according to the condition and export the result as JSON format strings. The parameter where is the dynamic querying condition, like BIRTHDAY>=date(1981,1,1) && GENDER==”F”.

Converting cell phone logs into structured data

The cellphone.log file contains the JSON format cell phone log, in which general field and ping field are subdocuments that each has one record. Another subdocument cap_main has multiple records. The requirement is to display cap_main in a list table, with each record appended by general’s mtmodel field and networktype field and ping’s ssid field.

Below is the source data:

{“cap_main”:[{“pci”:0,”snr”:0,”rxlev”:-81,”locationtype”:161,”frequency”:0,”dl_speed”:0,”longitude_baidu”:103282882,
“networktype”:”WIFI”,”lac”:35042,”capid”:1395672278839,”lanip”:”192.168.1.129″,”longitude_wgs84″:0,
“bssid”:”00:14:d5:00:21:1a”,”dstype”:”EDGE”,”time”:”2014-03-24 22:44:38″,

“encrypttype”:””,”ci”:52532,”ssid”:”\”open\””,”radius”:71.69451141357422,”rssi”:-47,”ul_speed”

:0,”channelnumber”:0,”latitude_wgs84″:0,”latitude_baidu”:24770267},{“pci”:0,”snr”:0,”rxlev”:

-81,”locationtype”:161,”frequency”:0,”dl_speed”:0,”longitude_baidu”:103282882,”networktype”

:”WIFI”,”lac”:35042,”capid”:1395672279839,”lanip”:”192.168.1.129″,”longitude_wgs84″:0,

“bssid”:”00:14:d5:00:21:1a”,”dstype”:”EDGE”,”time”:”2014-03-24 22:44:39″,”encrypttype”:””,”ci”:52532,”ssid”:”\”open\””,”radius”:71.69451141357422,”rssi”:-47,”ul_speed”:0,

“channelnumber”:0,”latitude_wgs84″:0,”latitude_baidu”:24770267}…

The code:

  A
1 =file(“D:\\ cellphone.log”).read().import@j()
2 =A1.cap_main.derive(A1.general.mtmodel,A1.general.networktype,A1.ping.ssid)

esProc can read in the JSON file while keeping its multilevel structure, and then it can use periods to access fields through the multiple levels. The derive function adds the computed columns. Here’s the result:

esProc_text_json_2

Querying multilevel JSON subdocuments

JSONstr.JSON’s runners field is a subdocument that has 3 fields: horseId, ownerColours and trainer, in which trainer has a subfield trainerId. The requirement is to query the subdocument’s horseId, ownerColours and trainer fields according to the corresponding document’s sequence number.

Below is the source data:

 [

    {

        “race”: {

            “raceId”: “1.33.1141109.2”,

            “startDate”: “2014-11-09T13:15:00.000Z”,

            “raceClassification”: {

                “classification”: “Novices'”

            },

            “raceType”: {

                “key”: “H”

            },

            “raceClass”: 4,

            “course”: {

                “courseId”: “1.33”

            },

            “meetingId”: “1.33.1141109”

        },

        “numberOfRunners”: 2,

        “runners”: [

            {

                “horseId”: “1.00387464”,

                “trainer”: {

                    “trainerId”: “1.00034060”

                },

                “ownerColours”: “Maroon, pink sleeves, dark blue cap.”

            },

            {

                “horseId”: “1.00373620”,

                “trainer”: {

                    “trainerId”: “1.00010997”

                },

                “ownerColours”: “Black, emerald green cross of lorraine, striped sleeves.”

            }

        ]

    },

……

]

 

The code:

1 =file(“D: \\JSONstr.JSON”).read().import@j()
2 =A1(which).runners
3 =A2.new(horseId,trainer.trainerId:trainerId,ownerColours)

Import the JSON file, retrieve the document’s runners field (the subdocument), and get the desired fields from runners. trainerId comes from a child document of the subdocument. A3 gets the result as follows:

esProc_text_json_3

Converting multilevel JSON to structured data

Cells.JSON is a multilevel nested JSON file which needs to be converted to a two-dimensional structured table according to the grouping fields name, type and image. “xlink:href”. The field containing detailed data has 3 subdocuments: custom. Identifier, custom. Classifier and custom. Output, which are of the same structure but contain different number of documents each.

Below is the source data:

{    “cells”: [

        {

            “name”: “b”,

            “type”: “basic.Sensor”,

            “custom”: {

                “identifier”: [

                    {

                        “name”: “Name1”,

                        “URI”: “Value1”

                    },

                    {

                        “name”: “Name4”,

                        “URI”: “Value4”

                    }

                ],

                “classifier”: [

                    {

                        “name”: “Name2”,

                        “URI”: “Value2”

                    }

                ],

                “output”: [

                    {

                        “name”: “Name3”,

                        “URI”: “Value3”

                    }

                ]

            },

            “image”: {

                “width”: 50,

                “height”: 50,

                “xlink:href”:

“

SVQICAgIfAhkiAAAAAlwSFlzAABEJAAARCQBQGfEVAAAABl0RVh0U29mdHdhcmUAd3Vi8f+k/ERE

URQtsda2Or/+nFLqP6T5Ecdi0aJFL85msz2Qxyf4JIumMAx/ClmWt23GmL1kO54CXANAVH+WiN4

Sx7EoNVkU3Z41BDHMeXAxjvOxNr7RJjzHX7S/jAflwBxkJr/RwiOpWZ883Nzd+Wpld7tkBr/SJr7ZHZ

bHZeuVweSnPfniocMAWYwcGBafH0OoPamFGAaY4ZBZjmmFGAaY4ZBZjmmFGAaY4ZBZjmmFGA

aY7/B94QnX08zxKLAAAAAElFTkSuQmCC”

            }

        },

       ……

    ]

}

 

The code:

  A
1 =file(“D:\\cells.JSON”).read().import@j()
2 =A1.cells.conj(custom.(identifier.new(name:cname,URI:cURI,”identifiler”:cType,A1.cells.name:name,A1.cells.type:type,
A1.cells.image.’xlink:href’:image)|classifier.new(name:cname,URI:cURI,”classifier”:cType,
A1.cells.name:name,A1.cells.type:type,A1.cells.image.’xlink:href’:image)|output.new(name:cname,
URI:cURI,”output”:cType,A1.cells.name:name,A1.cells.type:type,A1.cells.image.’xlink:href’:image)))

esProc merges the three subdocuments into a single two-dimensional table, gives them a new field name ctype to be identified and appends the grouping fields to the table to finally generate a typical “table with subtables”. Below is the concatenation result of A2:

esProc_text_json_4

Exporting the multilevel JSON file as CSV file

Below is the original json file:

{  “Title” : {

      “name” : “ABC”,

      “id” : “1”,

      “job”: “Teacher”   

      },

      “Circle”:{

                “area”:”2R”

        },

        “Triangle”:{

                “length”:”45″

        }

}

 

The expected CSV file will present the source data in two columns, as shown below:

Title

 

name

ABC

id

1

job

Teacher

Circle

 

area

2R

Triangle

 

length

45

 

The difficulties are that the root field names should be combined with the subfield names and that the subfield names and values should each be presented in a separate column. Here’s the code:

1 =file(“d:\\source.json”).read().import@j()
2 =A1.fno().(pjoin([f=A1.field(~)].fname():key,f.array():value))
3 =A2.(~.record@i([A1.fname(#),null],1))
4 =A3.conj()
5 =file(“D:\\result.csv”).export@c(A4)

Read in the JSON file and divide data into 3 groups according to the 3 root documents, as shown below (A2):

esProc_text_json_5

Next, insert [root field name,null] into each group as the first record and concatenate all the groups of records and finally export them. The exported CSV file is as follows:

esProc_text_json_6

Joining a txt file and a JSON file

structure.txt is a tab-separated structured file. JSON.txt contains unstructured JSON strings. The second field of structure.txt and part of the json.txt are related through the foreign key. The requirement is to join the two files into a two-dimensional table. Below are selections of them.

structure.txt

Name1     BBBBBBBBBBBB     99.40        166 1        0       1       166 334 499 3e-82   302

Name2     DDDDDDDDDDDD 98.80        167 2        0       1       167 346 512 4e-81   298

JSON.txt

[

    { “Cluster A”: { “member”: { “Cluster A”: “BBBBBBBBBBBB This is Animal A” }, “name”: “Cluster A” } },

    { “Cluster B”: { “member”: { “Cluster B”: “DDDDDDDDDDDD This is Animal B” }, “name”: “cluster B” } }

]

The code:

  A
1 =file(“D: \\JSON.txt”).read().import@j()
2 =A1.new(#1.name:name,#1.member.(#1):cluster)
3 =A2.derive(left(cluster,(firstblank=pos(cluster,” “)-1)):key,right(cluster,len(cluster)-firstblank):value)
4 =file(“D:\\ structure.txt”).import()
5 =join(A4,_2;A3,key)
6 =A5.new(_1._1,_1._2,_1._3,_1._4,_1._5,_1._6,_1._7,_1._8,_2.name,_2.value)

Read in the JSON file, retrieve the desired fields and subdocument to create a two-dimensional table and append a computed column to it. Below is A3’s result:

esProc_text_json_7

Then import the text file as a table, perform a join between the table and A3 and retrieve the desired fields to generate the final result A6, as shown below:

esProc_text_json_8

Dynamically exporting JSON data to the database

s.json contains multilevel subdocuments, in which the LIST subdocument and SERVICES subdocument each have a variable number of fields. The requirement is to write s.json into two database tables – groups and service – by filling the values of GROUPNAME and the names of fields under SERVICES in the database table groups, and entering the detailed fields under SERVICES into the database table service. The two database tables are related through groupid.

Below is the source data:

{

    “SUCCESS”: [

        {

            “MESSAGE”: “IMEI Service List”,

            “LIST”: {

                “MOVISTAR SPAIN”: {

                    “GROUPNAME”: “MOVISTAR SPAIN”,

                    “SERVICES”: {

                        “3”: {

                            “SERVICEID”: 32,

                            “SERVICENAME”: “MOVISTAR NOKIA INSTANTE”,

                            “CREDIT”: 4,

                            “TIME”: “1-30 Minutes”,

                            “INFO”: “<p style=\”text-align: center;\”>…… </p>”,

                            “Requires.Network”: “None”,

                            “Requires.Mobile”: “None”,

                            “Requires.Provider”: “None”,

                            “Requires.PIN”: “None”,

                            “Requires.KBH”: “None”,

                            “Requires.MEP”: “None”,

                            “Requires.PRD”: “None”,

                            “Requires.Type”: “None”,

                            “Requires.Locks”: “None”,

                            “Requires.Reference”: “None”

                        },

                        “8”: {

                            “SERVICEID”: 77,

                            “SERVICENAME”: “MOVISTAR NOKIA 20 NCK”,

                           …

The code:

  A B C D
1 =file(“E:\\s.json”).read().import@j()
2 =create(Groupname,groupid)
3 =create(Serviceid,Servicename,groupid,Credit,Time,INFO,Network,Mobile,Provider,PIN,KBH,MEP,PRD,Type,
Locks,Reference)
4 for A1.SUCCESS =A4.LIST  
5   for B4.fno() =B4.field(B5)
6     =C5.SERVICES
7     for C6.fno() =C6.fname(C7)
8       =C6.field(C7)
9       =A2.record([C5.GROUPNAME,D7])
10       =A3.record([D8.#1,D8.#2,D7,D8.#3,D8.#4,D8.#5,D8.#6,D8.#7,D8.#8,
D8.#9,D8.#10,D8.#11,D8.#12,D8.#13,D8.#14,D8.#15])
11 =mssql.update(A2,groups,Groupname,groupid;groupid)
12 =mssql.update(A3,services,Serviceid,Servicename,groupid,Credit,Time,INFO,Network,Mobile,Provider,PIN,KBH,MEP,
PRD,Type,Locks,Reference;Serviceid)

Read in the JSON file while retaining its original structure; loop through each subdocument of SUCCESS (B-D10), each field of LIST (C5-D10) and each field of SERVICES (D7-D10) to append the desired records to the empty two-dimensional tables A2 and A3 respectively, and then export the tables to the database. In the above code, several functions for accessing the structural elements of a two-dimensional table are used. The fno function finds the number of fields, the fname function gets the field names by serial numbers, and the field function gets the field values by serial numbers.

Below are A2 and A3, with desired records appended

esProc_text_json_9

FAVOR (0)
Leave a Reply
Cancel
Icon

Hi,You need to fill in the Username and Email!

  • Username (*)
  • Email (*)
  • Website