The Standard Method of Transposing Database Tables in JAVA

SQL Assistance 2392 0

Sometimes you need to transpose a database table (or a text file) in JAVA before exporting it out. Different types of transposition require different SQL techniques, and at times you have to do low-level programming in JAVA. That is quite difficult.

As esProc that serves as JAVA class library supports dynamic scripting, order-related calculations and set operations, it has the ability to realize data transposition in various scenarios using a standard method, that is, retrieving data, creating an empty result set and filling data in it. The JAVA application can access the esProc computing result through JDBC. More details can be found in How to Use esProc as the Class Library of Structured File Computing For Java.

These are common database table transpositions in JAVA programming and their solutions using standard esProc method.

Simple transposition
The database table SALES stores order data, a selection of which is shown below:

OrderID Client SellerId Amount OrderDate
1 DSG 13 19480 2014-06-20 00:00
2 ERN 18 979 2014-06-13 00:00
3 JFE 19 28972 2014-12-11 00:00
4 OFS 21 4829 2014-02-24 00:00
5 ERN 22 21392 2014-02-01 00:00

You need to calculate the total order amount, the maximum/minimum order amount and the total number of orders each month in a specified year, and transpose the data into 13 columns and 4 rows. That is, the four algorithms constitute the first column named subtotal and each month is a column named 1, 2, 3, 4…

esProc script:

  A
1 $select  month(ORDERDATE) as MONTH,sum(AMOUNT) as OSum,max(AMOUNT) as OMAX, min(AMOUNT) as OMIN ,count(ORDERID) as OCount from sales where year(ORDERDATE)=? group by MONTH order by MONTH; argYear
2 =create(subtotal,${to(A1.len()).string()})
3 >A1.fno().to(2,).run(A2.record(A1.fname(~)|A1.field(~)))

A1: Retrieve data and perform simple group and aggregate in SQL. Here’s the result:

esProc_as01a_1

A2: Create an empty result set containing fields “subtotal,1,2,3,4,5,6,7,8,9,10,11,12”. set.string() concatenates members of a set into a string delimited by commas; ${} parses a string as an expression and execute it dynamically.

A3: Fill in records to the empty result set one by one. The fno function gets the number of fields; the fname function gets field names by sequence numbers; the field function gets a column by the sequence number; the run function loops through a set\result set; and the record function appends records. Here’s the result of A2 after data appending:

esProc_as01a_2

According to this example, there are three steps to handle data transposition: retrieve data in SQL, create an empty result set using create function, and loop through the source data using run function and append records one by one using record function. The code for data appending will change a little in different scenarios. The following will explain each scenario in detail.

Dynamic transposition
The liquors table stores the production area, types and stock of liquors. Below is a selection from the source data:

esProc_as01a_temp1

esProc script:You need to calculate the stock for every type of liquor from every production area, with each type corresponding to a field.

  A
1 $select Production, Type,sum(Stock)Amount from liquors group by Production,Type
2 =A1.id(Type)
3 =create(Production,${A2.string()})
4 =A1.group(Production)
5 >A4.run(A3.record(~.Production|~.align@a(A2,Type).(Amount)))

It can be handled still in steps of data retrieval, empty result set creation and data appending. A2 gets the names for the second to the Nth fields of the result set; the id function is equal to SQL distinct function. A4 groups A1’s data by Production. A5 loops through each group to append one record to A3 each time; the align function groups data according to a specified sequence and allows null members in a group.

A3 stores the final result, as shown below:

esProc_as01a_3

Transposing multiple columns to one row
Below is the format of the Students table:

esProc_as01a_temp2

You need to combine all records into one row in a format as shown below:

esProc_as01a_temp3

esProc script:

  A
1 $select name,age,sex,kg from student
2 =create(${A1.(name).(~+”age,”+~+”sex,”+~+”kg”).string()})
3 >A2.record(A1.(string(age)+”,”+sex+”,”+string(kg)).string().array())

A2 creates an empty table sequence dynamically. A3 composes A1 into a single record and appends it to A2. string(field) means converting data of other types into the string type data. set.string() means concatenating members of a set into a string. You can also append records with this line of code: >A2.record(A1.conj(~.array().to(2,4)))

Multi-group transposition

In the database table kpi, every 4 records with the same f_site value is a group. You need to transpose each group of data to transform dataset_date values to fields and ioh_kpi、idh_kpi、iol_kpi fields to values of KPI Name field. Below is a selection from the source data:

esProc_as01a_temp4

esProc script:

  A B
1 =myDB1.query(“select * from kpi order by f_site,dataset_date”)
2 =A1.id(dataset_date) =A1.fname().to(3,)
3 =create(site,’KPI Name’,${A2.string()})  
4 for A1.group(f_site) = A4.align(A2,dataset_date)
5   >B2.run(A3.record([A4.f_site,~]|B4.field(~)))

A2 gets the distinct dataset_date values, that is [“2015-04-21 13:15″,”2015-04-21 13:30″,”2015-04-21 13:45″,”2015-04-21 14:00”]. B2 gets field names of A1 starting from the third field, i.e. [“ioh_kpi”,”idh_kpi”,”iol_kpi”]. A3 creates a two-dimensional table dynamically, with field names being site,KPI Name,”2015-04-21 13:15″,”2015-04-21 13:30″,”2015-04-21 13:45″,”2015-04-21 14:00″. A4-B5 use a loop statement to append data; the statement is equal to A1.group(f_site).run(…), but it is more clear than run function when steps are many. Here’s the result:

esProc_as01a_4

Inverse transposition
A query finds that the tb1 table has one record with many fields, as shown below:

esProc_as01a_temp5

You need to transpose the table into a two-dimensional table with three columns and multiple rows, as shown below:

esProc_as01a_temp6

esProc script:

  A
1 $select * from tb1
2 =create(project,operator,actionTime)
3 >((A1.fno()-1)/2).run(A2.record(A1.field(1)|A1.field(2*~)|A1.field(2*~+1)))

((A1.fno()-1)/2) calculates the number of records that need to be added to the result set and run function runs a loop to append them.

If there is more than one record in the tb1 table, you need to use the following code:

  A
1 $select * from tb1
2 =create(project,operator,actionTime)
3 =((A1.fno()-1)/2)
4 >A1.run(A3.run(A2.record(A1.~.field(1)|A1.~.field(2*~)|A1.~.field(2*~+1))))

Here’s the result:

esProc_as01a_5

Appending the subtable with irregular length into the main table
Table1 and Table2 are the main table and the subtable that are associated through ID field. Now you need to append the subtable as fields of the main table. The subtable has different numbers of rows with the same ID value to correspond one row in the main table.

Below is a selection from Table1:

esProc_as01a_temp7

Below is a selection from Table2:

esProc_as01a_temp8

The desired format:

esProc_as01a_temp9

esProc script:

  A
1 =myDB1.query(“select  t1.ID,t1.Prob,t1.Cost,t1.Visible,t2.Item,t2.Count from table2 t2 left join table1 t1 on t1.ID=t2.ID”)
2 =A1.group(ID)  
3 =A2.max(~.count()).(“item”+string(~)+”,”+”Count”+string(~)).string
4 =create(ID,Prob,Cost,Visible,${A3})
5 >A2.run(A4.record([ID,Prob,Cost,Visible]|~.([Item,Count]).conj()))

Join the main table and the subtable using a SQL statement; group the data by ID; create an empty table sequence dynamically in A4 according to the group with the maximum number of records; and then concatenate the records in the main table and field values in the subtable into new records and append them to A4. Unlike a SQL group operation, esProc group function is able to group data without aggregating it.

A4 gets the final result as follows:

esProc_as01a_6

Inserting the subtable with specified maximum length dynamically into the main table
dColThread and dColQuestion are the main table and the subtable that are related through tID field. Each record of the main table corresponds to multiple but less than 5 status field values. You need to insert the status values between Phone and Decline fields and name them QuestionNo1、QuestionNo2…QuestionNo5.

Below is a selection of dColThread:

esProc_as01a_temp10

Below is a selection of dColQuestion:

esProc_as01a_temp11

esProc script:

  A B
1 =myDB1.query(“select * from dColThread t,dColQuestion q where t.tID=q.tID and t.ApplicationName=?”,arg1 )
2 =A1.group(tID)  
3 =create(ApplicationName,User,Phone,QuestionNo1,QuestionNo2,QuestionNo3,QuestionNo4,QuestionNo5,Decline)
4 for A2 =A4.(status)|[“”,””,””,””,””]
5   =A3.record(A4.ApplicationName|A4.User|A4.Phone|B4.to(5)|A4.Decline)

Create a two-dimensional table with fixed fields in A3; loop through each of A2’s groups to get the status value and complement the missing values to reach five; and then append the complete records to A3.

A3 gets the final result as follows:

esProc_as01a_7

Complementing the missing months before transposition
The tb table, with two fields – time and quantity, contains the product’s sales quantity per day. In certain months, the quantity value may be null. Below is a selection from the source data:

esProc_as01a_temp12

You need to rearrange the tb table into 12 rows, with fields including the fixed months (values are 1-12), the unfixed annual sales quantity (if a time period involves 2 or 3 consecutive years, create 2 or 3 fields), as shown below:

Month Quantity in 2013 Quantity in 2014 Quantity in …

esProc script:

  A
1 $select year(time) Year,month(time) Mon,sum(quantity) Amount from tb group by Year,Mon
2 =A1.id(Year)
3 =A1.align@a(12,Mon)
4 =create(m,${A2.(“Amount_for_”+string(~)).string()})
5 >A3.run(A4.record(#|A2.(ifn(A3.~.select@1(Year==A2.~).Amount,0))))

A1 performs a SQL group and aggregate; A2 gets a list of years; A3 groups A1 according to the sequence of 12 months; A4 dynamically creates an empty two-dimensional table; A5 loops through every group of A3 and appends one record each time. Below is the result of A3:

esProc_as01a_8

A4 gets the final result, as shown below:

esProc_as01a_9

Transposition with computed columns
Below is a selection from the table tb1:

esProc_as01a_temp13

You need to generate the total number of projects before a specified date in the current month. For instance, enter 2015-01-03 and here’s the desired result:

esProc_as01a_temp14

esProc script:

  A
1 $select * from tb1 where ddate>=? and ddate<=?; pdate@m(d_date) ,d_date
2 =create(Prjno,Subtask,${periods(pdate@m(d_date),d_date,1).string()})
3 =A1.group(Prjno,Subtask)
4 >A3.run(A2.record(Prjno|Subtask|~.groups(Ddate;sum(Num):Num).(Num)))

Query the source data and group it; loop through every group to insert one record to the empty two-dimensional table each time.

Transposition with dynamic locating
In the database table tb, every 3 records with the same userid are a group. You need to transform these groups into rows. Below is a selection from the table tb:

esProc_as01a_temp15

The desired result:

esProc_as01a_temp16

The align function aligns data with members of a set ([‘hair’,’weight’]); @n means placing the unaligned data in a separate row. As with this example, such a row is composed of records corresponding to mouse and dog. Here’s the result:

esProc_as01a_10

Column-to-row transposition with three related tables

There are three tables – Students table, Exam table and Retest table – which are related between each other through stu_id, as shown below:

Students

esProc_as01a_temp17

Exam

esProc_as01a_temp18

Retest

esProc_as01a_temp19

You need to query the three tables to get the score of each subject, the total score and the retest score for every student. Below is the desired result:

esProc_as01a_temp20

esProc script:

  A
1 $select t1.stu_id,t1.stu_name,t2.subject,t2.score score1,t3.score score2 from stutents t1 left join exam t2 on  t1.stu_id=t2.stu_id left join retest t3 on t1.stu_id=t3.stu_id and t2.subject=t3.subject order by t1.stu_id,t2.subject
2 =A1.group(stu_id)
3 =A1.group(subject)
4 =create(stu_id,stu_name,${(A3.(~.subject+”_score”)|”scores(scoresSum)”|A3.(~.subject+”Retest “)).string()})
5 >A2.run(A4.record([stu_id,stu_name]|A3.(~(A2.#).score1)|A2.sum(score1)|A3.(~(A2.#).score2)))

 

FAVOR (0)
Leave a Reply
Cancel
Icon

Hi,You need to fill in the Username and Email!

  • Username (*)
  • Email (*)
  • Website