Diverse Scenarios of esProc’s Assisting in Report Development

Blog 1477 0

There are many knotty problems around report development. These include complex layout, which is difficult to build with a reporting tool alone; complex data sources requiring dynamic access and involving different types of databases or non-database data; as well as the handling of order-related computation, subsets of grouped data, fixed grouping, string splitting and concatenation, date intervals and other complicated computations, all of which you may find hard to deal with in SQL.

But there is nothing difficult about these problems with esProc (free edition is available). esProc encapsulates lots of functions for processing structured data and supports dynamic parsing of expressions, hybrid computation with multiple data sources, order-related computation, set operations, irregular grouping and post-grouping computation. It also provides simple and easy-to-use JDBC interface, through which a reporting tool passes in parameters, executes esProc script the same way as it executes a database stored procedure and gets the returned result set.

Below is the structure of integrating an esProc script and a reporting tool:

esProc_report_1

The various cases in the following will illustrate how esProc helps a reporting tool with report development. 

Data transposition

Sometimes you might want to display data in a cross-tab or grouped report, but you could not use the source data directly. The data needs to be transposed first. With either SQL or the stored procedure, you must write code from the lowest level using a high-level language, which involves substantial amount of work and is really hard. esProc can remedy this failing through its support for order-related computation and dynamic scripting. Here is a simple example of transposing rows and columns.

Below is a selection of the database table SALES that stores order data:

esProc_report_2

You need to calculate the total order amount, the maximum/minimum order amount and the total number of orders each month in a specified year, and transpose the data into 13 columns and 4 rows. That is, the four algorithms constitute the first column named subtotal and each month is a column named 1, 2, 3, 4…

esProc script:

 

A

1

=db.query(“select  month(ORDERDATE) as MONTH,sum(AMOUNT) as OSum,max(AMOUNT) as OMAX, min(AMOUNT) as OMIN ,count(ORDERID) as OCount from sales where year(ORDERDATE)=? group by MONTH order by MONTH”,argYear)

2

=[“OSum”,”OMAX”,”OMIN”,”OCount”].new(~:subtotal,${to(A1.len()).string()})

3

=A1.run(A2.field(#+1,OSum|OMAX|OMIN|OCount))

4

return A2

Explanation: First perform group and aggregate operations in SQL, and then run a loop to transpose every row to column, and fill in values in sequence to each column of A2. Here is the result:

esProc_report_4

A4 returns A2 explicitly to the reporting tool (by default the last calculation cell will be returned). A reporting tool will call an eProc script through JDBC, as it calls the stored procedure of a normal database. The syntax is call script name (Para1…ParaN). The result returned from the esProc script will participate in report creation in the form of a normal data set. Details are covered in the following documents: esProc Integration & Application: Integration with JasperReport and esProc Integration & Application: Integration with BIRT.

Though the transposition in this example is simple, there are other unusual layouts such as post-grouping multi-layer transposition and dynamic location transposition. You can refer to esProc Arranges Data for Reports with Unconventional Layouts for more.

Multi-column layout

Most reporting tools support vertical layout but they really have difficulties in building unusual layouts such as arranging data in multiple columns horizontally or in an interlocking way. Normally you have to turn to high-level languages to work them out. With order-related computation and dynamic scripting features, esProc can help manage this well. Let’s take a look at horizontal multi-column layout:

The database table emp has three fields. Below is a selection of the table:

esProc_report_5

The desired layout is that data is presented in a horizontally sequenced order in 3 columns, as shown below:

esProc_report_6

You can first transform the original 3-field table to a 9-field table in esProc, and then build the above layout with the reporting tool. The code is as follows:

 

A

B

C

1

=myDB1.query(“select EId,Name,Dept from emp where EId>=? and EId<=? order by EId “,begin,end)

2

=A1.step(3,1) =A1.step(3,2)|[null] =A1.step(3,3)|[null]

3

=A2.derive(B2(#).EId:EId2,B2(#).Name:Name2,B2(#).Dept:Dept2,C2(#).EId:EId3,C2(#).Name:Name3,C2(#).Dept:Dept3)

Explanation: Divide rows into three parts by their sequence numbers and store them respectively in A2, B2 and C2. And then concatenate the fields in B2 and C2 with those in A2. Here is the result you’ll get:

esProc_report_8

Apart from the horizontal multi-column layout, there are other uncommon report layouts, like interlocking multi-column layout, dynamic vertical multi-column layout based on a single field, duplicating rows in sequence and condition-controlled formats for grouped reports. More information can be found in esProc Arranges Data for Reports with Unconventional Layouts.

Unconventional data sources

In addition to conventional databases, data sources of a reporting tool could be JSON files, MongoDB, txt files, Excel and HDFS files. Some of them do not have any computability and some have not enough. Both cases require writing a lot of code for report development. esProc supports various data sources and is capable of helping a reporting tool out. For example, you might want to display data from a multi-level JSON file in a grouped report:

Cells.json is a multi-level, nested JSON file which you want to present as a grouped report. The grouping fields are name, type and image. “xlink:href”. There is also a field with 3 subdocuments in each document: custom. Identifier, custom. Classifier and custom. Output, which are of the same structure but contain different number of documents.

Source data:

esProc_report_9

esProc script:

 

A

1

=file(“D:\\cells.json”).read().import@j()

2

=A1.cells.conj(custom.(identifier.new(name:cname,URI:cURI,”identifiler”:cType,A1.cells.name:name,A1.cells.type:type,A1.cells.image.’xlink:href’:image)|

classifier.new(name:cname,URI:cURI,”classifier”:cType,A1.cells.name:name,A1.cells.type:type,A1.cells.image.’xlink:href’:image)|

output.new(name:cname,URI:cURI,”output”:cType,A1.cells.name:name,A1.cells.type:type,A1.cells.image.’xlink:href’:image)))

Explanation: Merge the three subdocuments into one two-dimensional table, represent them with a new field name cType and append grouping fields to the table. Thus a typical “table with subtables” has been created. The result is as follows:

esProc_report_11

According to this esProc result, it’s easy to create a grouped report with a reporting tool.

There are more similar cases, including querying a multi-level subdocument in a JSON file, subreports using different JSON files, MongoDB collection join, and etc. You can find them in esProc Integrates Heterogeneous Data Sources for Report Development.

Dynamic data sources

Sometimes you might want to reference data sources dynamically through a parameter, merge data sources into one, or dynamically pass data source name to a subreport or Table control. Often reporting tools – especially those supporting single data source, such as BIRT and Jasper – have to use a high-level language to accomplish these computations, or trade security for reduced complexity.

esProc can help to perfect these computations, thanks to its support for dynamic parsing of expressions and for heterogeneous data sources. The following explains esProc way of switching data sources dynamically through parameter:

myDB1 and oraDB are data sources that point to different databases. Each holds a sOrder table with the same structure. The report requires connecting to data sources dynamically via a parameter, querying sOrder for orders whose amounts are greater than 1,000, and displaying them.

Below is a selection of sOrder in myDB1:

esProc_report_12

Below is a selection of sOrder in oraDB:

esProc_report_13

esProc script:

 

A

1

=${pSource}.query(“select * from sOrder where Amount>?”,pAmount)

Explanation: Both pSource and pAmount are report parameters. pSource represents the data source name; ${…} indicates parsing a string or a string variable into an expression. pAmount stands for the order amount.

When pSource=“myDB1”, A1 has the following result:

esProc_report_15

When pSource=“oraDB”, A1 gets this result:

esProc_report_16

More similar cases, including multi-data-source pre-join, result sets union, different data sources for main report and subreport, dynamic join between main report and subreports and displaying data by different time units specified by parameter, can be found in esProc Implements Dynamic Data Sources for Reporting Tools.

Heterogeneous data sources

Reporting tools can normally handle a type of single data source. To present the result of hybrid computation involving different types of data sources, the only choice is using a high-level language or ETL tools. The process is extremely tedious. esProc is more suitable for doing this because it supports the hybrid computation. Let’s look at the esProc approach through the join of MongoDB and MySQL:

emp1 is a MongoDB collection, whose CityID field is the logical foreign key pointing to CItyID field of cities, a MySQL table that has two fields – CityID and CityName. You need to query employee records from emp1 according to specified time period and switch its CityID field to CityName of cities.

esProc script:

 

A

1

=MongoDB(“mongo://localhost:27017/test?user=root&password=sa”)

2

=A1.find@x(“emp1″,”{‘$and’:[{‘Birthday’:{‘$gte’:'”+string(begin)+”‘}},{‘Birthday’:{‘$lte’:'”+string(end)+”‘}}]}”,”{_id:0}”)

3

=myDB1.query(“select * from cities”)

4

=A2.switch(CityID,A3)

5

=A4.new(EID,Dept,CityID.CityName:CityName,Name,Gender)

Explanation: Query the two data sources using MongoDB syntax and SQL syntax respectively, and then use esProc switch function to establish foreign key association between them, and finally retrieve the desired fields. Here is the result:

esProc_report_18

Besides, you can perform hybrid computation between JSON, Excel and txt, or between a database and a non-database. More details can be found from esProc Integrates Heterogeneous Data Sources for Report Development.

Parallel data retrieval

Sometimes a considerable amount of data needs to be retrieved from the database for presentation or export. But data retrieval with JDBC is always slow – maybe even slower than other computations involved in the report development. This causes the extreme inefficiency of the development. In contrast, esProc can realize data retrieval with multiple threads in parallel using simple code, which solves the JDBC problem. The following is such an example:

The user status table stores detail data – about 3,600,000 records. You need to present it in report and print it out. The desired layout is like this:

esProc_report_19

esProc script:

 

A

B

1

2020000000L The maximum value of Userid

2

4 Set 4 threads

3

=A1/A2 The number of userids a thread handles

4

=A2.(A3*(~-1)) The starting value of userids a thread handles

5

fork A4  

6

  =connect(“wm”)

7

  =B6.query(“select * from t_dw_zx_account_status_day where userid=? and userid<?”,A5,A5+A3)

8

  =B6.close()

9

  result B7

10

=A5.conj()  

Explanation: Divide data into 4 parts according to the value range of userid and set 4 threads; each is assigned one part through the parameters and returns the result to the main thread.

Note that only when sufficient database resource is available will the parallel processing increase performance considerably. If the database is already fully loaded, the parallel data retrieval will aggravate its burden instead.

Order-related computations

Order related computations include inter-row comparison, getting relative interval and computing ranks. As a SQL result set doesn’t have explicit sequence numbers, you must generate them through a computed column. This is quite a hassle. For databases that lack sufficient support for window functions, such as MySQL, the code will be complicated.

One of the intrinsic qualities of esProc is its support for order-related computations, making the implementation easier. Let’s look at two cases – computing link relative ratio and making a year-on-year comparison.

The database table sOrder stores orders per day from each seller. The report needs to present results of comparing the sales amount per month during a specified time period with that of the previous month and with that of the same month in the previous year. Below is the source data:

esProc_report_21

Script for computing link relative ratio:

 

A

1

=myDB1.query(“select * from sOrder where OrderDate>=? and OrderDate<=?”,begin,end)

2

=A1.groups(year(OrderDate):y,month(OrderDate):m;sum(Amount):mAmount)

3

=A2.derive(mAmount/mAmount[-1]:lrr)

Explanation: First group and aggregate data to get sales amount per month per year, and then compute link relative ratio using the formula “sales amount of the current month/sales amount of the previous month”. The result is as follows:

esProc_report_23

Script for comparing sales amount between same months on an annual basis:

 

A

1

=myDB1.query(“select * from sOrder where OrderDate>=? and OrderDate<=?”,begin,end)

2

=A1.groups(year(OrderDate):y,month(OrderDate):m;sum(Amount):mAmount)

3

=A2.sort(m)

4

=A3.derive(if(m==m[-1],mAmount/mAmount[-1],null):yoy)

Explanation: Group data and sort it by months, and then perform the computation with formula “sales amount of the current month/sales amount of the same month in the previous year. The result is as follows:

esProc_report_25

A similar example is computing inventory over the row.

The database table inout stores warehouse-in and -out records per product per day during consecutive days and the initial inventory is zero. You need to display the data in a report and add an extra column to represent stock per day.

esProc_report_26

esProc script:

 

A

1

=myDB1.query(“select *,0 as stock from inout order by date”)

2

>A1.group(product).run(~.run(in-out+stock[-1]:stock))

3

result A1

Explanation: Compute current day’s stock for each kind of product (through data grouping) with the formula “current day’s warehouse-in – current day’s warehouse-out + warehouse stock of the previous day”. esProc uses [-1] to represent the previous record and produces intuitive expression. Here is the result:

esProc_report_28

If the dates are discontinuous, you need to align data with a sequence of consecutive dates before computing inventory. The script is as follows:

 

A

1

=myDB1.query(“select * from inout”)

2

=periods(argBeginDate,argEndDate)

3

=A1.group(product;product:p,

~.align(dateList,date).new(p:product,dateList(#):date, ifn(in,0):in, ifn(out,0):out, stock[-1]+in-out:stock))

4

=A3.conj(#3)

Another typical case involving order-related computation is the installment loan.

The loan table stores loan information, including total loan amount, payment terms by month and annual interest rate. You need to build a grouped report in which details of each term of payment for every loan record – payment, interest, principal, principal balance – will be presented. Below is the source data:

esProc_report_30

esProc script:

 

A

1

=myDB1.query(“select * from loan”)

2

=A1.derive(Rate/100/12:mRate,LoanAmt*mRate*power((1+mRate),Term)/(power((1+mRate),Term)-1):mPayment)

3

=A2.((t=LoanAmt,Term.new(A2.LoanID:LoanID, A2.LoanAmt:LoanAmt, A2.mPayment:payment, A2.Term:Term,

A2.Rate:Rate, t*A2.mRate:interest, payment-interest:principal, t=t-principal:principlebalance)))

4

=A3.conj()

5

result A4

Explanation: First calculate the monthly interest rate and each term’s payment for every loan using a formula; then create a two-dimensional detail table for each loan according to the available information and calculate the current term’s interest, principal and principal balance; finally concatenate these detail tables. Below is the detail table for loan 1:

esProc_report_32

Post-grouping subset handling

Some cases, require non-aggregate operations, such as record query, member modification or set operations, after data is grouped, but it’s hard to do this in SQL as a SQL grouping has a built-in aggregate operation. Being able to handle post-grouping operations and set operations, esProc is better at dealing with those cases.

Compare goal difference

The database table Mytable stores goal differences of each player in each team for every competition. The report needs to present the number of competitions in which a specified player is superior in goal difference compared with every other player in the team. Below is the source data:

esProc_report_33

esProc script:

 

A

1

=myDB1.query(“SELECT S1.Player p1, S1.Goals g1,S2.Player p2,S2.Goals g2

FROM Mytable S1

INNER JOIN Mytable S2 ON S1.Team = S2.Team AND

S1.Game= S2.Game AND

S1.Player =?

AND S2.Player <> ?”,argPlayer)

2

=A1.group(p2;~.select(g1>g2).count():high)

Explanation: Use a SQL statement to query the goal difference of the specified player and that of each of his teammates in each competition, and then group data by his teammates and query from each group (set) the competitions (subset) in which the specified player scores more goals than his teammate. You can get the result by counting members of a subset. Here is the result:

esProc_report_35

Tag members of a group

In the database table test, classLev1 and classLev2 are grouping fields of two levels and seqInlev2 represents intragroup sequence numbers. You need to create a list to present data and add a tag field to tag rows in a group. Rules for assigning values to tag field are: Assign “first” to the first row and “last” to the last row, and assign “middle” to those in between; if there is only one row in a group, assign null to it.

Below is the source data:

esProc_report_36

esProc script:

 

A

1

$select *,null tag from t1 order by classLev1,classLev2,seqInLev2

2

=A1.group(classLev1,classLev2).select(~.len()>1).run(~.(tag=”middle”),~(1).tag=”first”,~.m(-1).tag=”last”)

3

result A1

Explanation: Retrieve data with a SQL statement and create an empty field tag; group data and select groups in which there is more than one member; tag all rows with “middle” and then change the first row to “first” and the last row to “last” in each group. Here is the result:

esProc_report_38

Query branch stores where DVDs are in short supply

The Branch table stores information of branch stores selling DVDs; the DVD table stores titles of DVDs and categories which they belong to; DVDCopy table stores information about DVD copies (the real products), and associates with the Branch table through BranchID field and with DVD table through DVDID field. That the Status field in DVDCopy table is “Miss” means this copy is missing; if LastDateReturned field is empty, the copy has been borrowed but hasn’t returned yet. The report needs to present information of branch stores where the available DVD copies have fewer than 4 categories. Below is the source data:

Branch table:

esProc_report_39

DVD table:

esProc_report_40

DVDCopy table:

esProc_report_41

esProc script:

 

A

1

=Branch=db.query(“select * from Branch”)

2

=DVD=db.query(“select * from DVD”)

3

=DVDCopy=db.query(“select * from DVDCopy”)

4

=DVDCopy.switch(DVDID,DVD; BID,Branch)

5

=DVDCopy.select(Status!=”Miss” && LastDateReturned!=null)

6

=A5.group(BID)

7

=A6.new(~.BID:BonList, ~.(DVDID).id(Category).count():CatCount)

8

=A7.select(CatCount<4)

9

=A8.(BonList) | (Branch \ A7.(BonList))

Explanation: Select available DVD copies possessed by branch stores from DVDCopy table and group them by BID to get copies of each branch store; find DVDs corresponding to the DVD copies each branch store has and calculate how many categories where they fall in; find branch stores that have fewer than 4 categories of available DVD copies and that don’t exist in the DVDCopy table (both types are eligible); then combine data of the two types stores. The result is as follows:

esProc_report_43

Fixed grouping

Sometimes the grouping criteria come from other database tables or files, or are defined by external parameters, and their number is large and there is big difference between each other; other times the grouping criteria are dynamic and overlap each other. The problem is that SQL grouping criterion must be some field in the current database table, so it’s difficult to handle this kind of fixed grouping in SQL. esProc can easily implement this algorithm because it supports alignment grouping, enumeration grouping and the flexible equal grouping.

Group data by a parameter list

The sales table stores ordering records, in which CLIENT filed holds client names and AMOUNT field holds order amount. The report requires that CLIENT be grouped by the parameter list argClient and that AMOUNT be aggregated to calculate sums. Below is the source data:

esProc_report_44

If the parameter list has relatively small number of members, you can compose CLIENT into a dummy table with SQL union statement. If it has a large number of members, you can store data permanently in a database table before composing. This problem, however, has a dynamic, external parameter list with unknown number of members and containing members that even don’t exist in CLIENT field. To solve it in SQL, you must create a temporary table. It’s unnecessary to create such a temporary table in esProc. The following is esProc code:

 

A

1

=sales=db.query (“select * from sales”)

3

=sales.align@a(argClient,Client)

4

=pjoin(argClient,A2.(~.sum(Amount)))

Explanation: Group sales with align function by the parameter list, calculate sum for each group and join the result with the parameter list. Here is the result:

esProc_report_46

Group data by overlapped criteria

Suppose you want to group and aggregate the sales table according to criteria that are probably overlapped, such as the order amount is between 1,000 and 4,000, less than 2,000 and greater than 3,000. You cannot implement the algorithm in SQL without using a great many of union/with/minus statements, and this really is difficult when the criteria are specified through a parameter. In contrast, esProc does this with just the following code:

 

A

1

=sales=db.query (“select * from sales”)

2

=byFac=[“?<=1000″ ,”?>1000 && ?<=2000″,”?>2000 && ?<=4000″,”?>4000″]

3

=sales.enum@r(byFac,Amount)

4

=pjoin(byFac,A3.(~.sum(Amount)))

Explanation: enum function can perform grouping operation with grouping criteria specified by an external parameter. @r allows duplicate members. Here is the result:

esProc_report_48

Cross-tab reports with rows and columns presented in intervals

You might want to count employees with a cross-tab report. Columns are grouped by age ranges and rows are grouped by salary ranges. For instance, 5 columns –  1-30,31-32,33-35,36-40,41-50, and 3 rows – 1-5000,5001-10000,10001-20000 – would be displayed if you type in [30,32,35,40,50] and [5000,10000,20000] respectively. The following is source data:

esProc_report_49

esProc script:

 

A

1

=myDB1.query(“select EId,Birthday,Salary from emp”)

2

=A1.groups(argAge.pseg(age(Birthday)):ageCol,argSalary.pseg(Salary):salaryRow;count(EId):count)

3

=A2.run( concat(ifn(argAge.m(ageCol),0),”-“,argAge.m(ageCol+1)):ageCol,

       concat(ifn(argSalary.m(salaryRow),0),”-“,argSalary.m(salaryRow+1)):salaryRow)

Explanation: Group A1 by parameter intervals argAge and argSalary, calculate the number of employees in each group, and finally change the column and row names to a format permitted by the report. You can also calculate ages with SQL, but the esProc age function can prevent SQL code from being altered during database migration. pseg function will return a member’s sequence number in the interval. This is the result you’ll get:

esProc_report_51

Post-grouping order-related computations

Performing order-related computations based on grouped data makes code writing in SQL even harder. But usually esProc handles it easily.

Calculate the remaining class periods

Every student has the same 10-day long class periods. The database table Course stores each student’s attendance information during these days. The report needs to present the classes that are already attended and the remaining class periods for each student each month. Below is the source data:

esProc_report_52

esProc script:

 

A

1

$select * from tb order by Name,Date

2

=A1.group(Name,month(Date):Month;~.count():Attended)

3

=A2.derive((if(Name==Name[-1],Remaining[-1],10)-Attended):Remaining)

Explanation: Perform group and aggregate operations to calculate the number of attended classes, and add a computed column to get the remaining number of class periods. Here is the result:

esProc_report_54

Get top N members

The database table sOrder stores orders per day from each seller. The report needs to present each seller’s N biggest orders. N is a parameter. Below is the source data:

esProc_report_55

esProc script:

 

A

1

=myDB1.query(“select * from sOrder”)

2

=A1.group(SellerId)

3

=A2.union(~.top(argN,-Amount))

Explanation: Group data by SellerID; get N records that have the biggest Amount values from each group with top function and concatenate them with union function. -Amount means getting records from large to small values. Here is the result:

esProc_report_57

Use A2.(~.maxp@a(Amount)) to retrieve the record, or the records holding the greatest value from each group. To retrieve the first record holding the greatest value from each group, sort the records by Amount in SQL and use A1.group@1(SellerId). Use A2.group(SellerId;~.to(argM,argN)).conj(#2) to get records from the Mth to the Nth from each group. To compare the Amount field values between the first record and the second record in a group, use A1.group(SellerId;~(1).Amount-~.m(2).Amount). The m function can get members by their sequence numbers and judge where a group ends (when a group has only one record).

Modify data to make even value for each group

Suppose you want to create a grouped report that gets data from the database table stock, which makes Title as the grouping field and has detail fields ID and Number. But you need to modify the smallest Number value in each group to ensure that the sum of Number values is 800. Below is the source data:

esProc_report_58

esProc script:

 

A

1

$SELECT ID,Title,Number FROM STOCK ORDER BY Title,Number

2

>A1.group(Title).run(~(1).Number=(800-~.sum(Number)+~(1).Number))

3

result A1

Explanation: Sort data by Title and Number and then group data by Title; loop through each group to modify the first Number value so that the sum of Number is 800. Here is the result:

esProc_report_60

String splitting and concatenation

Sometimes you need to handle structured data processing based on strings generated from splitting a larger string. Other times you need to split one row into multiple rows or combine multiple rows into one. It’s difficult to implement these calculations in SQL. esProc, however, is better at handling them thanks to its support for explicit sets.

Split one row into multiple rows

The database table data has two fields. You want to split one of the fields – ANOMOALIES – into multiple strings and combine each string with the corresponding ID value in the original table to generate new records. Below is the source data:

esProc_report_61

esProc script:

 

A

1

=myDB1.query(“select ID,ANOMALIES from data”)

2

=A1.conj(ANOMALIES.array(” “).new(A1.ID:ID,~:ANOMALIES))

Explanation: Split ANOMOALIES field of each record into a sequence according to spaces, create two-dimensional tables according to the sequences respectively and concatenate these two-dimensional tables. This is the result you’ll get:

esProc_report_63

Concatenate members by grouping fields

In the database table table1, the first three fields are grouping fields and the fourth one is the detail field. You need to build a report where values of the fourth field are combined according to the grouping fields. Below is the source data:

esProc_report_64

esProc script: myDB1.query(“select * from table1”).group(Col1,Col2,Col3;~.(Col4).string():Col4)

Explanation: The aim of this case is to implement group_concat. But not every database supports this function. You can use esProc to achieve the same target, regardless of which database you are using. The following is the result:

esProc_report_65

Date handling

A sequence containing datetime data is more complicated than the one consisting of natural numbers. Thus you may found it’s very difficult to handle dates in SQL, such as getting work days, intersection of some dates and combining time intervals into a larger one. Equipped with built-in date functions and ordered sets, esProc is convenient to use in date handling.

Calculate work days

The database table project stores project information including beginning and ending dates. The report needs to present the information and an extra column for displaying the number of work days. Below is the source data:

esProc_report_66

esProc script:

 

A

1

=myDB1.query(“pid,pname,manager,beginDate,endDate from project”)

2

=A1.derive(workdays(beginDate,endDate).count():newField)

Explanation: With workdays function, it’s easy to calculate the number of work days during a certain time period. Here is the result:

esProc_report_68

Calculate peak travel dates

The database table trip stores business trip information. You need to get the 10 busiest dates and sort them by the number of trips. Below is the source data:

esProc_report_69

esProc script:

 

A

1

=myDB1.query(“select start_Date,end_Date from trip”)

2

=A1.conj(periods(start_date,end_date)).groups(~:d;count(~):c)

3

=A2.sort(-c).to(10)

Explanation: Split each trip into a sequence of dates according to the beginning and ending dates, concatenate the sequences and perform group and aggregate by dates; and then sort the records to get the first 10 ones. Here is the result:

esProc_report_71

Merge overlapped time intervals

In the database table project, beginDate and endDate represent the beginning date and ending date of each project. The report needs to present the time intervals with or without projects (each is designated by “busy” and “free”) in sequence. Below is the source data:

esProc_report_72

esProc script:

 

A

1

=myDB1.query(“pname,manager,beginDate,endDate from project”)

2

=A1.sort(beginDate).select(endDate>endDate[-1])

3

=A2.run(if(beginDate<endDate[-1],beginDate[-1],beginDate):beginDate)

4

=A3.group@o(beginDate;~.m(-1).endDate:endDate,”busy”:type)

5

=A4.new(beginDate[-1]:beginDate,endDate:endDate,”free”:type).to(2,)

6

=(A4|A5).sort(beginDate)

Explanation: A2-A4 merges smaller time intervals into bigger ones. This is done by first removing dates within larger intervals. That is, if the ending date of a time interval is after the beginning date of the next interval, then merge them as a bigger interval from the beginning date of the first interval. After that, group data by dates and use the ending date of the last record in each group as the bigger interval’s ending date. A5 calculates free time intervals. Finally, concatenate the busy and free time intervals and perform sort. Here is the result:

esProc_report_74

FAVOR (0)
Leave a Reply
Cancel
Icon

Hi,You need to fill in the Username and Email!

  • Username (*)
  • Email (*)
  • Website