How to Use esProc as the Class Library of Structured File Computing for Java

Blog 1762 0

In some cases, data must be stored in the file system, rather than in a database. That requires handling file-based data computing manually. Since Java lacks the related class library, you have to hardcode the structured file computing, which produces complicated and unreadable code.

Yet, you can make up for this deficiency with esProc (free edition is available). esProc encapsulates plenty of functions for processing structured files and provides the JDBC interface. A Java application will treat an esProc script as a database stored procedure, execute it after passing parameter and get the result set via JDBC. You can learn more details from How to Use esProc as the Class Library for Java.

Now let’s look at some familiar scenarios involving structured computing in Java and their esProc solutions. 

Querying text file

sOrder.txt is a text file separated by tabs. You need to query orders in a specified time period in Java.

Below is the source data:

esProc_java_class_library_2

esProc code:

        

A

1

=file(“D:\\sOrder.txt”).import@t()

2

=A1.select(OrderDate>=startDate && OrderDate<=endDate)

Import the file, using tab as the default separator. @t means importing the first row as column names. Then perform the conditional filtering. startDate and endDate are input parameters, like a period from 2010-01-01 to 2010-12-31. Here’s the result:

esProc_java_class_library_4

Sorting text file

Based on the above sOrder.txt, you need to sort records in Java by client numbers in descending order and by year and month in ascending order.

esProc code: =A1.sort(-Client,year(OrderDate),month(OrderDate))

Explanation: Use “-” to sort data in descending order. You need to do some calculation to get the years and the months.

Related information: To perform sort after the querying, you can use =A2.sort(…), or =A1.select(…).sort(…) .

Here’s the result:

esProc_java_class_library_5

Grouping and aggregations

You need to calculate the sales amount and number of orders per seller per year.

esProc code: =A1.groups(SellerId,year(OrderDate);sum(Amount),count(~))

Explanation: group function can perform aggregate while grouping data. ~ represents each group or the current group. count(~) is equivalent to count(OrderID).

Here’s the result:

esProc_java_class_library_6

Getting distinct value

You need to make a client list according to sOrder.

esProc code: =A1.id(Client)

Here’s the result:

esProc_java_class_library_7

Removing duplicate values

You need to get the first record for each client and for each seller

esPrco code: =A1.group@1(Client,SellerId)

Explanation: group function is used to group records (without having to aggregate them). @1 means getting the first record from each group.

Here’s the result:

esProc_java_class_library_8

Finding topN

You need to get the 3 orders with the greatest sales amount for each seller.

esProc code: =A1.group(SellerId;~.top(3,-Amount):t).conj(t)

Explanation: top function filters records to get TopN. The sign “-” means a reversed order. conj function is used for concatenation.

Related information: To get the order with the greatest sales amount, use maxp function.

Here’s the result:

esProc_java_class_library_9

Related computing

emp.txt is a text file, separated by tabs. Its EId field corresponds to sOrder’s SellerId field. Now with Java, you need to align Name, Dept and Gender fields in emp.txt with sOrder.txt.

Below is the source data:

esProc_java_class_library_11

esProc code:

1

=sOrder=file(“D:\\sOrder.txt”).import@t()

2

=emp=file(“D:\\emp.txt”).import@t(EId,Name,Dept,Gender)

3

=join@1(sOrder:s,SellerId;emp:e,EId)

4

=A3.new(s.OrderID, s.Client, s.SellerId, s.Amount, s.OrderDate,

        e.Name, e.Dept, e.Gender)

The join function performs a join and changes names of the two tables into s and e respectively; @1 means a left join. Then you retrieve the desired fields from the joined table to create a new structured two-dimensional table. Here’s the result:

esProc_java_class_library_12

Related information: @1 means a left join; @f means a full join. No option is needed for an inner join

All the above examples assume that the size of the file is relatively small. If the file is too big to be entirely loaded into memory, you can use the esProc cursor to handle it. See related documents for detailed information. 

FAVOR (0)
Leave a Reply
Cancel
Icon

Hi,You need to fill in the Username and Email!

  • Username (*)
  • Email (*)
  • Website