esProc Simplifies SQL-style Computations– Get Top N Rows of Each Group

Uncategorized 718 0

The operation of getting the top N rows of data is common, such as the biggest salary raise of each employee, the three lowest scores in playing golf and the five days of each month when each product has its biggest sales. SQL resorts to advanced techniques, like window functions and keep/top/row number, to solve this kind of problem. Therefore the code is quite difficult. And as many databases (like MYSQL) doesn’t support these advanced techniques, they can only realize this operation using complicated JOIN statement and nested subquery. If multi-layered groups or multi-level relationships are involved, the computation will become even more complicated.

Yet esProc’s top function can select the top N rows of data of each group according to the row numbers as well as the maximum and minimum values, thus the computation will be made simpler. The following example will teach you how esProc works.  

golf, a database table, stores scores members have got in playing golf. Please select the three highest scores each member has had. Some of the data are as follows:

esProc_sql_topn_1

esProc Code:

  A
1 =db.query(“select * from golf”)
2 =A1.group(User_ID)
3 =A2.(~.top(-Score;3))
4 =A3.union()

A1:Select data from the database. Click the cell and you can see the result in detail:

esProc_sql_topn_3

A2:=A1.group(User_ID). The code groups the selected data in A1. Result is as follows:

esProc_sql_topn_4

As shown in the above figure, the selected data have been divided into multiple groups by User_ID, with each row representing a group. Click the hyperlinks in blue and you can see members of each group:

esProc_sql_topn_5

A3:=A2.(~.top(-Score;3)). The code gets the top three records of Score field of each group. Here “~” represents each group of data and ~.top() represents that top function is used to compute each group of data in order. top function can get the top N records of a data set. For example, top(Score;3) means sorting the records in ascending order by Score field and getting the top 3 records (i.e. the three smallest values); top(-Scroe;3) means sorting the records in descending order by Score field and getting the top 3 records (i.e. the three biggest values); and top(#;3) means getting the top three records
according to the original order. The computed result of this step is as follows:

esProc_sql_topn_6

A4:=A3.union(). It performs union operation on each group of data. Result is as follows:

esProc_sql_topn_7

The above four steps can also be combined into one step for the convenience of maintenance and debugging. Thus the code will be db.query(“select * from golf”).group(User_ID). (~.top(-Score;3)).union().

In addition, esProc program can be called by the reporting tool or a Java program in a way similar to that in which a Java program calls an ordinary database. The JDBC provided by esProc can be used to return a computed result in the form of ResultSet to the Java main program. Please refer to related documents for more details.

FAVOR (0)
Leave a Reply
Cancel
Icon

Hi,You need to fill in the Username and Email!

  • Username (*)
  • Email (*)
  • Website