esProc Finds Differences between CSV files

Uncategorized 892 0

userName and date are the logical primary key of both old.csv file and new.csv file, in which we want to find rows that are new, deleted and updated.

The source data is as follows:

esProc_text_difference_csv_1

As can be seen from the above data, in new.csv the 2nd and the 3rd row are the new and the 4th row is the updated; in old.csv the 3rd row is the deleted.

esProc code:

        

A

B

1

=file(“d:\\old.csv”).import@t(;”,”) =file(“d:\\new.csv”).import@t(;”,”)

2

=A1.sort(userName,date) =B1.sort(userName,date)

3

=new=[B2,A2].merge@d(userName,date)  

4

=delete=[A2,B2].merge@d(userName,date)  

5

=diff=[B2,A2].merge@d(userName,date,saleValue,saleCount)

6

=update=[allDiff,new].merge@d(userName,date) result update

A1,B1:Retrieve the comma-separated files.

A2,B2:Sort data by the key, as this is required by the following merge function.

A3:Find the new records by the key. merge function is used to merge data sets. @d means calculating the difference during the merge. Similar options include @u for union and @i for intersection. The computed result is as follows:

esProc_text_difference_csv_3

A4:Find the deleted records by the key. The computed result is as follows:

esProc_text_difference_csv_4

A5:Take the key fields as ordinary ones to find the updated records. The computed result is as follows:

esProc_text_difference_csv_5

A6:A5 is an intermediate result. We need to calculate the difference between A5 and the new records to get the updated records. The computed result is as follows:

esProc_text_difference_csv_6

B6:Return A6 to JAVA program or the reporting tool. 

Now all data processing work has been finished. We’ll then integrate the esProc script into JAVA program via JDBC. The JAVA code is as follows:

         //establish a connection via esProc JDBC

         Class.forName(“com.esproc.jdbc.InternalDriver”);

         con= DriverManager.getConnection(“jdbc:esproc:local://”);

         //call esProc script, whose name is test and that can accept parameters

         st =(com.esproc.jdbc.InternalCStatement)con.prepareCall(“call test()”);

         st.execute();//execute esProc stored procedure

         ResultSet set = st.getResultSet();//get the result 

If you want to return multiple data sets to the JAVA program, you can modify B6’s code into result new,delete,update.

FAVOR (0)
Leave a Reply
Cancel
Icon

Hi,You need to fill in the Username and Email!

  • Username (*)
  • Email (*)
  • Website