To process a large text file that can’t be held entirely in memory, you need to import it in segments and process each segment separately, which is tricky. Sometimes multithreaded parallel processing is also needed to improve performance. But since most programming languages don’t support basic class libr...
2016-03-23 1642 0
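As an illustration of the segment-by-segment approach this excerpt describes, here is a minimal Python sketch (not esProc code). The helper names `iter_segments` and `total_of_column`, the tab separator and the segment size are assumptions made for the example, not part of the original article.

```python
from itertools import islice
from typing import Iterator, List


def iter_segments(path: str, lines_per_segment: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `lines_per_segment` lines.

    Only one segment is held in memory at a time.
    """
    with open(path, "r", encoding="utf-8") as f:
        while True:
            segment = list(islice(f, lines_per_segment))
            if not segment:
                break
            yield segment


def total_of_column(path: str, column: int,
                    lines_per_segment: int = 100_000,
                    sep: str = "\t") -> float:
    """Sum a numeric column by processing the file one segment at a time."""
    total = 0.0
    for segment in iter_segments(path, lines_per_segment):
        # Each segment is processed independently; only the running
        # total is carried over between segments.
        total += sum(float(line.rstrip("\n").split(sep)[column])
                     for line in segment)
    return total
```

Because each segment is processed independently, the per-segment work could in principle be handed to worker processes, but this sketch stays single-threaded.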
Multilevel, semi-structured data in JSON format is common in internet applications. Java provides class libraries only for parsing JSON data; performing in-depth calculations requires complex hardcoding.
esProc supports set operations, order-related calculations and dynamic script execution, so it can be u...
2016-03-11 1242 0
Group operations on tabular data generated from text files include grouping and aggregation, obtaining distinct values, group merging and so on, all of which can be implemented with basic Java class libraries. But Java provides only limited support for structured-data computing, generating comp...
2015-12-11 1382 0
It’s hard to develop code for file comparisons, including finding common values or modified records and comparing big files, multiple fields or files with different structures, because these tasks generally involve set operations, structured-data handling and multithreaded parallel pr...
2015-12-11 1249 0
Because of their complex formats and unstandardized data, many text files cannot be computed on directly. When used as a data source, they need to be preprocessed and converted to structured data or database tables for further querying or statistics. Though we can perform this conversion using high-level languages like Java, or scr...
2015-11-09 1546 0
Group operations on tabular data generated from text files include grouping and aggregation, obtaining distinct values, group merging and so on, which can be implemented in high-level languages like Java or scripting languages like Python. But these two types of languages provide only limited...
2015-11-07 1255 0
You can handle simple file comparisons with console commands, Java, Python or Perl. But none of them is good at set operations and structured computations. This results in complicated code for multithreaded processing and a cumbersome process for comparing multiple fields, big files and files w...
2015-08-20 1533 0
Encapsulating many functions for structured file computing, esProc can import text files with complex formats, process big files in cursor style and simplify multithreaded parallel processing. Usually there are three modes in which esProc can be applied: a standalone mode, the execution from c...
2015-08-19 1820 0
The usual way to insert summary values into grouped data is to process the data group by group: import a group of data, append it and its summary value to a new file, then do the same with the next group, and so on. But this is not easy to implement in hardcoding. esProc, however, supports group cursor with wh...
2015-01-05 1116 0
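For comparison, the group-by-group procedure described in this excerpt can be sketched in plain Python (this is not esProc's group cursor). The sketch assumes that the input is already sorted by the group key, that it is tab-separated with no blank lines, and that the value column is numeric. The function name `write_with_group_totals` and the `subtotal` row layout are hypothetical choices made for the example.

```python
import csv
from itertools import groupby
from operator import itemgetter


def write_with_group_totals(src_path: str, dst_path: str,
                            key_col: int = 0, value_col: int = 1,
                            sep: str = "\t") -> None:
    """Copy rows to dst_path, appending a subtotal row after each group.

    Assumes src_path is already sorted by the key column, so each group
    is a run of consecutive rows and can be streamed one row at a time.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=sep)
        writer = csv.writer(dst, delimiter=sep, lineterminator="\n")
        for key, rows in groupby(reader, key=itemgetter(key_col)):
            total = 0.0
            for row in rows:
                writer.writerow(row)             # copy the detail row
                total += float(row[value_col])   # accumulate the group sum
            # Hypothetical summary layout: marker, group key, group total.
            writer.writerow(["subtotal", key, total])
```

Since `groupby` only merges adjacent rows with equal keys, unsorted input would produce several partial subtotals for the same key; sorting the file beforehand is therefore part of the assumption.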
Sometimes we need to fetch certain data from multiple files in a multi-level directory during text processing. The operation is too complicated to be performed well at the command line. Though it can be implemented in high-level languages, the code is difficult to write, and the involvement of big files will increase the ...
2014-12-22 1147 0