A large text file that can’t fit entirely in memory has to be imported in segments, with each segment processed separately, which is tricky. Sometimes multithreaded parallel processing is also needed to improve performance. But since most of the programming languages don’t support basic class libr...
2016-03-23 45 0
SQL is a mature, general-purpose database programming language that makes most structured-data computing painless. Yet some cases remain difficult to handle in SQL.
Here’s an example. duty is a MySQL table holding a shift schedule, in which an employee ...
2016-03-16 26 0
There are many different types of report data sources, including relational databases, NoSQL databases, local files, HDFS files and JSON data streams. It’s easy to build a report from a single data source, but difficult to build one that needs data from more than one type of data source, i.e. heterogeneous data sou...
2016-03-15 20 0
Multilevel, semi-structured JSON data is common in internet applications. Java provides class libraries only for parsing JSON data, so performing in-depth calculations requires complex hardcoding.
esProc supports set operations, order-related calculations and dynamic script execution, so it can be u...
2016-03-11 22 0
A reporting architecture consists of three layers, from bottom to top: the storage layer, the computing layer and the display layer. The storage layer contains raw data, which may be stored in a relational database (RDB), a NoSQL database, or a local or HDFS file, or may simply be a JSON stream. The computing layer can access th...
2016-03-07 18 0
This article tests the performance of esProc in processing text files, using a data query and filtering example and comparing the results with Java and Perl performing the same task.
The test data consists of order records stored in the orders.txt file. The imported data is as follows:
ORDERID CLIENT ...
2016-02-15 29 0
Expressing dynamic columns in SQL is tedious, so programmers usually turn to a high-level language like Java to compose the dynamic statement. The problem is that Java’s basic class libraries don’t include set operations, which makes it just as difficult to do automatically.
2015-12-15 25 0
Sometimes you need to transpose a database table (or a text file) in Java before exporting it. Different types of transposition require different SQL techniques, and at times you have to resort to low-level programming in Java, which is quite difficult.
As esProc, which serves as a Java class library, supports dynamic scripting...
2015-12-14 68 0
Group operations on tabular data generated from text files include grouping and aggregation, obtaining distinct values, group merging and so on, all of which can be implemented with the basic Java class libraries. But Java provides only limited support for structured-data computing, generating comp...
2015-12-11 22 0
It’s hard to write code for file comparisons, such as finding common values or modified records, or comparing big files, multiple fields or files with different structures, because they generally involve set operations, structured-data handling and multithreaded parallel pr...
2015-12-11 20 0
With their complex formats and non-standardized data, many text files cannot be computed on directly. When used as a data source, they need preprocessing to convert them into structured data or a database table for further querying or statistics. Though we can perform this conversion using high-level languages like Java, or scr...
2015-11-09 45 0
Group operations on tabular data generated from text files include grouping and aggregation, obtaining distinct values, group merging and so on, all of which can be implemented in high-level languages like Java or scripting languages like Python. But these two types of languages provide only limited...
2015-11-07 18 0
In esProc, constants can be stored directly in cells and referenced in expressions. For the basic usage of constants, see esProc Getting Started: Constants. In fact, referencing a constant stored in a cell means that the cell’s value is used as a variable. The parameters and variables in esPr...
2015-10-28 16 0
When working with spreadsheet data, you often need to combine data from multiple sheets. esCalc provides several operations, including join and union, to do this.
1. Join operation
With esCalc, you can perform a join operation to import data from one esCalc sheet into another according to the primary ke...
2015-10-27 22 0
When analyzing and computing data in spreadsheets, you often need to group data according to the computing target. For example, you might group order records by client or manufacturing data by month. In an esCalc spreadsheet, you can use the group operation to divide bands into different groups according to values of ho...
2015-10-23 19 0