In a spreadsheet, we often need to re-sort data so that it is organized for the computation at hand. The esCalc sort operation sorts data efficiently. At times, though, the desired order is not a simple ascending or descending one. In those cases the align operation can be used to sort data in a speci...
2015-10-20 29 0
esProc External Memory Computing: Merge and Join Cursor Data discussed how to join several big data tables and retrieve data from them. Although the cursor makes big data convenient to handle, it is restricted to one-way retrieval from front to back. Hence data sorting becomes the pr...
2015-10-16 18 0
From esProc Parallel Computing: Data Redundancy, we know that for cluster computation data files can be stored on separate servers, with a data redundancy solution adopted. The parallel program then finds the appropriate node for a subtask according to where the needed files are stored. The fact is, during real wo...
2015-10-13 21 0
Locating desired data and filtering data to delete or hide the unwanted data are common data manipulation operations on a worksheet. esCalc offers the locate operation to find the homo-cells that satisfy your criteria in a sheet, and the filter operation to filter away the homo-bands that don’t satisfy them. Here we’ll focus our ...
2015-10-12 18 0
Data in an esCalc worksheet is stored in hierarchical bands, allowing it to be analyzed through various operations such as grouping, filtering and sorting. This article deals with row expansion. The expand operation allows you to make copies of a record based on a sequence-type cell value. It is used to complete data e...
2015-10-10 16 0
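As a rough illustration of the row expansion idea described in the excerpt above, the following Python sketch makes one copy of a record per element of a sequence-valued field. It is not esCalc's implementation; the function name `expand` and the sample fields `order` and `items` are hypothetical and chosen only for this example.

```python
def expand(records, field):
    """Return one copy of each record per element of its sequence-valued field.

    Hypothetical helper: it only mimics the idea of copying a record
    based on a sequence-type value, not esCalc's actual expand operation.
    """
    expanded = []
    for record in records:
        for value in record[field]:
            row = dict(record)   # copy the record
            row[field] = value   # replace the sequence with a single member
            expanded.append(row)
    return expanded


# Made-up sample data for illustration only.
rows = [
    {"order": "A1", "items": ["pen", "ink"]},
    {"order": "A2", "items": ["paper"]},
]
result = expand(rows, "items")
```

Here `result` holds three rows: two copies of order `A1` (one per item) and one copy of order `A2`.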
esProc Parallel Computing: Cluster Computing explains how to use cluster computing to process massive data or handle tasks involving a large amount of computation.
1. The problem with parallel computing
Parallel computing requires that the dfx file to be invoked be stored in all nodes, and that the data files the subta...
2015-10-09 21 0
In SQL, we can usually group a table automatically only by its own field(s). When the grouping criterion comes from another table, or is an external parameter or a conditional list, SQL has to handle the grouping in a very roundabout way. Some cases even require dynamic criteria, which need to be generate...
2015-09-29 20 0
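To make the idea of grouping by an external criterion concrete, here is a minimal Python sketch that groups records against a list supplied from outside the data. It is not the esProc approach the post goes on to describe; `group_by_list`, the field name `region` and the sample values are assumptions made for this example.

```python
def group_by_list(records, key, criteria):
    """Group records by an external list of key values.

    Groups are returned in the order of `criteria`. A criterion with no
    matching record yields an empty group, and records whose key is not
    in `criteria` are left out.
    """
    groups = {value: [] for value in criteria}
    for record in records:
        if record[key] in groups:
            groups[record[key]].append(record)
    return [groups[value] for value in criteria]


# Made-up sample data for illustration only.
sales = [
    {"region": "East", "amount": 10},
    {"region": "South", "amount": 7},
    {"region": "West", "amount": 5},
    {"region": "East", "amount": 3},
]
grouped = group_by_list(sales, "region", ["East", "West", "North"])
```

Because the criteria list drives the result, the output always has one group per criterion in the given order, which is the part that is awkward to express with a plain SQL `GROUP BY`.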
SQL can usually sort data only by one or more specified fields. When it comes to sorting by a list, the only choice is to use decode or union. But with a long list, the SQL statement will be lengthy. If the items of the list are parameters representing values that are not fixed, usually a temporary table needs to be create...
2015-09-25 20 0
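Sorting by a list, as discussed in the excerpt above, can be sketched in Python by mapping each list item to its position and sorting on that rank. This is only an illustration of the idea, not esProc code; `sort_by_list`, the field `city` and the sample values are hypothetical.

```python
def sort_by_list(records, key, order):
    """Sort records so that their key values follow the given list.

    Records whose key is not in `order` are placed at the end; Python's
    sort is stable, so ties keep their original relative order.
    """
    rank = {value: position for position, value in enumerate(order)}
    return sorted(records, key=lambda record: rank.get(record[key], len(order)))


# Made-up sample data for illustration only.
offices = [
    {"city": "Paris"},
    {"city": "Tokyo"},
    {"city": "Lima"},
    {"city": "Oslo"},
]
ordered = sort_by_list(offices, "city", ["Tokyo", "Lima", "Paris"])
```

The list can be passed in as a parameter, so a changing list needs no temporary table or rewritten statement.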
In esProc it is convenient to establish and close a connection to MongoDB, and to call the database to query and count data and to perform distinct and aggregate operations.
1. Preparation for connecting to MongoDB
The MongoDB Java driver (like mongo-java-driver-2.12.2.jar), which esProc doesn’t provide, should b...
2015-09-18 16 0
Below is a selection of Collection C1:
You need to group the collection by name. Each group contains the users field of the documents that share the same name and does not allow duplicate members. The expected result may look like this:
2015-09-16 32 0
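The grouping described above can be sketched in plain Python as follows. The original sample of Collection C1 and its expected result are not shown in this listing, so the documents below are made up, and `group_users` is a hypothetical helper rather than the esProc or MongoDB solution from the post. The sketch assumes each document has a `name` field and a list-valued `users` field.

```python
def group_users(docs):
    """Merge the users lists of documents sharing a name, without duplicates.

    Assumes each document has a `name` field and a list-valued `users`
    field. First-seen order of the members is preserved.
    """
    grouped = {}
    for doc in docs:
        members = grouped.setdefault(doc["name"], [])
        for user in doc["users"]:
            if user not in members:
                members.append(user)
    return grouped


# Made-up documents standing in for Collection C1.
c1 = [
    {"name": "alpha", "users": ["u1", "u2"]},
    {"name": "beta", "users": ["u3"]},
    {"name": "alpha", "users": ["u2", "u4"]},
]
merged = group_users(c1)
```

With this sample, `merged["alpha"]` contains `u1`, `u2` and `u4` once each, even though `u2` appears in two documents.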
In this article, we’ll test the performance of esProc in handling in-memory small data computing, and compare it with that of Oracle when performing the same computation.
The test involves two cases: normal simple computing and complicated related computing:
The test data used in normal computing is order information, as ...
2015-09-14 21 0
Parallel computing allows executing a computation on big data by distributing it to nodes. If the data distributed to each node is still in large volume, it can be returned from the node with the remote cursor. We’ll now learn the usage of the remote cursor as well as its features.
1. The usage
The servers the remote c...
2015-09-11 28 0
Two kinds of data file, the normal text file and the binary file, are most commonly used in esProc. The binary file uses a compressed encoding with low CPU consumption, so it takes less space than an uncompressed text file and can be read more efficiently. Thus, we can conclude that the b...
2015-09-10 21 0
Indexed Sequences in esProc discusses how to create an index of a table sequence or a text file in esProc to greatly speed up queries on the desired fields. But how do we deal with a data-intensive file whose index table cannot be loaded into memory? Here let’s look at how to increase the speed of big data handling by c...
2015-09-09 14 0
esCalc Spreadsheet Editing: Bands explained how to edit the structure of bands in an esCalc table. Once a band’s structure is decided, you can add, delete or move records and fill them with data.
1. Bands and records
In esCalc, a record is defined as a band that can have homo-bands of the same level and with the sam...
2015-09-06 20 0