category:External Memory

esProc External Memory Computing: Virtual Table

esProc External Memory Computing: Merge and Join Cursor Data discussed how to join several big data tables and retrieve data from them. Although the cursor makes it convenient to handle big data, it restricts retrieval to a single pass from front to back. Hence data sorting becomes the pr...

2015-10-16 978 0 0

esProc External Memory Computing: Binary Files

Two kinds of data files, the ordinary text file and the binary file, are most commonly used in esProc. The binary file adopts a compressed encoding with low CPU consumption, so it takes less space than an uncompressed text file and can be read more efficiently. Thus, we can conclude that the b...

2015-09-10 1178 0 0

esProc External-Memory Computing: Indexed Files

Indexed Sequences in esProc discusses how to create an index on a table sequence or a text file in esProc to greatly speed up queries on the desired fields. But how do we deal with a data-intensive file whose index table cannot be loaded into memory? Here let’s look at how to increase the speed of big data handling by c...

2015-09-09 976 0 0

esProc External Memory Computing: Aggregate Operations with Cursor

A big data table usually holds too much data to be retrieved all at once. In view of this, processing a big data table usually serves one of two purposes: retrieving part of the data each time with cs.fetch(), or grouping and aggregating the data in th...

2015-07-29 1023 0 0

esProc External Memory Computing: Merge and Join Cursor Data

During data computing based on table sequences, we can combine data from multiple table sequences for use in analysis and computation. For instance, use A.merge() to combine records of multiple table sequences in a certain order, or A.conj() to union them in order into one large table, or JOIN functions, ...

2015-07-28 1139 0 0

esProc External Memory Computing: Cross-cellset Cursor

In the article esProc External Memory Computing: Concept of Cursor, we only touched on the basic usage of the cross-cellset cursor. Here we’ll delve into more issues about it. 1. Basic usage The cross-cellset cursor is typically used to handle big data analysis and computing, but it doesn’t impose a minimum limit on da...

2015-02-06 1087 0 0

esProc External Memory Computing: Group Cursor

During big data computing, besides data traversal, grouping, and aggregation, we sometimes need to retrieve one group of data at a time for analysis. For example, we may analyze sales data by date, plot a sales curve for each product, or study the purchasing habits of each client. 1. Fetch data by groups according to th...

2015-01-30 885 0 0

esProc External Memory Computing: Basic Usages of the Cursor

esProc supports importing big data in batches with the cursor, which is the usual method in big data computing. The usage of the different cursors, including the external file cursor, the database cursor and the in-memory record sequence cursor, is basically the same. This article takes the external file cursor as an example to explain ...

2015-01-28 1171 0 0

esProc External Memory Computing: Concept of Cursor

The concept of the cursor is very important in databases. With a cursor, data can be manipulated more flexibly and returned from the data table row by row. esProc supports many types of cursors, such as the database cursor, the file cursor and the in-memory record sequence cursor, to satisfy various needs in data fetching and processin...

2015-01-23 1100 0 0

esProc External Memory Computing: Text Files

Sources of data used for analysis usually fall into two categories: the database source and the file source. Compared with database data, file data are simple to deploy and publish. The problem is that, since file data generally need to be used as a whole and thus need to be loaded into memory all at on...

2015-01-20 1043 0 0

esProc External Memory Computing: Principle of Sorting

It is common to sort the records of a table during data analysis and computing. In esProc, the sort function is used to sort data in a sequence or a table sequence. External memory sorting is required when the data being sorted are massive and cannot be loaded into memory all at once, for the ordinary sorting method cannot ...

2015-01-13 1010 0 0

esProc External Memory Computing: Principle of Grouping

After data are imported from a data table, we often need to group them as required, or work out a grouping and summarizing result. In esProc, the groups function is used to compute the result of data grouping and summarizing; or the group function can be used to first group the data, then perform further analysis and computation...

2015-01-06 1110 0 0