esProc Advanced Coding : Clear Cell Values

In esProc, the storage of cellset variables is ubiquitous. Cell values are convenient references during computation, but they could occupy too much memory. Data can be cleared from cells to reduce memory footprint after they accomplish their mission in computation. It should be particularly noted that, when the intermediate data are obtained and further complicated computations are needed, useless cell values should be deleted to reduce memory usage in order to effectively avoid memory overflow.

1. Set cell values as null and reset a table sequence

In the following task, you are asked to list top 200 transaction records of all household appliances orders and food orders according to total order amount, and sort them by product names. Order records come from two text files: Order_Appliances.txt and Order_Foods.txt. First summarize the data in the two sheets, get top 200 order records by total order amount, and then sort them by product names.

	A
1	=file(“Order_Appliances.txt”).import@t()
2	=file(“Order_Foods.txt”).import@t()
3	=A1\|A2
4	=A3.top(-Amount;200).new(ID, PName,Type,Date,Amount)
5	=A4.sort(PName)

Computed results of all cells are as follows:

The table sequence in A1 contains order records of Order_Appliances.txt:

esProc_syntax_clearvalues_2

The table sequence in A2 contains order records of Order_Foods.txt:

esProc_syntax_clearvalues_3

A3 concatenates records of the two table sequences for the use of filtering in the next step.

A4 first selects top 200 order records according to total sales amount and then selects from them the desired fields to generate a new table sequence. As the data need to be sorted by sales amount in descending order, a negative sign is added before the Amount in top() function and thus you get the results sorted by sales amount:

esProc_syntax_clearvalues_4

A5 sorts the top 200 order records in sales amount by product names as required:

esProc_syntax_clearvalues_5

In fact, what we really need in the final result is the data in A5. After A4 gets all the necessary information of order records, the information in original tables in A1 and A2 becomes useless. Deleting these useless data after getting intermediate data can release memory and make the operation steadier.

Therefore, the cellset program can be modified according to the following method:

	A	B	C
1	=file(“Order_Appliances.txt”).import@t()
2	=file(“Order_Foods.txt”).import@t()
3	=A1\|A2	>A1.reset()	>A2=null
4	=A3.top(-Amount;200).new(ID, PName,Type,Date,Amount)	>A3=null
5	=A4.sort(PName)	>A4=null

If the cell value is set as null, the data in the cell would be deleted, as shown by statements in C3 and B4. After statement in C3 clears the cell value from A2 and B4 clears the references of records of food orders from A3, the original table sequences of food orders will be removed from memory.

While T.reset() function in B3, which is a little different, will delete all records from the table sequence but will retain its data structure. After B3 is executed, value of A1 is as follows:

esProc_syntax_clearvalues_7

You can choose the method for clearing cell values as needed. Setting cell values as null is more commonly used. T.reset() is used only when the table sequence’s data structure is worth retaining.

Note that though the statement in B5 sets the values of A4 as null, it cannot reduce memory footprint. Because the result A5 returns is a record sequence in which the record references come from the table sequence in A4, these records cannot be deleted and will be still referenced in A5 even if A4 is set as null. Therefore, when the method of setting cell values as null is to be used, you must find out whether data in the original cells are being in use or not.

In addition, A5 sorts records in A4, but the execution won’t produce new records. What A5 stores is merely the references resulted from sorting records, which have a limited memory footprint and won’t increase memory usage.

2. Clear cell values with clear statement

After a piece of code is executed, probably there is a batch of cells whose values need to be cleared and only the final result is deserved to be retained, like the following cellset shows:

	A	B
1	=file(“Order_Appliances.txt”).import@t()
2	=file(“Order_Foods.txt”).import@t()	=A1\|A2
3	=B2.top(-Amount;200).new(ID, PName,Type,Date,Amount)
4	=A3.sort(PName)
5	>A1=null	>A2=null
6	>B2=null	>A3=null

Similar to the previous example, A4 gets the top 200 records of orders according to the total amounts and sorts them by product names. In order to reduce memory usage after computing, A5, B5, A6 and B6 clear values from the useless cells. Obviously the code for accomplishing this is a little tedious. In esProc, clear command is specially provided to do this job:

	A	B
1	=file(“Order_Appliances.txt”).import@t()
2	=file(“Order_Foods.txt”).import@t()	=A1\|A2
3	=B2.top(-Amount;200).new(ID, PName,Type,Date,Amount)
4	=A3.sort(PName)
5	clear A1,A2,B2,A3

The effect of executing this cellset is the same as the above. The clear command in A5 can remove values from multiple cells at a time. In addition, the clear command can delete cell values by specifying an area of cells. For example:

	A	B
1	=file(“Order_Appliances.txt”).import@t()
2	=file(“Order_Foods.txt”).import@t()	=A1\|A2
3	=B2.top(-Amount;200).new(ID, PName,Type,Date,Amount)
4	=A3.sort(PName)
5	clear A1:B3

In the clear command in A5, A1:B3 represents a rectangular area from A1 to B3, including A1, B1, A2, B2, A3 and B3. The clear command will clear all these cells of values. You are allowed to write the command using single cells and the an area of cells together, for example, you can write the clear command in A5 as clear A1:B2,A3, and get the same result.

Another cellset:

	A	B
1	[Order_Appliances.txt,Order_Foods.txt]	=[0]*A1.len()
2	for A1	=file(A2).import@t()
3		=B2.top(-Amount;100)
4		=B3.sort(PName)
5		>B1(#A2)=B4
6	=B1.merge(PName)
7	=A6.new(ID, PName,Type,Date,Amount)

The operation is a little different. It imports data from the two files according to the file names in A1, selects the top 100 records from each file according to order amounts, sorts them respectively by product names and stores the results in B1 respectively as a member of the sequence. As the results have been sorted, they will be merged in A6 by product names. The final result of table sequence can be seen in A7:

esProc_syntax_clearvalues_12

In this piece of code, the useless cell values also need to be deleted after A7 computes the final result. Besides by specifying cells or a cell area, you can also clear cell values by specifying a master cell, as shown in the following cellset:

	A	B
1	[Order_Appliances.txt,Order_Foods.txt]	=[0]*A1.len()
2	for A1	=file(A2).import@t()
3		=B2.top(-Amount;100)
4		=B3.sort(PName)
5		>B1(#A2)=B4
6	=B1.merge(PName)
7	=A6.new(ID, PName,Type,Date,Amount)	clear A1:B1,A2:,A6

In B7’s clear command, A2:’s writing style is similar to a cell area’s, but it doesn’t specify the end cell. This means clearing values from all the cells in a code block whose master cell is A2; that is, deleting values from the cells from the second to fifth line. After the code is executed, all the cell values are cleared except the result of A7.

1. Set cell values as null and reset a table sequence

2. Clear cell values with clear statement

Hi，You need to fill in the Username and Email！