Round-off Error Control with esProc

Blog 2338 0

Numerical computations in data reporting are often done with some degree of accuracy by rounding off numerical values to their nearest numbers, which is one type of the rounding operations. Yet this can result in data inconsistency. And in that case, the rounded numbers need to be adjusted to restore the consistency. This adjustment procedure is known as the round-off error control. Here let’s look at how to handle the round-off errors in esProc.

Rounding to nearest number causes an error. Detailed data in the report may not agree with totals because of independent rounding to a certain degree of accuracy and the accumulation of round-off errors. For example, the original equivalence relation 4.5+4.5=9.0 may become 5+5=9 after each number is rounded to the nearest integer. In this case the rounded results of the detailed data should be tuned to restore the equivalence with the sum unchanged, like 4+5=9 – which is a way of dealing with this round-off error.

1. Linear round-off error control

A simple case of round-off error control in statistics is tuning values only in the detailed data level according to the total error if they are used for one total. For example:

  A B C
1 [1.48,0,1.42,0.32,6.48,0.98,1.39] =A1.sum()  
2 =A1.(round(~)) =round(B1) =A2.sum()

A1’s sequence stores certain numbers, whose sum is calculated in B1, as shown below:

esProc_round_off_control_1

Then a rounding operation is performed for statistics. Each number of the sequence is rounded to an integer with the round function to get a new sequence in A2. And the sum obtained in B1 is rounded in B2 to an integer, which is the expected total for the rounded numbers. C2 calculates the sum of the rounded numbers. Here’re results of A2, B2 and C2:

esProc_round_off_control_2

Now the inconsistency arises because the sum of the rounded numbers is 10, instead of 12. Adjusting the rounded sum to 10 is unviable because the final result will be wrong. In order to retain the original consistency, it is the rounded results of the original numbers that need to be adjusted.

The difference between the rounded total value of the unrounded numbers and the total value of the rounded numbers is called round-off error. The round-off error control is to eliminate the error. There are many methods of doing this, which will be discussed in the following.

(1) Distribute round-off error to the first number. For example:

  A B C
1 [1.48,0,1.42,0.32,6.48,0.98,1.39] =A1.sum()  
2 =A1.(round(~)) >A2(1)+=round(B1)-A2.sum() =A2.sum()

B2 distributes the error to the first rounded number. Then C2 calculate the sum again. Here’re results of A2 and C2:

esProc_round_off_control_3

This is the simplest approach to handling round-off error. By doing so, the first number 1.48 becomes 3. But this isn’t as reasonable as it seems. When there are many numbers, the round-off error is relatively big and the method will push the first number too far away from its origin.

(2) Distribute round-off error among the numbers according to absolute values in descending order using the base adjustment value.

The base adjustment value is the unit of the lowest degree of accuracy for the rounded numbers. It is either positive or negative depending on the sign of the error. For the one-digit accuracy integers after rounding, if the sum decreases, the rounded numbers need to be increased by 1; if the sum increases, the rounded numbers need to be decreased by -1.

With this method, we can resolve the inconsistency as follows:

  A B C
1 [1.48,0,1.42,0.32,6.48,0.98,1.39] =A1.sum()  
2 =A1.(round(~)) =round(B1)-A2.sum() =sign(B2)
3 =A1.psort@z(abs(~)) >abs(B2).run(A2(A3(#))+=C2) =A2.sum()

C2 finds the base adjustment value. A3 gets a sequence of the original positions of the unrounded numbers sorted by absolute value in descending order. B3 loops through the rounded numbers to distribute the error using the base adjustment value according to the absolute values of the corresponding unrounded members. After that C3 recalculate the sum. Now the results of A2 and C3 are as follows:

esProc_round_off_control_4

This method changes the numbers as little as possible by distributing the round-off error in order between/among numbers ordered by absolute value in descending order. As with this example, the round-off error is successfully eliminated by rounding 1.48 up to 2 and 6.48 up to 7. But the efficiency will be compromised as a sorting operation by absolute value is specially required.

(3) Distribute round-off error among non-zero numbers using the base adjustment value.

As discussed above, the round-off error can be shared by numbers in order by absolute value. Since rounding to the nearest number won’t cause error to a zero and it’s too conspicuous to change a zero to a non-zero number, the method can be simplified by selecting only the first non-zero numbers for the distribution of the error while leaving zeros alone.

With this method, we can control the round-off error as follows:

  A B C
1 [1.48,0,1.42,0.32,6.48,0.98,1.39] =A1.sum()  
2 =A1.(round(~)) =round(B1)-A2.sum() =sign(B2)
3 =A2.pselect@a(~!=0) >abs(B2).run(A2(A3(#))+=C2) =A2.sum()

A3finds the sequence numbers of the non-zero members in the original sequence. B3 adjusts their rounded results sequentially to distribute the error. Here’re results of A2 and C3:

esProc_round_off_control_5

In this case, the error is controlled successfully by rounding 1.49 up to 2 and 1.42 up to 2. Without the sorting operation, the method achieves a relatively high efficiency and is the most commonly used.

In many cases, linear round-off error control handles not one sequence of numbers whose sum is calculated, but a batch of records that each needs an aggregate value, like the following table sequence of SalesRecord.txt:

Name      Jan  Feb  Mar Apr

Allen      26106     49637     27760     33829

Billy       56611      50588     54765     76072

Charlie    21249     96825     28645     55958

Daisy      3413       49069     6279       98247

Flora       7590       12072     90034     64252

The task is to calculate the total sales amount in the four months for each salesperson. The amount will be truncated at the hundreds place and the resulted round-off error should be eliminated. Below is the program for performing the task:

  A B C D
1 =file(“SalesRecord.txt”).import@t() =A1.derive(Jan+Feb+Mar+Apr:Sum) =B1.derive()  
2 >5.(C1.field(#+1,C1.field(#+1).(round(~/1000)))) =C1.derive(Jan+Feb+Mar+Apr:Sum2)    
3 for B2 >func(A5,A3)    
4 =B2.derive(Jan+Feb+Mar+Apr:Sum3)      
5 func =A5.Sum-A5.Sum2 =abs(B5) =sign(B5)*1
6   for C5 =A5.field(1+B6) >A5.field(1+B6,C6+D5)

A1 imports the file as a table sequence and B1 adds a Sum field to it:

esProc_round_off_control_6

C1 copies B1’s table sequence. A2 rounds values in the fields from the 2nd to the 6th by truncating them at the hundreds place. But this could cause inconsistencies. B2 then adds a Sum2 field to calculate the total sales amount during the four months using the rounded values. Now B2’s table sequence is as follows:

esProc_round_off_control_7

Not all values of Sum field are equivalent to those of Sum2 field. Round-off error control is thus required. It is still in linear direction since each value is used for only the total of the corresponding salesperson, even though it should be performed across the whole table sequence.

A5’s subroutine handles the round-off error control of each record. For each execution, B5 calculates the round-off error and D5 finds the base adjustment value; and then B6 runs a loop to simply distribute the error to values in a record in order without determining whether a value is zero or not.

A3 loops over B2’s records to remove the round-off error for each of them. When this loop finishes, A4 adds a Sum3 field to the table sequence to recalculate the sums, checking if the consistency is restored. Here’s A4’s result:

esProc_round_off_control_8

A comparison between Sum and Sum3 shows that the consistency between detailed data after rounding and the rounded total values has been recovered through the round-off error control.

2 Round-off error control in two perpendicular directions

It would be a much more complicated situation if totals and the grand total in both rows and columns are needed. Thus the round-off error control needs to keep agreement not only between the detailed data and the totals but also between the totals and the grand total in both directions. This is the round-off error control in two perpendicular directions. For example, the total for each month in the above SalesRecord.txt is needed. Here’s the code:

  A B C
1 =file(“SalesRecord.txt”).import@t() >A1.insert(0,”Total”) >4.(A1.m(-1).field(#+1,A1.field(#+1).to(, 5).sum()))
2 =A1.derive(Jan+Feb+Mar+Apr:Sum) =A2.derive() >5.(B2.field(#+1,B2.field(#+1).(round(~/1000))))
3 =B2.derive(Jan+Feb+Mar+Apr:Sum2) =A3.derive(Sum2-Sum:Diff) >B3.insert(0,”Total2″)
4 >5.(B3.m(-1).field(#+1, A3.field(#+1).to(,5).sum())) >B3.insert(0,”Diff”) >5.(B3.m(-1).field(#+1,B3(6).field(#+1)- B3(7).field(#+1)))

A1 imports the file as a table sequence. B1 appends a record of Total and C1 calculates the total sales amount in each month. A2 adds a Sum field to calculate the total sales amount for each salesperson and their grand total. Here’s A2’s result:

esProc_round_off_control_9

C2 truncates each value in the generated table sequence at the hundreds place. A3 adds a Sum2 field to find total for each salesperson using the rounded values. B3 adds a Diff field to find the error in each record. C3 and B4 respectively append a record to calculate the monthly totals and the errors. Here’s B3’s result:

esProc_round_off_control_10

You see the round-off error control becomes complicated with totals required in both the horizontal and vertical directions. A modification to any rounded value will affect the totals in both directions. The process of dealing with this issue is the mentioned round-off error control in two perpendicular directions. In the above table, certain round-off errors are linked only to the totals, like the Diff value in the record of Total that is related to the monthly totals. That is the total-related round-off error. In a table needing round-off error control in both rows and columns, such an error exist in both the horizontal and the vertical directions and all other round-off errors are non-total-related round-off errors linked only to the detailed data, like the Diff value in the Feb field.

Below are cases of round-off error control in two perpendicular directions from easy to difficult.

(1) The non-total-related round-off errors have the same signs in both directions. For example:

esProc_round_off_control_11

The table stores detailed data in two rows and two columns, with totals calculated in each row/column and the grand total obtained. In the following the all values are rounded to the nearest integers and the round-off error in each row/column is calculated:

esProc_round_off_control_12

Here the round-off errors involve only the detailed values. It’s no need to adjust the totals and the grand total. In this table there’re inconsistencies in the first row and the first column where the rounded total is greater than the total of the rounded detailed values. The solution is to adjust the value at their intersection by adding or subtracting the base adjustment value depending on the round-off error signs. For this case, we can resolve the inconsistency by adding 1 to the rounded value at the intersection of the first row and the first column:

esProc_round_off_control_13

(2) The non-total-related round-off errors with same direction but opposite signs. For example:

esProc_round_off_control_14

Round the above values to the nearest integers and calculate the round-off error in each row/column:

esProc_round_off_control_15

It’s still no need to adjust the totals. Since the round-off errors in the 1st and 2nd columns have the opposite signs, we just need to adjust the values in any non-error detailed rows by adding or subtracting the base adjustment value depending on the corresponding round-off error sign. Here we select the 1st row to eliminate the inconsistency by adding 1 to the rounded value in the 1st column and subtracting 1 from the rounded value in the 2nd column:

esProc_round_off_control_16

(3) The total-related round-off error and the non-total-related round-off error with different directions and opposite signs. For example:

esProc_round_off_control_17

Round the above values to the nearest integers and calculate the round-off error in each row/column:

esProc_round_off_control_18

For this type of cases, the solution is to adjust the total at the intersection by adding or subtracting the base adjustment value depending on the total-related round-off error sign. Here according to the total-related round-off error sign in the horizontal direction, we subtract 1 from the total of the first column:

esProc_round_off_control_19

(4) The total-related round-off error and the non-total-related round-off error with same direction and same signs. For example:

esProc_round_off_control_20

Round the above values to the nearest integers and calculate the round-off error in each row/column:

esProc_round_off_control_21

The total-related round-off error in the vertical direction has the same sign as the non-total-related round-off error in one column. The solution to this type of cases is to adjust the detailed value and the total value in any non-error row. Here we adjust the value in the 1st column of the 1st row by and the total value in the 1st row by adding 1 to them respectively:

esProc_round_off_control_22

(5) The total-related round-off errors with same signs but different directions. For example:

esProc_round_off_control_23

Round the above values to the nearest integers and calculate the round-off error in each row/column:

esProc_round_off_control_24

In this case, the inconsistency involves only the totals. The solution is to first adjust any detailed value by adding or subtracting the base adjustment value depending on the round-off error sign and then alter the totals in both horizontal and vertical directions. For this case we choose the value in the 2nd column of the 2nd row, adding 1 to it according to the round-off error sign while adding 1 to the totals in the 2nd row and the 2nd column respectively:

esProc_round_off_control_25

Alternatively, we can adjust the value in the 2nd column of the 1st row and the consistency can be obtained equally:

esProc_round_off_control_26

Only the above five types of round-off errors in the two perpendicular directions are controllable. Other round-off errors are the results of computational errors, which can’t be corrected by only a single adjustment. However, simultaneous appearance of several types of errors is common in real-world businesses. Facing such a situation, we can handle them in the order in which they are arranged in the above. (1) Eliminate the non-total-related round-off errors in a row and a column with the same signs. (2) Eliminate the non-total-related round-off errors in rows or columns with opposite signs; with this step done, the round-off errors in the non-total row and column can only have opposite signs. (3) and (4) Eliminate the non-total-related round-off error in a row/column or/and the total-related round-off error in a row/column. If there’re still the round-off errors in both directions, remove them with the method given in (5). That’s the procedure of round-off error control for ordinary tables.

Now back to the sales record table introduced at the beginning of the section. The following program is used to handle the round-off errors for it:
esProc_round_off_control_31
The program seems a little complicated, and the explanation runs as follows. A26’s subroutine modifies a record by adding the required base adjustment value to a value at the specified position. Since the Name field in the original table sequence doesn’t engage in the computation, it is skipped during the data preparation. C5 and D5 get a sequence of round-off errors in the horizontal direction and in the vertical direction respectively. Line 6 and 7 handle the round-off errors with method 1 that the rounded value at the intersection will be adjusted if the errors in both directions have the same signs. Lines of 8~13 perform the round-off error control for both the horizontal and vertical directions with method 2 that the rounded values in the row/column will be adjusted if the errors in the same direction have opposite signs. Lines of 14~17 eliminate the round-off errors with method 3 that total at the intersection will be adjusted if the total-related round-off error and the non-total-related round-off error in different directions have opposite signs. To deal with the round-off error, lines of 18~23 use method 4 that the rounded value and the total in a row/column will be adjusted if the total-related round-off error and the non-total-related round-off error in the same direction have same signs. After all these steps, line 24 and 25 check whether there’re total-related round-off errors in both directions. If the result is true, adjust the rounded result of the first value and the total values of the first row and column.

After the program finishes execution, the final result can be viewed in B2:

esProc_round_off_control_27

Below is the procedure of the round-off error control:

esProc_round_off_control_28

Step 1: Eliminate the non-total-related round-off errors in both directions with the same signs:

esProc_round_off_control_29

At this point, the round-off errors in each direction have the same signs. So skip step 2 to move on to the next step. But there isn’t a total-related round-off error and a non-total-related round-off error that in different directions that have opposite signs.

Then go on to check if there’s a total-related round-off error having the same sign as the non-total-related round-off error in the same direction. Here’s the result of performing step 4:

esProc_round_off_control_30

Now all round-off errors are zero, showing that the round-off error control has completed successfully and consistencies have been restored in both directions. 

FAVOR (1)
Leave a Reply
Cancel
Icon

Hi,You need to fill in the Username and Email!

  • Username (*)
  • Email (*)
  • Website