esCalc Operations: Distinct and Group Operations

During data analysis and computing with spreadsheets, you often need to group data according to the computing target. For example, group order records by clients or the manufacturing data by months. In an esCalc spreadsheet, you can use the group operation to divide bands into different groups according to values of homo-cells. One of its related operations is distinct operation, which includes removing records with duplicate values and other choices, according to whether or not values in homo-cells are the same. For example, calculate the number of different clients and of the states where employees come from. Here we’ll discuss the use of the two operations in esCalc.

1. Distinct operation

Choose Operation>Distinct on the right-click context menu to perform the distinct operation, which handles data in the homo-cells, including the selected cell, under the same parent row based on its uniqueness or nonuniqueness. esCalc also provides Operation>Distinct on the menu bar and the shortcut key Ctrl+Alt+I to do this.

This is an employee table:

esCalc_operation_distinct_group_1

The employees work in different departments and come from different states. Now let’s rearrange data by departments through distinct operation to find the departments where employees are working.

Choose D2 and perform the distinct operation, and a Distinct window, as shown below, will pop up:

esCalc_operation_distinct_group_2

The window offers four ways of implementing the operation: Merge same value – which merges records with the same values, Distinct – which removes records with duplicate values, Delete same value – which removes records with the same values, and Delete unique value – which removes records with unique values. Besides, there is the option Adjacent only, which compares only the neighboring values when deciding whether records have same values or not.

First let’s choose Distinct to perform the operation and the result is as follows:

esCalc_operation_distinct_group_3

By comparison with the original table, you may find that you’ve got 3 different departments through performing Distinct and only the record of the first employee in each department has been retained, and all the other records have been deleted.

In the original table, still select D2 and perform Distinct by using the shortcut key Ctrl+Alt+I and checking Adjacent only:

esCalc_operation_distinct_group_4

Here’s the result:

esCalc_operation_distinct_group_5

Compared with the original table, you can find that the operation with Adjacent only has only retained the first of the neighboring records with same values and kept the nonadjacent records with the same values unchanged.

The distinct operation will get the same result whichever homo-cell is chosen. Let’s find the states that have only one employee from the original table. To do this you can choose E10 and perform Delete same value:

esCalc_operation_distinct_group_6

Here’s the result:

esCalc_operation_distinct_group_7

So only the state of Florida has one employee and all the other states have multiple employees. This type of distinct operation has deleted records with duplicate state values.

Delete unique value is the inverse of Delete same value. It deletes records without duplicate values. If you choose E10 in the original table and perform Delete unique value, the result will be as follows:

esCalc_operation_distinct_group_8

Both Delete same value and Delete unique value operations can work with Adjacent only to compare only the neighboring records. Suppose you choose E10 in the original table and perform Delete unique value, with Adjacent only checked:

esCalc_operation_distinct_group_9

Then you’ll get the following result:

esCalc_operation_distinct_group_10

The operation has deleted all employee records whose adjacent records haven’t the same state value.

Look at another employee table in which the detail employee data is placed under master rows holding department names:

esCalc_operation_distinct_group_11

Now choose D2 and perform Merge same value according to the department names in the master rows:

esCalc_operation_distinct_group_12

Here’s the result:

esCalc_operation_distinct_group_13

Similar to the Distinct, the Merge same value type of distinct operation only retains the first band among the duplicates. The sub-records under the duplicate bands, if any, will be merged into the first band.

2. Group operation

The group operation can be performed by choosing Operation>Group on the right-click context menu. It performs equal grouping on the selected cell’s homo-cells, assigning their unique values to the master cells of the newly added group rows. You can also perform the operation through Operation> Groupon the menu bar or the shortcut key Ctrl+Alt+G.

The following is the first employee table used in the preceding section:

esCalc_operation_distinct_group_14

Group the data by departments where employees are working. To do this you can choose D2 and perform the group operation, and a Group window will appear:

esCalc_operation_distinct_group_15

On it there are two options: Sort before group and Regroup. If you choose the former option, then you can check Use Locale for the sorting. esCalc also provides two ways of performing a group – Group for grouping data and Ungroup for dismissing the existing groups. Like Dismantle master group operation explained in esCalc Spreadsheet Editing: Bands, The Ungroup will dismantle the selected group level.

By default the group operation will carry out Sort before group. So after Group is executed, you’ll get the following result:

esCalc_operation_distinct_group_16

The operation divides data into 3 groups according to the department name in D2 and place employee records in corresponding departments. In each department, the order of employee records is the same as that in the original table. The newly added department rows have the initial appearance properties, and the department data is stored in cells of column D, which are set as the master cells of the bands of departments. Since data is already sorted before grouping, the order of groups is different from the original order of department values. This is different from the case of distinct operations. But they do have one common point – whichever homo-cell you choose for performing a group, the result will be the same.

You can modify the group rows in the above cellset as follows:

esCalc_operation_distinct_group_17

The appearance of group rows has changed, their master cells have become B2 and its homo-cells, and a formula =={B3}.count() has been entered in D2 to calculate the number of employees in each department.

Now let’s regroup the data by the states where employees come from. To do this you can choose E3. As the row where E3 sits has already had a master row, you should check Regroup on the Group window, as well as Sort before group:

esCalc_operation_distinct_group_18

Below is the result of regrouping:

esCalc_operation_distinct_group_19

The result shows that the row where the selected cell stays and its homo-rows have been reorganized, even though they were not under the same parent row. All the rows were first sorted and then regrouped by the states. Notice that cells of the new parent rows in the resulting cellset have the same appearance as those in the old one, and use the same master cells. D2’s formula is still there in the new parent row. The formula and those in D2’s homo-cells still calculate the number of records in each group. Of course this time the result is the total number of employees coming from each state. Additionally, all the cells in the new parent rows have been cleared of values, and thus related calculation cells and calculation cells have re-calculated.

Unlike the first grouping, the regrouping doesn’t assign group values to the new parent rows. This is because the parent rows may hold formulas, and the value-assigning could damage the existing data. Therefore you need to set the group values manually after the regrouping is done. Whereas setting master cells and group values automatically won’t have such a problem during the first grouping when the parent rows are all newly generated.

But what could you get if you want to regroup the data in the above cellset by genders, without checking Regroup? Choose C3 and perform the group operation:

esCalc_operation_distinct_group_20

Here’s the result you’ll get:

esCalc_operation_distinct_group_21

As can be seen from the result, now that the data has been grouped by the states, grouping it again by genders means adding a sub-layer to the grouping criterion, that is to say, grouping data in each state further. The newly added level of gender groups get group values automatically and, meanwhile, the cells where they are held are set as the master cells at the new level. esCalc allows every band to have a homo-cell set as the master cell. The sorting by genders before this regrouping also took place within each state.

Finally, let’s look at the case where no auto-sorting happens before grouping data. The following is the original table

esCalc_operation_distinct_group_22

Choose D2 to group data by departments, with Sort before group unchecked:

Here’s the result:

esCalc_operation_distinct_group_23

You can see that the operation just put the neighboring records in which the department values are the same into the same group. Most of the time this may produce duplicate group values.

1. Distinct operation

2. Group operation

Hi，You need to fill in the Username and Email！