1. Testing purposes
Test esProc and Oracle on the same hardware to compare their single machine performances in carrying out a single big task and small concurrent tasks.
2. Testing contents and methods
Data volume:
Small data volume: Single fact table around 10G. To prevent the testing result from being affected by the operating system cache, different fact tables are accessed with multiple concurrent requests.
Large data volume: Single fact table around 100G.
Algorithms:
Several typical SQL algorithms, including data scan, join, data grouping with small result set and with large result set. The use of these simple algorithms is to make the comparison more intuitively and easier to understand, not because esProc and SQL are identical to each other. In fact, the two of them focus on different things. esProc is good at procedure computation for tasks with complicated business logic, while SQL is suitable for computations of average complexity. A complicated algorithm that can be realized with different SQL execution plans makes execution paths uncontrollable and lacks comparability. So the testing doesn’t include one.
Use Cases:
Multiple use cases are designed for testing each algorithm according to table widths and data types.
Among them, the purpose of the use cases of data grouping with large result set is to test the performances of Oracle and esProc in handling the scenarios where the result sets of the group operations are too large to be entirely accommodated by the memory. So this report tests the parallel processing of a task with an amount of 100G data, not the concurrent processing of small tasks.
Note: An appendix of esProc Oracle Single Machine Performance Comparison Process is added to this report. Please refer to it for details such as data structures, test code, and test reproduction.
3. Testing Environment
Testing Machine：Dell Power Edge T610
CPU：Intel Xeon E5620*2
RAM：20G
HDD：Raid5 1T
Operating System：CentOS 6.4
JDK：1.6
Oracle Version：11g
esProc Version:3.1
4. Data Description
The volume of the data is decided by the amount allowed by the exported text file.
For the single machine test, esProc uses the proprietary file format.
As the purpose is mainly to test the performance of big data computation and full table scan performance, no primary key or index is built for any table in the database. The primary key field described in data structure isn’t a physical key but a logical one defining that data repetition isn’t allowed.
4.1Data tables and table association
Fact Tables T1, T11, T12, and T13
They are wide data tables that each has 100 fields to simulate the realworld fact tables with a large number of fields. The four tables have identical structures. T11, T12, T13 are used for small concurrent accesses to different tables to avoid system cache.
Fact Tables T2, T21, T22, and T23
These are narrow data tables having identical structures and 11 fields for simulating the realworld facts tables with a small number of fields. T21, T22, T23 are used for small concurrent accesses to different tables to avoid system cache.
Fact tables are the main data source for this test, and are used in data scan, data grouping, and join operations. Tests are done for both large and small data volume, depending on the numbers of records inserted to the tables.
Dimension tables DL2, DL6, DD2, DD6, DC2, and DC6
The dimension tables are only used in the use cases of the join (and multijoin) operations where dimension tables and fact tables are joined or the dimension tables are joined between themselves. These tables have fixed data volume.
4.2. Data volume
The following statistics are the numbers of records in the testing tables, not the numbers of bytes that they occupy.
Table  Big single task  Small concurrent tasks 
T1  150M  15M 
T11/T12/T13  —  15M 
T2  1180M  118M 
T21/T22/T23  —  118M 
DL2  100  100 
DL6  1M  1M 
DD2  100  100 
DD6  1M  1M 
DC2  100  100 
DC6  1M  1M 
Note: The bigtask test on a single machine is intended to measure the performance when data requires an amount of memory that is far larger than the physical memory. The standard used by the testing is that a fact table is approximately 5 times larger than the physical memory and a dimension table is far smaller than the physical memory. The actual number of records in a table can be adjusted according to the machine configuration.
The test of small concurrent tasks on a single machine is intended to measure the performance when the amount of memory occupied by data is less than the physical memory, and the computation could be done completely in the memory. The standard used by the testing is that a fact table is approximately 50% of the size of the physical memory and a dimension table is far smaller than the physical memory. The actual number of records in a table can be adjusted according to the machine configuration. T11/T12/T13 and T21/T22/T23 are only used for the test of the concurrent computation of small tasks.
To make sure that esProc and Oracle are computing exactly the same data, the data generated in Oracle is exported to a file with the esProcdefined binary format, which is esProc’s data source for the test.
5. Use Cases Description
For better understanding, all the logics in the test are described in SQL.
During the test Oracle executes SQL statements directly, whereas functionally equivalent esProc programs are written to complete the same computation.
5.1. Use Case for bigdata scan
This use case is about bigtask single machine test. It’s designed to test the computational performance of full table scan for a large volume of data. Performances for full table counting, integer sum, float sum, value sum, integer filter, number filter, character filter, date filter are considered respectively.
Category  Use Case Number  SQL Description 
Full Table Counting  101  select count(*) from T1 
102  select count(*) from T2  
Integer sum  103  select sum(TL11) from T1 
104  select sum(TL11),sum(TL12),sum(TL13),sum(TL14),sum(TL15), sum(TL16), sum(TL17), sum(TL18), sum(TL19) from T1  
105  select sum(TL2) from T2  
106  select sum(L11),sum(FL6),sum(to_number(FD6)),sum(TL1),sum(TL2), sum(TL1*TL2) from T2  
Float Sum  107  select sum(TN21) from T1 
108  select sum(TN21), sum(TN22), sum(TN23), sum(TN24), sum(TN25), sum(TN26), sum(TN27), sum(TN28), sum(TN29) from T1  
109  select sum(TN3) from T2  
110  select sum(TN3*0.1), sum(TN3*0.5), sum(TN3), sum(TN3*1.5), sum(TN3*2), sum(TN3*10), sum(TN3*100), sum(TN3*1000), sum(TN3*10000), sum(TN3*100000) fromT2  
Value Sum  111  select sum(TN11) from T1 
112  select sum(TN11), sum(TN12), sum(TN13), sum(TN14), sum(TN15), sum(TN16), sum(TN17), sum(TN18), sum(TN19) from T1  
113  select sum(TN2) from T2  
114  select sum(TN1), sum(TN2), sum(TL1*TN1), sum(TL1*TN2), sum(TL2*TN1), sum(TL2*TN2) from T2  
Integer Filter  115  select count(*) from T1 where FL6 between 100000 and 599999 
116  select sum(TL11),sum(TL12),sum(TL13),sum(TL14),sum(TL15), sum(TL16), sum(TL17), sum(TL18), sum(TL19) from T1 where FL6 between 100000 and 599999  
117  select count(*) from T2 where FL6 between 100000 and 599999  
118  select sum(L11),sum(FL6),sum(to_number(FD6)),sum(TL1),sum(TL2), sum(TL1*TL2) from T2 where FL6 between 100000 and 599999  
Number Filter  119  select count(*) from T1 where FD6 between ‘100000’ and ‘599999’ 
120  select sum(TN11), sum(TN12), sum(TN13), sum(TN14), sum(TN15), sum(TN16), sum(TN17), sum(TN18), sum(TN19) from T1 where FD6 between ‘100000’ and‘599999’  
121  select count(*) from T2 where FD6 between ‘100000’ and ‘599999’  
122  select sum(TN1), sum(TN2), sum(TL1*TN1), sum(TL1*TN2), sum(TL2*TN1), sum(TL2*TN2) from T2 where FD6 between ‘100000’ and ‘599999’  
Character Filter  123  select count(*) from T1 where FC6 between ‘A00000’ and ‘E99999’ 
124  select sum(TN11), sum(TN12), sum(TN13), sum(TN14), sum(TN15), sum(TN16), sum(TN17), sum(TN18), sum(TN19) from T1 where FC6 between ‘A00000’ and‘E99999’  
125  select count(*) from T2 where FC6 between ‘A00000’ and ‘E99999’  
126  select sum(TN1), sum(TN2), sum(TL1*TN1), sum(TL1*TN2), sum(TL2*TN1), sum(TL2*TN2) from T2 where FC6 between ‘A00000’ and ‘E99999’  
Date Filter  127  select count(*) from T1 where R4 between to_date(‘19800101’, ‘yyyymmdd’) and to_date(‘19800101’,’yyyymmdd’) + 5000 
128  select sum(TN11), sum(TN12), sum(TN13), sum(TN14), sum(TN15), sum(TN16), sum(TN17), sum(TN18), sum(TN19) from T1 where R4 between to_date(‘19800101’, ‘yyyymmdd’) and to_date (‘19800101’,’yyyymmdd’) + 5000  
129  select count(*) from T2 where R4 between to_date(‘19800101’, ‘yyyymmdd’) and to_date(‘19800101’,’yyyymmdd’) + 5000  
130  select sum(TN1), sum(TN2), sum(TL1*TN1), sum(TL1*TN2), sum(TL2*TN1), sum(TL2*TN2) from T2 where R4 between to_date(‘19800101’, ‘yyyymmdd’) and to_date (‘19800101’, ’yyyymmdd’) + 5000 
5.2. Use Case for bigdata grouping with small result set
This use case is about bigtask single machine test. It’s designed to test the performance of grouping computation involving large data volume, where the number of records in each group is small, smaller than the physical memory. Performances for integer grouping, number grouping, character grouping and date grouping are considered respectively.
Category  Use Case Number  SQL Description 
Integer Grouping  201  select FL6,count(*) from T1 group by FL6 
202  select FL6, sum(TL11),sum(TL12),sum(TL13),sum(TL14),sum(TL15), sum(TL16), sum(TL17), sum(TL18), sum(TL19) from T1 group by FL6  
203  select FL6,count(*) from T2 group by FL6  
204  select FL6, sum(L11),sum(FL6),sum(to_number(FD6)), sum(TL1), sum(TL2), sum(TL1*TL2) from T2 group by FL6  
Number Grouping  205  select FD6, count(*) from T1 group by FD6 
206  select FD6, sum(TN11), sum(TN12), sum(TN13), sum(TN14), sum(TN15), sum(TN16), sum(TN17), sum(TN18), sum(TN19) from T1 group by FD6  
207  select FD6, count(*) from T2 group by FD6  
208  select FD6, sum(TN1), sum(TN2), sum(TL1*TN1), sum(TL1*TN2), sum(TL2*TN1), sum(TL2*TN2) from T2 group by FD6  
Character Grouping  209  select FC6, count(*) from T1 group by FC6 
210  select FC6, sum(TN11), sum(TN12), sum(TN13), sum(TN14), sum(TN15), sum(TN16), sum(TN17), sum(TN18), sum(TN19) from T1 group by FC6  
211  select FC6, count(*) from T2 group by FC6  
212  select FC6, sum(TN1), sum(TN2), sum(TL1*TN1), sum(TL1*TN2), sum(TL2*TN1), sum(TL2*TN2) from T2 group by FC6  
Date Grouping  213  select R2,R4, count(*) from T1 group by R2,R4 
214  select R2,R4, sum(TN11), sum(TN12), sum(TN13), sum(TN14), sum(TN15), sum(TN16), sum(TN17), sum(TN18), sum(TN19) from T1 group by R2,R4  
215  select R2,R4,count(*) from T2 group by R2,R4  
216  select R2,R4, sum(TN1), sum(TN2), sum(TL1*TN1), sum(TL1*TN2), sum(TL2*TN1), sum(TL2*TN2) from T2 group by R2,R4 
5.3. Use Case for bigdata grouping with large result set
This use case is about bigtask single machine test. It’s designed to test the performance of grouping computation involving large data volume, where the number of records in each group is large, larger than the physical memory, and thus the computation cannot be entirely completed within the memory.
Category  Use Case Number  SQL Description 
Integer Grouping  301  select mod(FL11,10000000), count(*) from T1 group by mod(FL11,10000000) 
302  select mod(FL11,10000000), sum(TL11),sum(TL12),sum(TL13), sum(TL14), sum(TL15), sum(TL16), sum(TL17), sum(TL18), sum(TL19) from T1 group by mod(FL11,10000000) 
5.4. Use Case for bigdata joins
This use case is about bigtask single machine test. It’s designed to test the performance of join computations involving large data volume. Performances for integer, number, character joins, as well as singlelevel and multilevel joins, are considered respectively.
Category  Use Case Number  SQL Description 
Integer Join  401  select count(*) from T1 join DL6 on T1.FL6=DL6.P 
402  select sum(T1.TL11),sum(T1.TL12), sum(T1.TL13), sum(T1.TL14), sum(T1.TL15), sum(T1.TL16), sum(T1.TL17), sum(T1.TL18), sum(T1.TL19), sum(DL6.L2) from T1 join DL6 on T1.FL6=DL6.P 

403  select count(*) from T2 join DL6 on T2.FL6=DL6.P  
404  select sum(T2.L11), sum(T2.FL6), sum(to_number(T2.FD6)), sum(T2.TL1),sum(T2.TL2), sum(T2.TL1*T2.TL2), sum(DL6.L2) from T2 join DL6 on T2.FL6=DL6.P  
Integer Multilevel Join  405  select count(*) from T1 join DL6 on T1.FL6=DL6.P join DL2 on DL6.FL2=DL2.P 
406  select sum(T1.TL11),sum(T1.TL12), sum(T1.TL13), sum(T1.TL14), sum(T1.TL15), sum(T1.TL16), sum(T1.TL17), sum(T1.TL18), sum(T1.TL19), sum(DL6.L2), sum (DL2.L2) from T1 join DL6 on T1.FL6=DL6.P join DL2 on DL6.FL2=DL2.P 

407  select count(*) from T2 join DL6 on T2.FL6=DL6.P join DL2 on DL6.FL2=DL2.P  
408  select sum(T2.L11), sum(T2.FL6), sum(to_number(T2.FD6)), sum(T2.TL1),sum(T2.TL2), sum(T2.TL1*T2.TL2), sum(DL6.L2), sum(DL2.L2) from T2 join DL6 on T2.FL6=DL6.P join DL2 on DL6.FL2=DL2.P 

Number Join  409  select count(*) from T1 join DD6 on T1.FD6=DD6.P 
410  select sum(T1.N11), sum(T1.TN12), sum(T1.TN13), sum(T1.TN14), sum(T1.TN15), sum(T1.TN16), sum(T1.TN17), sum(T1.TN18), sum(T1.TN19), sum(DD6.N5) from T1 join DD6 on T1.FD6=DD6.P 

411  select sum(T2.TL2*DD6.L2) from T2 join DD6 on T1.FD6=DD6.P  
412  select sum(T2.TN1), sum(T2.TN2), sum(T2.TL1*T2.TN1), sum(T2.TL1*T2.TN2), sum(T2.TL2*T2.TN1), sum(T2.TL2*T2.TN2), sum(DD6.N5) from T2 join DL6 on T2.FL6=DL6.P 

Number Multilevel Join  413  select count(*) from T1 join DD6 on T1.FD6=DD6.P join DD2 on DD6.FD2=DD2.P 
414  select sum(T1.N11), sum(T1.TN12), sum(T1.TN13), sum(T1.TN14), sum(T1.TN15), sum(T1.TN16), sum(T1.TN17), sum(T1.TN18), sum(T1.TN19), sum(DD6.N5), sum (DD2.N5) from T1 join DD6 on T1.FD6=DD6.P join DD2 on DD6.FD2=DD2.P 

415  select count(*) from T2 join DD6 on T2.FD6=DD6.P join DD2 on DD6.FD2=DD2.P  
416  select sum(T2.TN1), sum(T2.TN2), sum(T2.TL1*T2.TN1), sum(T2.TL1*T2.TN2), sum(T2.TL2*T2.TN1), sum(T2.TL2*T2.TN2), sum(DD6.N5), sum(DD2.N5) from T2 join DD6 on T2.FD6=DD6.P join DD2 on DD6.FD2=DD2.P 

Character Join  417  select count(*) from T1 join DC6 on T1.FC6=DC6.P 
418  select sum(T1.N11), sum(T1.TN12), sum(T1.TN13), sum(T1.TN14), sum(T1.TN15), sum(T1.TN16), sum(T1.TN17), sum(T1.TN18), sum(T1.TN19), sum(DC6.N5) from T1 join DC6 on T1.FC6=DC6.P 

419  select count(*) from T2 join DC6 on T2.FC6=DC6.P  
420  select sum(T2.TN1), sum(T2.TN2), sum(T2.TL1*T2.TN1), sum(T2.TL1*T2.TN2), sum(T2.TL2*T2.TN1), sum(T2.TL2*T2.TN2), sum(DC6.N5) from T2 join DC6 on T2.FC6=DC6.P 

Character Multilevel Join  421  select count(*) from T1 join DC6 on T1.FC6=DC6.P join DC2 on DC6.FC2=DC2.P 
422  select sum(T1.N11), sum(T1.TN12), sum(T1.TN13), sum(T1.TN14), sum(T1.TN15), sum(T1.TN16), sum(T1.TN17), sum(T1.TN18), sum(T1.TN19), sum(DC6.N5), sum (DC2.N5) from T1 join DC6 on T1.FC6=DC6.P join DC2 on DC6.FC2=DC2.P 

423  select count(*) from T2 join DC6 on T2.FC6=DC6.P join DC2 on DC6.FC2=DC2.P  
424  select sum(T2.TN1), sum(T2.TN2), sum(T2.TL1*T2.TN1), sum(T2.TL1*T2.TN2), sum(T2.TL2*T2.TN1), sum(T2.TL2*T2.TN2), sum(DC6.N5), sum(DC2.N5) from T2 join DC6 on T2.FC6=DC6.P join DC2 on DC6.FC2=DC2.P 
5.5. Use cases for performing small concurrent tasks
These use cases test the performance of carrying out small concurrent tasks that access various physical tables or files to avoid OS cache. Tests are done for three types operations – data scanning, data grouping and joins – using the above use cases 5.1, 5.2 and 5.4 where the data volume is set as small (See the section of Data Volume).
Scenarios of single task, dualtask concurrency, and fourtask concurrency are tested respectively.
6. Use Cases for Testing
6.1. Small concurrent scans
This use case tests Oracle and esProc for scanning performance against small tables (files). The test is done with a multitask concurrent access mode. Each task is accessing a different table (file). Oracle executes 16threads parallel processing, while esProc executes 4 tasks in parallel, because it is proved through the test that those are the very number of parallel tasks for highest performance.
Testing statistics:
Note: In the following table, the time unit is second. 2 Concurrencies indicates 2 SQL statements are executed simultaneously; and the same with 4 concurrencies.
Statistics interpretation:
1. Data with blue background is the peak values obtained by this use case. Both Oracle and esProc reach peak performance with 1 concurrency, but Oracle’s performance is several times higher than esProc.
2. Performance degrades significantly both for Oracle and esProc when the number of concurrencies increases to 2. Oracle still has advantage but much of the gap has been closed.
3. esProc handles each task with equal performance, while Oracle is extremely unstable.
6.2. Bigdata scan
This use case tests Oracle and esProc for their performance in scanning big tables (files). Scenarios of 1, 2 and 4 parallel tasks are tested respectively.
Testing statistics:
Category  Use Case Number  Oracle  esProc  
1 Parallel  2 Parallels  1 Parallel  2 Parallels  4 Parallels  
Full Table Counting  101  280  329  243  235  272 
102  311  383  271  214  246  
Integer sum  103  288  328  243  241  268 
104  324  385  289  210  197  
105  375  412  298  220  246  
106  928  1011  725  416  210  
Float Sum  107  321  350  247  247  267 
108  389  424  297  206  200  
109  399  436  303  221  248  
110  1623  1816  885  492  321  
Value Sum  111  313  347  260  248  270 
112  423  450  483  251  197  
113  428  463  457  243  248  
114  1332  1420  2148  1188  549  
Integer Filter  115  277  322  256  251  266 
116  318  345  302  209  199  
117  335  364  376  231  245  
118  676  709  675  362  188  
Number Filter  119  267  331  262  234  268 
120  350  377  450  260  201  
121  338  367  433  253  246  
122  830  909  1540  770  390  
Character Filter  123  313  322  264  236  268 
124  361  388  486  236  201  
125  339  370  422  249  246  
126  887  973  1642  840  419  
Date Filter  127  274  320  265  230  267 
128  273  321  363  209  219  
129  346  376  1413  711  378  
130  801  912  2482  1322  647 
Statistics interpretation:
1. Oracle is at peak performance with 1 thread, but performance starts to degrade with 2 parallels. However, esProc’s performance with 4 parallel tasks is generally higher than with 1 task, sometimes even several times higher, excepting for a few instances.
2. esProc demonstrates an overall advantage over Oracle in this use case test. This is particularly obvious with narrow tables and computationintensive cases, such as 106, 110 and 114. Use cases 129 and 119 are exceptions, however, where Oracle performs slightly better than esProc.
3. esProc is observed to be capable of increasing the computational performance several times with the increase of parallel number, as in the use cases 106, 110, 114, 118, 122, 126 and 130. These are all about narrow table access, and with more computations.
4. In some cases esProc’s performance also degrade, as in the use cases 103, 107, 111, 115, 119, 119, 123 and 127. These are all about wide table access with fewer computations.
6.3. Small concurrent grouping tasks
This use case tests Oracle and esProc for grouping computation performance against small tables (files). The test is done in a multitask concurrent access mode. Each task accesses different tables (files).
Testing statistics:
Note: According to the data volume, esProc can choose to do pure inmemory computation or mixedmodeofinmemoryandexternalmemory computation. This use case uses inmemory computation, while the grouping use case with large data volume in later part of this report is handled with a mixed mode. Pure inmemory computation has the risk of memory overflow. Thus use case uses 1, 2 and 4 concurrencies to avoid that. Data in green is for 1 concurrency, data in red for 2 concurrencies and data in black for 4 concurrencies. Green character reflects parallel 1, red character is for parallel 2, and black for parallel 4. Oracle forces a mixed mode computation to avoid memory overflow. Each task is performed with a fixed number of 16 parallel threads to achieve the best performance.
Statistics interpretation:
1. Oracle performs better than esProc in general, especially with 1 concurrency when the performance of Oracle can be several times better.
2. Oracle is extremely unstable in the performances of concurrent tasks, usually several times faster or slower between one and another. esProc’s performance, however, is generally stable.
3. All date grouping operations are twolevel grouping, which, as we can see, Oracle performs better.
6.4. Bigdata grouping
This use case tests Oracle and esProc for grouping computation performance against large tables (files). The test is done in a nonconcurrent mode using parallel threads of 1, 2 and 4.
Testing statistics:
Category  Use Case Number  Oracle  esProc  
1 Parallel  2 Parallels  1 Parallel  2 Parallels  4 Parallels  
Integer Filter Grouping  201  365  399  401  252  285 
202  769  673  542  302  255  
203  930  1042  1018  567  342  
204  2660  1712  1867  1115  607  
Number Grouping  205  368  408  387  252  286 
206  1028  1247  1124  690  486  
207  930  1041  1340  759  371  
208  4058  2781  4352  2459  1493  
Character Grouping  209  365  409  374  258  293 
210  1012  1184  1017  651  488  
211  906  1032  1312  710  391  
212  3579  2427  4409  2488  1474  
Date Grouping  213  304  332  278  259  271 
214  454  478  509  273  203  
215  1042  1164  1318  756  426  
216  4730  3224  4910  2542  1481 
Statistics interpretation:
1. Oracle is at peak performance with 1 parallel, and performance starts to decrease with 2. However, esProc’s performance with 4 parallel tasks is normally higher than with 1 task, even several times higher in some cases.
2. esProc’s performance is several times higher than Oracle in this use case test.
6.5. Small concurrent joins
This use case tests Oracle and esProc for performance of joining small (files). The test is done in a multitask concurrent access mode. Each task accesses different tables (files). Parallel processing won’t be used for Oracle because no performance improvement can be achieved with it. esProc uses 4thread parallel processing because more threads won’t increase the performance.
Testing statistics:
Statistics interpretation:
1. esProc is one time or several times faster than Oracle.
2. esProc’s performance variation among concurrent tasks is very small. Oracle’s performance variation remains big, but less than the variations in other use cases.
3. Parallel mode yields no performance gain for Oracle, while esProc reaches peak performance with 4 parallel tasks.
6.6 Bigdata joins
This use case tests Oracle and esProc for their performance in joining large tables (files) . The test is done in a singletask, nonconcurrent mode. Oracle reaches peak performance with 1 parallel task, while esProc reaches its own with 4 parallel tasks.
Testing statistics:
Category  Use Case Number  Oracle  esProc  
1 Parallel  2 Parallels  2 Parallels  4 Parallels  
Integer Join  401  347  383  248  271 
402  446  469  210  204  
403  837  910  505  271  
404  1555  1709  810  402  
Integer Multilevel Join  405  352  377  247  270 
406  453  507  213  205  
407  829  945  504  257  
408  1587  1810  847  412  
Number Join  409  353  384  245  275 
410  509  549  324  199  
411  1077  1152  840  451  
412  1767  1926  1704/1792  903/889  
Number Multilevel Join  413  354  385  243  275 
414  521  557  331  198  
415  863  946  827  450  
416  1830  2024  1825  943  
Character Join  417  354  385  246  268 
418  509  554  322  198  
419  868  944  513  278  
420  1757  1932  1689  909  
Character Multilevel Join  421  356  388  251  277 
422  523  549  320  201  
423  863  957  513  285  
424  1827  2027  1776  905 
Statistics interpretation:
1. esProc performs better than Oracle.
2. With multilevel join, performance degradation for both is little.
6.7. Bigdata grouping with big result set
This use case tests Oracle and esProc for grouping computation performance against large tables (files), with the grouping results are too big to be all stored in memory. The test is done in a singletask nonconcurrent mode using 1, 2 and 4 parallel threads respectively.
Testing statistics:
Category  Use Case Number  Oracle  esProc  
1 Parallel  2 Parallels  4 Parallels  1 Parallel  2 Parallels  4 Parallels  
Integer Filter Grouping  301  585  570  667  649  419  385 
302  1059  1010  1098  1067  706  656 
Statistics interpretation:
1. esProc performs better than Oracle.
2. Oracle is at peak performance with 2 parallel tasks. esProc is at peak performance with 4 parallel tasks.
7. Summary Analysis
From all use case tests we can generally reach the following conclusions about the testing statistics:
1. Oracle normally performs better than esProc with small concurrent tasks, even several times higher in some cases. But, there’s the exception of joins where esProc’s performance surpasses Oracle’s.
2. For bigdata, singletask computation, esProc’s performance is always higher than Oracle’s, several times higher sometimes.
3. Performance degrades significantly both for Oracle and esProc when concurrent number changes from 1 to 2. Oracle still has advantage but the difference between the two is small enough.
4. For small concurrent tasks, esProc shows stable performance with each task completed using approximately the same time, while Oracle demonstrates great variation among different tasks (normally the variation is up to ten times). However, there are exceptions. Oracle has a very small performance variation among different tasks during join operations, and parallel processing won’t improve the performance in such situations.
Causes of the above conclusions could be as follows:
1. esProc is JAVAcoded and interpreted for execution, and thus has a lower efficiency than Oracle, which is coded in native C language. The latter has a sophisticated multitier cache mechanism which enables a much higher performance in inmemory computations. As small concurrent computations can normally be cached, Oracle generally performs better than esProc in such use cases.
2. For bigdata, singletask computation, esProc’s performance is always higher than Oracle. This is because a large volume of data cannot be cached, and Oracle’s main method for performance improvement is inapplicable. But in such a situation, esProc has an obvious advantage with its more efficient functionbased algorithm.
3. With the concurrent computation with small data volume, overall performance of both degrades significantly with the parallel number changed from 1 to 2, and then to 4. This is a signal of incompetence of the processing power of CPU. Under this circumstance, Oracle still has the advantage, but only slightly. This is because Oracle doesn’t have the time to read the cache, and the cache mechanism almost stops working though the only advantage of native code remains. From this we can see that, basically, the performanceenhancing method for Oracle is effectively leveraging cache.
4. Since Oracle’s cache mechanism and execution plan cannot be managed manually, thus with small concurrent tasks Oracle has a significant performance variation, which sometimes is up to ten times.
5. Smalldata joins are an exception. In those cases, esProc’s performance is one or several times higher than Oracle’s and the Oracle performance variation among concurrent tasks is small with parallel processing being invalid. Probably this is caused by the same reason that Oracle’s cache mechanism and execution plan cannot be managed manually, making a join automatically treated as an externalmemory computation. The externalmemory computation involves competitions for hard disk resources, which leads to the lack of performance improvement with parallel processing. Parallel processing is effective in enhancing esProc’s performance because esProc can load the small dimension tables into the memory, and thus leverage the multicore CPU for computation. The small variation among concurrent tasks is possibly also because that Oracle automatically treats a join as externalmemory computation, which does not require cache, and that resource is equally distributed among concurrent processes. With join computations, Oracle’s advantage of using cache disappears, but the performance enhancement with native code is limited, whereas esProc gains an advantage with efficient functionbased algorithm and thus achieves a higher performance.
Final conclusion: Oracle performs better in handling computational tasks with small data or with fewer computations and simple algorithms. esProc performs better in handling bigdata tasks, computationintensive tasks, or those with complicated logics.
Pleast Click here to download the entire document.