The More Convenient Data Computing Scripts I

Uncategorized 689 0

Database plays an irreplaceable role in the modern economy and is widely used in the business computing areas like Enterprise Resources Planning (ERP), Customer Relation Management (CRM), Supply Chain Management (SCM), and the Decision Support System (DSS).

Computation on the structured data in the database mainly relies on SQL (Structured Query Language). SQL is the powerful, simple-to-use, and widely-applied database computing script. However, it has some native drawbacks: non-stepwise computation, incomplete set-lization, and no object reference available. Although almost all vendors have introduced and launched some non-compatible solution, such as various stored procedure like PL-SQL®, T-SQL®. These improved alternatives cannot remedy the native SQL drawbacks.

esProc solves these drawbacks with more powerful computational capability, much lower technical requirement, and broader scope of application. It is a more convenient database computing scripts.

I. Step-by-step Computation
Case Description

A multinational retail enterprise needs to collect statistics on the newly opened retail store, including: How many new retail stores will open in this year? Of which how many companies have the sales over 1 million dollars? Among these companies with over-1-million sales, how many companies are abased overseas?

This question is progressive. The three questions are mutually related, the next question can be regarded as the further exploring on the current question, fit for step-by-step computation.

The original data is from the database of stores table with the main fields: storeCode, storeName, openedTime, profit, and nation. Let’s check the SQL solution first.

SQL Solution
To solve such problem with SQL, you will need to write 3 SQL statements as given below.

SELECT COUNT(*) FROM stores WHERE to_char (openedTime, ‘yyyy’) = to_char (sysdate,’yyyy’);
SELECT COUNT(*) FROM stores WHERE to_char (openedTime, ‘yyyy’) = to_char (sysdate,’yyyy’) and profit>1000000;
SELECT COUNT(*) FROM stores WHERE to_char (openedTime, ‘yyyy’) = to_char (sysdate,’yyyy’) and profit>1000000 and nation<>’local’;

SQL1:Get the result of question 1.

SQL2:Solve the problem 2.Because the step-by-step computation is impossible (that is, the results of previous computation cannot be utilized), you can only solve and take it as an individual problem.

SQL3: Solve the problem 3,and you are not allowed to compute in steps either.

esProc Solution

esproc_computing_script1

A1 cell: Get the records requested in problem 1.

A2 cell: Step-by-step computation. Operate on the basis of cell A1, and get the record meeting the conditions of problem 2.

A3 cell: Proceed with the step-by-step computation, and get the records requested in the problem 3.

B1, B2, and B3 cell: It is still the step-by-step computation. Count the corresponding records.

For the SQL, there are 3 associations for you to compute in steps, and explore progressively. However, because step-by-step computation is hard to implement with SQL, this problem has to be divided into 3 individual problems.

esProc is to compute in steps following the natural habit of thinking: Decompose the general objective into several simple objective; Solve every small objective step by step; and ultimately complete the final objective.

In case that you proceed with the computation on the basis of the original 3 problems, for example, seek “proportion of problem 3 taken in the problem 2”, or “on” problem 3, group by country”. As for esProc users, they can simply write ”=A3/A2”, and ”A3.group(nation)”. In each step, there is a brief and clear expression of highly readable, without any requirements on a strong technical background. By comparison, SQL requires redesigning the statement. The redesigned statement will undoubtedly become more and more complex and longer. Such job can only be left to those who have the advanced technical ability in SQL.

esProc can decompose the complex problem into simple computation procedure based on the descriptions from the business perceptive. This is just the advantage of the step-by-step computation. By comparison, SQL does not allow for computation by step or problem decomposition, and thus it is against the scientific methodology, and not fit for the complex computation.

II.Complete Set-lization
Case Description
A certain advertisement agency needs to compute the clients whose annual sales values are among the top 10.
The data are from the sales table, which records the annual sales value of each client with the fields like customer, time, and amount.

SQL solution
SELECT customer
FROM (
SELECT customer
FROM (
SELECT customer,RANK() OVER(PARTITION BY time ORDER BY amount DESC) rankorder
FROM sales )
WHERE rankorder<=10) GROUP BY customer HAVING COUNT(*)=(SELECT COUNT(DISTINCT time) FROM sales)

Such Problem requires ranking the sets of a set, that is, group by “time” and then rank by “customer” in the group. Since the popular SQL-92 syntax is still hard to represent this, the SQL-2003 standard, which is gradually supported by several vendors, will be used to solve this problem barely.

Just a tip to compute the customer intersections in the last step, the count of years equals to the count of clients.

esProc Solution

esproc_computing_script2

A1: Group the original dataset by year so that A1 will become a set of sets.

B1: Get the serial number of records whose sales values are among the top 10 of each group. The rank() is used to rank in every group, and pselect() can be used to retrieve the serial number on conditions. ~ is used to represent every member in the set. B1 is the “set of set”.

A2: Retrieve the record from A1 according to the serial number stored in B2, and get the customer field of the record.

A3: Compute the intersection of sets.

The SQL set-lization is incomplete and can only be used to represent the simple result set. Developers cannot use SQL to represent the concept of “set of set”. Only the queries of 3-level-nested-loops are available to barely perform the similar computations. In addition, SQL cannot be used to perform the intersection operation easily that developers with advanced techniques can only resort to the unreadable statements to perform the similar operations, such as “count of years equal to the count of clients”. It equals to compute the intersection of client sets.
The set is the base of massive data. esProc can achieve set-lization completely, represent the set, member, and other related generic or object reference conveniently, and perform the set operations easily, such as intersection, complement, and union.

When analyzing the set-related data, esProc can greatly reduce the computation complexity. By taking the advantage of set, esProc can solve many problems agilely and easily that are hard to solve with SQL.

The More Convenient Data Computing Scripts II

FAVOR (0)
Leave a Reply
Cancel
Icon

Hi,You need to fill in the Username and Email!

  • Username (*)
  • Email (*)
  • Website