esProc Program: Sequence and Table Sequence(I)

Sequence, table sequence and record sequence are commonest data types of esProc. This article tries to expound their respective characteristics as well as relations between them.

1. A sequence is an ordered generic set

1.1 Collectivity

A sequence consists of multiple data, which is called members of the sequence. The members can be any type of data, such as string, integer, float and date, or empty. A sequence has the general characteristics of a set, and can perform set operations. For example:

	A
1	[]
2	[5,6,7]
3	[red,blue,yellow]
4	=[“blue”,”yellow”,”white”]
5	=A3^A4

In the above cellset, the values of A1, A2 and A3 are as follows:

esProc_program_seq&Tseq1_2

The contents in A1, A2 and A3 are sequences. There is an empty sequence in A1. In A2, the members of the sequence are intergers, thus it is also called an integer sequence. The sequence in A3 has string members; it is also the value of the cell.

In A4, an expression is used to compute a sequence. Different from the constant sequence in A3, the strings in the expression should be placed in double quotes. A5 computes the intersection of the sequence in A3 and the one in A4, with a result being a sequence too. The results in A4 and A5 are as follows:

esProc_program_seq&Tseq1_3

1.2 Genericity

A sequence is a generic set, which is allowed to have members of various data types. A member can also be a sequence. For example:

	A	B
1	[]	=[1,date(“2014-05-01”)]
2	[5,6,7]	=[“blue”,[],[5,6,7]]
3	[red,blue,yellow]	=[“blue”,A1,A2]
4	=[“blue”,”yellow”,”white”]
5	=A3^A4

In the cellset, members of the sequence in B1 are an integer and a date; members of the sequence in B2, which equals the expression in B3, contain a sequence. The results in B1, B2 and B3 are as follows:

esProc_program_seq&Tseq1_5

You can click the sequence members, which are displayed in blue, and see the details.

1.3. Orderliness

Generally, a set is unordered, that is, two sets with same members of different order are equal. While a sequence is ordered, which means two sequences with same members of different order are not equal. For example:

	A	B
1	[Mike,Tom]	[Tom,Mike]
2	=A1==B1

In A2, expression =A1==B1 is used to judge if the two sequences are equal. The result is false:

esProc_program_seq&Tseq1_7

Orderliness is a common feature of business data. For example, that Mike comes before Tom may mean that Mike has done a better job in school study; sorting monthly sales can clearly present its changing rule. It is more convenient to use a sequence to perform ordered computation. For example:

	A
1	[Mike,Tom]
2	=A1(2)
3	=A1.m(-1)
4	=A1.pos(“Tom”)
5	=A1.rvs()

A2 fetches the second member of the sequence. The operation can also be expressed as =A1.m(2). A3 fetches the last member of the sequence. A4 gets the sequence number of the member Tom from the sequence. A5 reverses the sequence. The results in A2, A3, A4 and A5 are as follows:

esProc_program_seq&Tseq1_9

In addition, there are operations like insert, delete, modify, copy, compare, aggregate, sub-sequence, sort, rank, sets computation, mutual transformation of strings and sequences, etc.

An integer sequence is a sequence whose members are integers. It has a more detailed access method. For example:

	A
1	=to(2,5)

The sequence generated in A1 is as follows:

esProc_program_seq&Tseq1_11

If the sequence begins with 1, such as =to(1,5), it can be abbreviated to =to(5).

2. A table sequence is a structured sequence

esProc inherits the data table concept of relational database, and defines it as table sequence. Consistent with the concept of relational database, every table sequence also has its own data structure, which consists of several fields. Members of a table sequence are called as records.

2.1 Structured two-dimensional data objects

Members of a sequence can be any types of data, such as ordinary types, sequences or records; while members of a table sequence must be records of the same structure. For example, the data objects in the following figure constitute a table sequence:

esProc_program_seq&Tseq1_12

Because a table sequence is a structured two-dimensional data object, it is usually created from SQL, text files, binary files, Excel files or an empty table sequence. A1, B1 and C1 in the following are table sequences:

	A
1	=file(“Order_Books.txt”).import@t()
2	=demo.query(“select * from CITIES”)
3	=create(OrderID,Client,SellerId,Amount,OrderDate)

A1 creates a table sequence, which is the one in the above, from a text file. B1 creates a table sequence using SQL, and C1 creates an empty table sequence by specifying the field names. The data in B1 and C1 is as follows:

esProc_program_seq&Tseq1_14

A great deal of structured data operations, such as query, sort, sum, average value, merging repeated records, can be performed with a table sequence. For example:

	A
1	=file(“Order_Books.txt”).import@t()
2	=A1.select(Amount>=20000 && month(Date)==5)
3	=A1.sort(SalesID,Date:-1)
4	=A1.groups(SalesID, month(Date); sum(Amount), count(~))

A2 selects records whose Amount is greater than or equal to 20,000 and Date is the month of May:

esProc_program_seq&Tseq1_16

A3 sorts the records by SalesID in ascending order. Records with the same SalesID will be sorted by Date in descending order:

esProc_program_seq&Tseq1_17

A4 sums up Amount of each group of data and count up orders of each group according to SalesID and the month:

esProc_program_seq&Tseq1_18

2.2 A table sequence is a special sequence

A table sequence is still a sequence. The collectivity, orderliness and functions of a sequence apply in a table sequence. A table sequence hasn’t the feature of genericity in the sense of a sequence because its members must be records of the same structure. But, the field values of the records can be generic data, which in this sense is another form of genericity. Thanks to these features, a table sequence is better at handling complicated computation than traditional programming language does.

For example, based on orderliness, we can compute the growth rate of each month compared with the previous one with a table sequence. The approach is as follows:

	A
1	=file(“Order_Books.txt”).import@t()
2	=A1.groups(month(Date); sum(decimal(string(Amount))):TotalAmount, count(~):Count)
3	=A2.derive(round(TotalAmount/TotalAmount[-1]-1,4): compValue)

A2 computes the total sales of each month:

esProc_program_seq&Tseq1_20

A3 computes the final result:

esProc_program_seq&Tseq1_21

Another instance based on collectivity. Assume that a big contract in business is the one whose quantity ordered is greater than 1,000 and an important contract is the one in which each amount of order is more than 10,000. Please select: 1. Contracts that are both big and important; 2. The other contracts:

	A	B
1	=file(“Order_Books.txt”).import@t()
2	Big	=A1.select(Quantity>1000)
3	Importance	=A1.select(Amount>10000)
4	=B2^B3	=A1\A4

B2 selects the big contracts, and then B3 selects the important ones. A4 computes the common members of the two kinds of contracts, which are the answer to question 1:

esProc_program_seq&Tseq1_23

B4 answers question 2 by computing the complement of all orders and the results of question 1:

esProc_program_seq&Tseq1_24

Note: In the above code, A1 contains a table sequence, while, in B2, B3, A4 and B4, there are all record sequences originating from table sequences. The difference and relation between a table sequence and a record sequence will be explained in the following.