Related Computing in esProc– Foreign Key and Relationship between Tables

In relational databases, a foreign key is often used to identify the relationship between tables. In esProc, this relationship of correspondence can be expressed by the foreign key field.

1. Computed columns and foreign key fields

T.derive() function is used in esProc to add computed columns to a table sequence, for example:

	A
1	=demo.query(“select EID, NAME, SURNAME, BIRTHDAY from EMPLOYEE”)
2	=A1.derive(NAME+” “+SURNAME: FULLNAME, age(BIRTHDAY):AGE)

A1 gets employee information. A2 adds computed columns – FULLNAME and AGE -to A1’s table sequence, returns a new table sequence, and computes the employees’ full names and ages.

The following is the table sequence in A1:

esProc_related_foreignkey_2

After the computed columns are added, A2 gets a new table sequence as follows:

esProc_related_foreignkey_3

Having known about how to add computed columns to a table sequence, we’ll explore the link between the computed columns and the relationship between tables.

Usually, there are relationships between tables in a database. In esProc the relationship is expressed by using the records references to represent data of a table sequence, which makes the data search and presentation simpler and the data structure clearer.

Using T.derive() function, we can add one or more computed columns to a table sequence so that the records or the record sequence in another table can be referenced in the data type of these columns, thus a single or more foreign key fields are created to identify the relationship between the two tables. For example:

	A
1	=demo.query(“select * from CITIES”)
2	=demo.query(“select STATEID, NAME, ABBR, CAPITAL from STATES”)
3	=A1.derive(A2.select@1(STATEID==A1.STATEID):State)
4	=A1.derive(State.ABBR:SA)
5	=A2.derive(A1.select(STATEID==A2.STATEID):Cities)

A1 and A2 retrieve the data from database tables – STATES and CITIES -respectively:

esProc_related_foreignkey_5

CITIES is related to STATES through STATEID field. This kind of storage pattern in databases can keep data consistency, make data easy to maintain and, at the same time, save storage spaces.

A3 adds State field to CITIES to create a foreign key for storing the records of states in which cities located. A4 adds SA field to CITIES to list abbreviations of these states, for the convenience of being looked up. A5 adds Cities field to STATES as a foreign key for storing records of cities of every state.

Execute the program and the data of A4 are as follows:

esProc_related_foreignkey_6

In the above figure, State field contains records of states information. You can double-click it to see details.

Data of A5 are as follows:

esProc_related_foreignkey_7

Cities field contains records of cities of each state, double-click it to see details.

It can be seen that, through foreign keys, data of both A4 and A5 have obtained all information of the two tables: STATES and CITIES, in the original database, thus a relationship between both tables has been established. Note that the data types of these foreign key fields are different. The data of foreign key State in A4 are records while those of the foreign key Cities in A5 are record sequences, i.e. a sequence of records.

Actually the STATEID field in table STATES is usually the primary key. switch function can be used to relate a second table, like CITIES, to STATES through the primary key. For the use of the function, please refer to Related computing in esProc – Primary Key and Index Function.

2. The usage of foreign key fields

During querying or presenting a table sequence, the usage of foreign key fields is the same as that of common fields. But we should be careful towards the data types while using them.

For example, select from CITIES the states whose names contain “la”. Here you can directly call the NAME field of the foreign key State in the filtering expression to filter data:

	A
1	=demo.query(“select * from CITIES”)
2	=demo.query(“select STATEID, NAME, ABBR, CAPITAL from STATES”)
3	=A1.derive(A2.select@1(STATEID==A1.STATEID):State)
4	=A1.derive(State.ABBR:SA)
5	=A2.derive(A1.select(STATEID==A2.STATEID):Cities)
6	=A4.select(like(State.NAME,”la”))

Note that here the data type of the foreign key field is records. Computed results of A6 are as follows:

esProc_related_foreignkey_9

Foreign key fields can also be used as sorting criterion. For example, sort the states data in descending order according to the number of cities of the current state:

	A
1	=demo.query(“select * from CITIES”)
2	=demo.query(“select STATEID, NAME, ABBR, CAPITAL from STATES”)
3	=A1.derive(A2.select@1(STATEID==A1.STATEID):State)
4	=A1.derive(State.ABBR:SA)
5	=A2.derive(A1.select(STATEID==A2.STATEID):Cities)
6	=A5.sort(Cities.len():-1)

Note that here the data type of the foreign key field is record sequence. Sorting results of A6 are as follows:

esProc_related_foreignkey_11

If the records referenced by a certain foreign key field of a table sequence still contain foreign key fields, these records can be referenced again. For example, list the cities that the states in which they are located happen to contain three cities:

	A
1	=demo.query(“select * from CITIES”)
2	=demo.query(“select STATEID, NAME, ABBR, CAPITAL from STATES”)
3	=A2.derive(A1.select(STATEID==A2.STATEID):Cities)
4	=A1.derive(A3.select@1(STATEID==A1.STATEID):State)
5	=A4.select(State.Cities.len()==3)

Here A4 uses states information in A3 containing foreign key fields, instead of the original states information while adding states information to the CITIES. In this way, results we need can be seen in A5:

esProc_related_foreignkey_13

3.Join operations

Another type of relationship between tables is the join operations in SQL. esProc’s join function can realize the similar operations. However, with the methods for referencing records and record sequences as well as the align function, and etc., the operations become less used.

3.1.Equal join

The foreign key matches a field in a table with the record references in a second one based on their equivalence, while the join realized with join function matches record references from two or more tables according to the rule of equivalence. For example:

	A
1	=demo.query(“select STATEID,NAME,POPULATION,ABBR from STATES”)
2	=demo.query(“select * from CITIES”)
3	=A1.select(NAME>”C”)
4	=A2.groups(STATEID:ID;count(~):Count)
5	=join(A3:StateInfo,STATEID;A4:CityCount,ID)
6	=join@1(A3:StateInfo,STATEID;A4:CityCount,ID)
7	=join@f(A3:StateInfo,STATEID;A4:CityCount,ID)

A3 selects the states whose spellings begin with letter C or letters after it:

esProc_related_foreignkey_15

A4 counts the number of the big cities in each state according to the data of CITIES – the table of big cities:

esProc_related_foreignkey_16

Note:A3 doesn’t include those states whose names begin with letter “A” and letter “B” and STATEID has eligible data only greater than or equal to 6. Not all states have the records of big cities, for in A4 there are no statistics of states whose IDs are 4, 7,8 and etc.

A5 joins the records in A3 and A4 together according to their respective STATEID. Result is as follows:

esProc_related_foreignkey_17

It can be seen that, by default, the result of join using join function only include the records of the two tables whose STATEIDs match. The result can be regarded as a table sequence composed of foreign keys fields, each of which is the record references.

Generally the operation of join function amounts to the following code in SQL:

– select A_i.* as F_i,… from A_i,… where A_i.x_j =… and …

That is the most frequently used code for join operations in SQL.

There are other types of joins. For example, join@1 in A6 represents left join. Please note the “1” in option @1 uses the number 1 instead of the lowercase letter l, which is not used in esProc in case confusion is caused. Result of A6 is as follows:

esProc_related_foreignkey_18

At this point, all the state information in A3 have got into the result, though there are no corresponding statistics for some of the states in A4, such as those states whose IDs are 7, 8 and etc.

The join@f in A7 represents the full join. Result of A7 is as follows:

esProc_related_foreignkey_19

Now all the state information of A3 and A4 has been returned to the result. Still, some of the states haven’t the corresponding statistics in A3, such as those states whose IDs are 1, 2, 3 and etc. Some cannot find the corresponding statistics in A4, such as the states with IDs being 7, 8, and etc.

3.2.Alignment Join

It is based on the equivalence that join function works to join record references of different table sequences. Besides, table sequences can be joined in alignment. In that case, pjoin– the function for joining records in alignment- will be used. For example:

	A
1	=demo.query(“select STATEID,NAME,POPULATION,ABBR from STATES order by STATEID”)
2	=demo.query(“select * from CITIES”)
3	=A2.groups(STATEID:ID;count(~):Count)
4	=pjoin(A1:StateInfo,A3:CityCount)

A3 counts the number of big cities in each state according to the data of CITIES – the table of big cities. Notice that it lacks the statistics of some states:

esProc_related_foreignkey_21

A4 joins all the state information in A1 with the statistical results in A3 in alignment, and gets the following result:

esProc_related_foreignkey_22

The number of records in the result of this join is determined by the table sequence involved in the operation with the shortest data length. This join merely joins the records of two table sequences according to their positions. If not all the records to be joined match, you’ll get a result with not all data being related correctly, like the above result, in which the data of the state of Arkansas, whose ID is 4, is joined mistakenly to the statistic of the city whose ID is 5.

In fact, pjoin is only used to join tables that have the same data length and order. For example:

	A
1	=demo.query(“select * from STATENAME”)
2	=demo.query(“select * from STATECAPITAL”)
3	=demo.query(“select * from STATEINFO”)
4	=pjoin(A1:Name,A2:Capital,A3:Info)

The data in A1, A2 and A3 are respectively a part of all the state information and have the same order:

esProc_related_foreignkey_24

In this case, they can be joined together using pjoin, like what A4 does:

esProc_related_foreignkey_25

Thus we’ve got a correct result. For the situation in the previous example, the records from the two tables can be joined in alignment using align function.

3.3. Cross join

A cross join is a JOIN operation that combines each member from the first sequence with each member from the second (or up to nth) sequence. It is the most basic JOIN operation. Without any filtering condition, it will list all possible combinations. xjoin function can be used to realize the cross join operation. For example:

	A
1	[1,2,3,4]
2	[a,b,c]
3	=xjoin(A1:Number;A2:Letter)

Result of A3 is as follows:

esProc_related_foreignkey_27

It can be seen that the records in the resulting table sequence are all the possible combinations of the numbers and the letters.

xjoin function can also be used to execute the cross join on records in a table sequence or a record sequence. The records can be filtered according to one or more specified conditions. For example:

	A
1	=demo.query(“select * from STATES”)
2	[M,F]
3	=demo.query(“select * from EMPLOYEE”)
4	=xjoin(A1:State;A2:Gender;A3:Employee,STATE==State.NAME&&GENDER==Gender)

A1 selects records of all the states. A2 specifies a sequence of genders. A3 selects records of all the employees. In A4, only records of specified states and genders are selected according to the filtering conditions from the table sequence in A3 for performing the cross join operation. Result of A4 is as follows:

esProc_related_foreignkey_29