There are two tab-separated text files. The chr column in AssociatedMarkers.txt is a logical foreign key pointing to the Chr column in DiseaseMarkers.txt. We want to create a new text file with two columns: one taken from the snps_BCG24 column of AssociatedMarkers.txt, and one computed as follows: if the hg19pos value in AssociatedMarkers.txt falls between startLoc and endLoc of the matching DiseaseMarkers.txt record, output inLocus; otherwise output an empty string. Samples of the two files are shown below:
AssociatedMarkers.txt
DiseaseMarkers.txt
esProc approach:
| | A |
|---|---|
| 1 | =file("/Users/Me/AssociatedMarkers.txt").import@t() |
| 2 | =file("/Users/Me/DiseaseMarkers.txt").import@t() |
| 3 | =join(A1,chr;A2,Chr) |
| 4 | =A3.new(_1.snps_BCG24,if(in(_1.hg19pos,_2.startLoc:_2.endLoc),"inLocus")) |
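A1, A2: Import the two files into memory. The @t option reads the first line as column names.
A3: Join the two tables on chr = Chr. In the joined table, _1 refers to the record from A1 and _2 to the matching record from A2. Result is as follows:
A4: Retrieve the desired columns from A3. _1.hg19pos corresponds to the hg19pos column of AssociatedMarkers.txt, and _2.startLoc and _2.endLoc come from DiseaseMarkers.txt. The final result is as follows: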
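The join-then-flag logic can also be cross-checked outside esProc. Below is a minimal Python sketch using only the standard library; it is not part of the esProc solution. It assumes that A3 behaves as an inner join, that both range endpoints are inclusive, and that chromosome values match as plain strings. The function names `tag_in_locus` and `read_tsv` and the sample rows are made up for illustration and are not taken from the real files.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text whose first line holds the column names."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def tag_in_locus(assoc_rows, disease_rows):
    """Join on chr = Chr and flag positions inside [startLoc, endLoc]."""
    # Index the disease markers by chromosome so the join is a dictionary lookup.
    by_chr = {}
    for d in disease_rows:
        by_chr.setdefault(d["Chr"], []).append(d)

    out = []
    for a in assoc_rows:
        pos = int(a["hg19pos"])
        # Assumed inner join: one output row per matching locus, none if no match.
        for d in by_chr.get(a["chr"], []):
            inside = int(d["startLoc"]) <= pos <= int(d["endLoc"])
            out.append((a["snps_BCG24"], "inLocus" if inside else ""))
    return out


# Hypothetical sample rows, for illustration only.
assoc = read_tsv(
    "snps_BCG24\tchr\thg19pos\n"
    "rsA\t1\t150\n"
    "rsB\t1\t900\n"
    "rsC\t2\t50\n"
)
disease = read_tsv(
    "Chr\tstartLoc\tendLoc\n"
    "1\t100\t200\n"
    "2\t10\t40\n"
)

result = tag_in_locus(assoc, disease)
for snp, flag in result:
    print(f"{snp}\t{flag}")
```

With these made-up rows, only rsA is flagged, because 150 lies between 100 and 200 on chromosome 1; the other two markers get an empty string.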