In daily work we have frequent use of text data or spreadsheets. We need to import data into database for further statistical analytics. For this task, esProc is a very handy tool.
In the following example, we will import Excel data into an Access database, to demonstrate how to migrate text data into database with esProc.
In the directory “D:\files\BoxOffice” we stored some box office data for movies. They are Excel files with the extension of “xlsx”. The first row of the Excel files contains field names. Below is the file 2010 Yearly Box Office.xlsx:
Now we need to store each file in Access database, for better data analytics.
Within esProc we need to establish a data connection to Access first, through system ODBC data source:
The code is as follows for importing Excel data into Access database:
|4||=upper(replace(C3,” “, “_”))||>B3.delete(B3.pselect@a(#1==null))|
|5||=”create table “+B4+ “(“||=B4|
|8||if D6==null && left(C7,1) ==”_”||next|
|10||>B5+=C7+” “||>C5+=”,”+C7+”:#”+ string(B6)|
|11||if ifnumber(D6) && D6==int(D6)||>B5+=”Long”|
|12||else if ifnumber(D6)||>B5+=”Float”|
|13||else if ifdate(D6)||>B5+=”Date”|
|14||else if iftime(D6)||>B5+=”Time”|
Connect to Access data source, access, and then we need to find the list of files to be imported in A2. The files are as follows:
Once we get the data file list, we can then run a loop to read the data from each Excel file and import it to Access. The loop begins from A3.
When processing each file, B3 first reads the datasheet, and C3 uses its file name (without file extension) as the name for the table in Access database. In order to make the data and table names more in line with the standards for database, B4 replaces spaces in a table name by “_”, and C4 deletes possible empty records from data sheets. Then, according to the structure and content of data sheets, table creation statements will be generated in B5, and data update statements will get ready to be generated in C5. Starting from line 6, loop through every field. When generating the statements, import field names and data in the first row, and determine the data types.
Line 17 executes table creation statements. At this point, we must consider the situation where a table with the same name could already exist in the database. If this is true, A1.rollback () must be executed to do a rollback.
Line 19 imports the detailed data from datasheets to Access database. When loops exit, a list data sheets can be imported into the database. For example, the data imported from the file 2010 Yearly Box Office.xlsx is as follows:
When the loop is done, we need to close the data source in A20 to avoid the existence of too many connections:
From this example, we learned how to meet complicated requirements in esProc with simple code. In particular, the same approach works not only for Excel files and Access database, but also for txt/xml files and various databases.
There are also some other approaches to import text data into the database. For example, the most primitive method is manual data input, which is obviously time consuming, laborious, boring and error prone. Programmers often solve these problems by coding. However, coding with common high-level languages (Java, C#) or scripting languages (Perl, Python) means a lot of workload, and is quite difficult to complete. Of course, Excel and Access are all Microsoft Office products. You can import Excel data directly from inside Access, with Excel as an external source. However, every time you can only import a single file, which makes it too troublesome when there are too many files to be imported. Also, this only works for Access database. In contrast, when you need to import batch text data into database, esProc is a nice tool.