Here is the basic outline of my application (which connects to a SQLite DB):
It scans a directory and creates a list of 8-15 different flat files that need to be read.
It identifies the flat file and chooses a generic prepared statement based on the file extension.
It then reads the flat file line by line, binds the string values to the prepared statement, and adds the statement to a batch. The batch is executed every 1,000 statements. Some of the files in question have 200,000 lines that need to be read.
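For concreteness, the load loop looks roughly like this; the table name, columns, and delimiter below are illustrative, not my real schema:

    import java.io.BufferedReader;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.sql.Connection;
    import java.sql.PreparedStatement;

    // Sketch of the per-file load; table, columns, and delimiter are illustrative.
    static void loadFile(Connection conn, Path file) throws Exception {
        String sql = "INSERT INTO file_type_a (col1, col2, col3) VALUES (?, ?, ?)";
        try (BufferedReader reader = Files.newBufferedReader(file);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            int count = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split("\\|");
                ps.setString(1, fields[0]);
                ps.setString(2, fields[1]);
                ps.setString(3, fields[2]);
                ps.addBatch();
                if (++count % 1000 == 0) {
                    ps.executeBatch(); // flush every 1,000 rows
                }
            }
            ps.executeBatch(); // flush any remaining rows
        }
    }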
Once all files have been loaded into the database (there is a different table for each file type), the program updates a certain column in each table to a common value.
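The update step is just one statement per table, along these lines; the column name and value are placeholders:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    // Sketch of the post-load update; the column name and value are placeholders.
    static void stampTables(Connection conn, List<String> tables, String runId)
            throws SQLException {
        for (String table : tables) {
            // Table names cannot be bound as ? parameters, so the name is
            // concatenated; here it comes from a fixed internal list.
            try (PreparedStatement ps =
                    conn.prepareStatement("UPDATE " + table + " SET run_id = ?")) {
                ps.setString(1, runId);
                ps.executeUpdate();
            }
        }
    }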
The program creates a new file for each file type and extracts the information in the database to the new file.
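The extract step walks a ResultSet for each table and writes the rows back out, roughly like this (again, the query and delimiter are illustrative):

    import java.io.BufferedWriter;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Sketch of the export step; the query and delimiter are illustrative.
    static void exportTable(Connection conn, String table, Path outFile) throws Exception {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT col1, col2, col3 FROM " + table);
             BufferedWriter writer = Files.newBufferedWriter(outFile)) {
            while (rs.next()) {
                writer.write(rs.getString(1) + "|" + rs.getString(2) + "|" + rs.getString(3));
                writer.newLine();
            }
        }
    }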
Currently, a run over a directory with about 9 very small files (fewer than 50 rows each) and one very large file (more than 200,000 rows) takes about 1.5 minutes. I am hoping to make this faster.
A few first questions:
Should I close and reopen the database connection for each part of the program (loading, updating, extracting), or leave the connection open and pass it to each method?
Should I close the prepared statement after each file that is processed? Or just clear the parameters after each file and close it at the end of the method (essentially after all the jobs are loaded)?
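To make that second question concrete, these are the two options I am weighing (loadOneFile and filesOfThisType are placeholders standing in for the read-and-batch loop above):

    // Option A: open and close a PreparedStatement per file.
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        loadOneFile(ps, file); // placeholder for the read-and-batch loop above
    } // ps is closed here

    // Option B: reuse one PreparedStatement for all files of a type,
    // clearing parameters between files and closing once at the end.
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        for (Path file : filesOfThisType) {
            loadOneFile(ps, file);
            ps.clearParameters(); // reset bindings between files
        }
    }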
I am also interested in any other comments about things I should be focusing on to maximize the performance of this application.
Thanks.