0

In a csv received, the first line is the header with columns, and the last line is a custom trailer record that just notes the number of records, file names etc.

We would like to be able to load this large file into sqlite to eyeball some of the columns since excel struggles with this. Also, tools like UltraEdit can open this file, but it is a pain to check certain values in columns.

Can SQLite skip this last line, or should the last line be trimmed before we load into SQLite. We would prefer not to change this file.

Environment: Windows.

jackrabb1t
  • 14,609
  • 1
  • 21
  • 20
  • Recent versions of the sqlite3.exe tool can skip *leading* lines when importing CSV files, but not trailing. I bet it'd be easy to whip up a Powershell script to do that, though, and saving a temporary file which is then imported. – Shawn Apr 08 '22 at 15:50
  • I would process your file(s) before. A simple tool like `awk` should be able to crunch your file extremely fast, skipping first line and last line. It [has been answered here](https://stackoverflow.com/questions/15856733/what-is-the-easiest-way-to-remove-1st-and-last-line-from-file-with-awk). You can probably output AWK directly to SQLite. – MyICQ Apr 08 '22 at 20:31
  • @Shawn : i thought of that, if i am creating a temporary file, i am adding the chance of file corruption, that i was looking to avoid if possible,. – jackrabb1t Apr 09 '22 at 12:22
  • @MyICQ : also, i dont want to modify existing file, the current file is a few gb. the sqlite is just for a spot check in this usecase. Can SQLite import into its db, but ignore the last line because it has only 3 columns instead of the rest of the lines that have 20 columns? – jackrabb1t Apr 09 '22 at 12:24
  • If I use comma for the data in my file and "|" for the trailer record, will it be of help? Some way for SQLite to ignore that line, yet not modify the file? – jackrabb1t Apr 09 '22 at 12:25

1 Answers1

1

You can use the SQLite CLI program to import your CSV. In its simplest form:

sqlite3 -separator ; test.db ".import test.csv test_table"

and with a test.csv like this:

A;B;C
a;b;c
d;e;f
g;h;i
summary line

the import will complete, and only produce a warning:

test.csv:5: expected 3 columns but found 1 - filling the rest with NULL

The result is this test_table:

A B C
a b c
d e f
g h i
summary NULL NULL

It would be easy to eliminate the last record with a DELETE statement.


The .import dot-command will interpret its argument as a filename by default, but it will treat it as a shell command when it starts with a pipe character |.

This means you can pre-process the CSV file to eliminate the summary line, for example with findstr -v:

sqlite3 -separator ; test.db ".import '|findstr -v summary < test.csv' test_table"

You could even make a multi-line init script comprising dot commands and SQL, which sets up your database and massages your input data into shape. This is more flexible than cramming it all onto the command line.

sqlite3 -init import_script.txt test.db
Tomalak
  • 332,285
  • 67
  • 532
  • 628