I deal a lot with text files, comparing one to another in a "SQL manner". DBD::CSV is obviously a good choice to start with, as it lets me use the power of SQL syntax on text tables. However, I deal with huge text files, which makes DBD::CSV useless in terms of performance.
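For context, this is roughly how I use it today (a sketch; the directory, file and column names are made up):

    use strict;
    use warnings;
    use DBI;

    # Hypothetical setup: every *.csv file in ./data is exposed as a table.
    my $dbh = DBI->connect('dbi:CSV:', undef, undef, {
        f_dir        => './data',
        f_ext        => '.csv/r',
        csv_sep_char => ',',
        RaiseError   => 1,
    });

    # Plain SQL over the text files -- convenient, but slow on big inputs.
    my $sth = $dbh->prepare(q{
        SELECT old_file.id, old_file.name
        FROM   old_file
        INNER JOIN new_file ON old_file.id = new_file.id
        WHERE  old_file.value <> new_file.value
    });
    $sth->execute;
    while (my $row = $sth->fetch) {
        print join("\t", @$row), "\n";
    }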
So I started writing a module that converts CSV files to an SQLite DB and then returns a DBI database handle (via DBD::SQLite) that I can play with. The thing is, converting a text file to an SQLite table can also be inefficient, because I cannot run the sqlite3 command line from Perl to load CSV files quickly (using .import). So I have to build one huge INSERT INTO statement from the text tables and execute it (executing an INSERT per line is very inefficient, so I preferred one big insert). I would like to avoid that, and I'm looking for a one-liner to load a CSV file into SQLite from Perl.
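This is roughly what my current loader looks like (a sketch, assuming Text::CSV_XS and DBD::SQLite; the table, column and file names are made up):

    use strict;
    use warnings;
    use DBI;
    use Text::CSV_XS;

    # Read the CSV and build one big multi-row INSERT, then run it in a single do().
    my $dbh = DBI->connect('dbi:SQLite:dbname=tmp.db', '', '', { RaiseError => 1 });
    $dbh->do('CREATE TABLE IF NOT EXISTS data (id INTEGER, name TEXT, value TEXT)');

    my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
    open my $fh, '<', 'data.csv' or die "data.csv: $!";

    my @values;
    while (my $row = $csv->getline($fh)) {
        push @values, '(' . join(',', map { $dbh->quote($_) } @$row) . ')';
    }
    close $fh;

    # Multi-row VALUES needs a reasonably recent SQLite (3.7.11+),
    # which current DBD::SQLite bundles.
    $dbh->do('INSERT INTO data (id, name, value) VALUES ' . join(',', @values))
        if @values;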
And another thing: I use the following subs to execute a SQL query and print the result nicely:
sub sql_command {
    my ($self, $str) = @_;

    # DBI errors go to errstr, not $!, so report those instead.
    my $s = $self->{_db}->prepare($str) or die $self->{_db}->errstr;
    $s->execute() or die $s->errstr;

    # First row holds the column names, the rest hold the data.
    my $table;
    push @$table, [ map { defined $_ ? $_ : "undef" } @{ $s->{NAME} } ];
    while (my $row = $s->fetch) {
        push @$table, [ map { defined $_ ? $_ : "undef" } @$row ];
    }

    return box_format($table);
}
sub box_format {
    my $table  = shift;
    my $n_cols = scalar @{ $table->[0] };

    # Requires Text::Table to be loaded elsewhere in the module.
    my $tb = Text::Table->new(\'| ', '', (\' | ', '') x ($n_cols - 1), \' |+');
    $tb->load(@$table);

    my $rule = $tb->rule(qw/- +/);
    my @rows = $tb->body();
    return $rule, shift @rows, $rule, @rows, $rule
        if @rows;
}
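For context, the two subs above are methods of a small wrapper class, and a typical call looks roughly like this (simplified; the package name, connection and query are illustrative):

    use DBI;

    # Assume the subs above live in a package called MyQueryTool and
    # tmp.db already contains a table "data".
    my $self = bless {
        _db => DBI->connect('dbi:SQLite:dbname=tmp.db', '', '', { RaiseError => 1 }),
    }, 'MyQueryTool';

    print $self->sql_command('SELECT id, name, value FROM data LIMIT 10');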
The sql_command sub takes about a minute to run on a 6.5 MB file, which in my opinion is far longer than it should be. Does anyone have a more efficient solution?
Thanks!