
I have a database that contains all the data from a CSV I downloaded from a botanic website, and I want to keep it updated whenever a new version of the CSV comes out. My problem is that the CSV is extremely huge: it has at least 1 million records, and saving everything takes an entire hour on average. Any ideas on how I can perform this update without rewriting the entire thing? TYSM!

Some extra info:

  • It's a Node.js project, and I'm using Prisma as the ORM
  • It's actually a txt file that I parse as a CSV, since all the data is separated by tabs
  • I'm using PostgreSQL
  • Currently I'm dropping the table and saving everything from the ground up, roughly like the sketch below
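Here's a simplified version of what my import looks like right now (the plant model and the column handling are placeholders):

    const fs = require('fs');
    const csv = require('csv-parser');
    const { PrismaClient } = require('@prisma/client');

    const prisma = new PrismaClient();

    async function reloadAll(path) {
      await prisma.plant.deleteMany(); // wipe the table...

      // csv-parser handles the tab-separated txt file
      const rows = fs.createReadStream(path).pipe(csv({ separator: '\t' }));

      // ...then re-insert every record one by one, which is what takes ~1 hour
      for await (const row of rows) {
        await prisma.plant.create({ data: row });
      }
    }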
lambadeltaz

2 Answers


You don't mention how you load the data into Postgres, but you could use the COPY command, something like this:

COPY mytable FROM '/tmp/myfile.csv' (FORMAT csv, HEADER true);

Since your data is tab-separated, you may also need DELIMITER E'\t' in the COPY options. It should be much faster than going through the ORM row by row.

https://www.postgresql.org/docs/current/sql-copy.html
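If you'd rather drive this from Node.js than from psql, here is a minimal sketch using the pg and pg-copy-streams packages (the table name mytable and the file path are placeholders, and connection settings are assumed to come from the usual PG* environment variables):

    const fs = require('fs');
    const { pipeline } = require('stream/promises');
    const { Pool } = require('pg');
    const { from: copyFrom } = require('pg-copy-streams');

    async function loadFile(path) {
      const pool = new Pool();
      const client = await pool.connect();
      try {
        // COPY ... FROM STDIN lets the client stream the file,
        // so it doesn't have to exist on the database server.
        const ingest = client.query(
          copyFrom("COPY mytable FROM STDIN (FORMAT csv, DELIMITER E'\\t', HEADER true)")
        );
        await pipeline(fs.createReadStream(path), ingest);
      } finally {
        client.release();
        await pool.end();
      }
    }

    loadFile('/tmp/myfile.csv').catch(console.error);

A nice side effect of FROM STDIN is that the file only needs to be readable by the Node process, whereas COPY ... FROM '/path' reads from the database server's filesystem and needs extra privileges.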

Another option is to use file_fdw, if you just need to query the data and don't need updates:

https://www.postgresql.org/docs/current/file-fdw.html
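The setup is a one-time bit of DDL. A rough sketch, run through the pg client (the foreign table name and columns are made up, since the real layout of your file isn't shown; note the file has to live on the database server's filesystem):

    const { Client } = require('pg');

    async function setupForeignTable() {
      const client = new Client(); // CREATE EXTENSION needs a superuser connection
      await client.connect();
      try {
        await client.query('CREATE EXTENSION IF NOT EXISTS file_fdw');
        await client.query(
          'CREATE SERVER IF NOT EXISTS csv_server FOREIGN DATA WRAPPER file_fdw'
        );
        // Hypothetical columns -- replace with the real layout of your file
        await client.query(`
          CREATE FOREIGN TABLE plants_csv (id text, name text, family text)
          SERVER csv_server
          OPTIONS (filename '/tmp/myfile.csv', format 'csv', delimiter E'\\t', header 'true')
        `);
      } finally {
        await client.end();
      }
    }

After that, SELECT * FROM plants_csv always reflects the current contents of the file, with no import step at all.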

Sven Knowles
  • Sorry for not mentioning it! I'm using a CSV parser, iterating over every line of the CSV and saving it to the database with prisma.create. Thanks for the reply, I will check this out – lambadeltaz Apr 28 '22 at 00:27

You would essentially need to diff the two CSV versions, find the records that have changed, and update only those.

You could use any text/CSV comparison npm library to find the differences, and once you have the changed records you can issue an update query for each of them.
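A minimal sketch of that idea, assuming the first tab-separated column is a unique key and a Prisma model named plant with hypothetical name and family columns (adjust to your real schema):

    const fs = require('fs');
    const { PrismaClient } = require('@prisma/client');

    const prisma = new PrismaClient();

    // Index a tab-separated file by its first column.
    // This holds the whole file in memory; with ~1M rows you
    // may want a streaming reader instead.
    function indexByKey(path) {
      const rows = new Map();
      for (const line of fs.readFileSync(path, 'utf8').split('\n')) {
        if (!line.trim()) continue;
        rows.set(line.slice(0, line.indexOf('\t')), line);
      }
      return rows;
    }

    async function syncChanges(oldPath, newPath) {
      const oldRows = indexByKey(oldPath);
      const newRows = indexByKey(newPath);
      for (const [id, line] of newRows) {
        if (oldRows.get(id) === line) continue; // row is unchanged, skip it
        const [, name, family] = line.split('\t'); // hypothetical columns
        await prisma.plant.upsert({
          where: { id },
          update: { name, family },
          create: { id, name, family },
        });
      }
    }

    syncChanges('/tmp/old.csv', '/tmp/new.csv').catch(console.error);

You would also want to handle deletions (keys present in the old file but missing from the new one), and batching the writes with prisma.$transaction should speed things up considerably.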

Nurul Sundarani