
I'm trying to work out the best way to import CSV data into RethinkDB while avoiding duplicates (e.g. when the same file is imported twice).

The CSV data comes from bank statements. There is no real primary key; instead, a composite key of date, description and amount can be used.

I wanted to use the CLI for imports, but I couldn't see how to do that given that I don't have a single primary key.

Is the best approach to iterate through the CSV, check whether each row already exists, and insert it only if it is not found? I struggled to write a single query that inserts only when the existence check comes back empty.

Any guidance would be appreciated; I assume this is a fairly common scenario.

(I'm using JavaScript as my language)

Thanks in advance!

sub

1 Answer


You could set the conflict option on the insert statement to either "update" or "replace". This checks whether a document with the same specified id already exists and either updates or replaces it. http://www.rethinkdb.com/api/javascript/insert/

r.table("users").insert(
    {id: "william", email: "william@rethinkdb.com"},
    {conflict: "replace"}
).run(conn, callback)
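
Since the question has no single primary key, one way to apply the conflict option is to derive a deterministic id from the composite key before inserting. The following is only a minimal sketch under some assumptions: the table name "transactions", the field names date, description and amount, and the helper names compositeId, withCompositeId and importRows are all hypothetical, and the rows are assumed to be already parsed from the CSV into plain objects.

```javascript
const crypto = require("crypto");

// Build a deterministic id from the fields that make up the composite key.
// Field names (date, description, amount) are assumptions about the CSV columns.
function compositeId(row) {
  const key = JSON.stringify([row.date, row.description, row.amount]);
  return crypto.createHash("sha1").update(key).digest("hex");
}

// Return a copy of the row with the derived id attached.
function withCompositeId(row) {
  return Object.assign({}, row, { id: compositeId(row) });
}

// `r` is the rethinkdb driver module and `conn` an open connection.
// The table name "transactions" is hypothetical.
function importRows(r, conn, rows) {
  const docs = rows.map(withCompositeId);
  // Re-importing the same file produces the same ids, so existing
  // documents are replaced instead of being duplicated.
  return r.table("transactions").insert(docs, { conflict: "replace" }).run(conn);
}
```

Note that with this approach two genuinely separate transactions sharing the same date, description and amount would collapse into one document, so it only fits if that combination really is unique in your statements.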
Samuel Goldenbaum