
I am on a Java project to import a huge amount of data from a .csv file into a database. I am interested in understanding the best approach for achieving this.

  • One option is definitely to use a Java application that calls a stored procedure.
  • A second option I can think of: since we are already using Spring, spring-jdbc could help us here too.
  • Currently we are using the spring-hibernate pair to get this done at the application level (which I presume is not the right approach).

Can you please help me with some thoughts from the other end of the spectrum?

Antony

2 Answers


The best option is to use the database's native support for bulk operations on huge data sets. For Oracle that is SQL*Loader; Postgres has the COPY command.
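If you want to drive Postgres COPY from the Java side, the Postgres JDBC driver exposes a CopyManager for exactly this. Below is a minimal sketch under that assumption; the connection URL, credentials, table name (my_table) and file name are placeholders.

    import java.io.FileReader;
    import java.io.Reader;
    import java.sql.Connection;
    import java.sql.DriverManager;

    import org.postgresql.PGConnection;
    import org.postgresql.copy.CopyManager;

    public class PgCopyImport {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:postgresql://localhost:5432/mydb", "user", "password");
                 Reader csv = new FileReader("data.csv")) {
                // CopyManager streams the file straight into the table,
                // bypassing per-row INSERT statements entirely
                CopyManager copy = conn.unwrap(PGConnection.class).getCopyAPI();
                long rows = copy.copyIn("COPY my_table FROM STDIN WITH CSV HEADER", csv);
                System.out.println("Imported " + rows + " rows");
            }
        }
    }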

If you are looking for Java-specific options, then this is my preference order:

  1. JDBC: use the batch-operations support, but be aware of the limitation that any failure in a batch will short-circuit the entire flow (see the JDBC sketch after this list).

  2. Hibernate: ORMs are not meant for this. However, you can use a StatelessSession together with the batch configuration to get reasonable performance (see the second sketch below).
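A minimal JDBC batch sketch follows. The table and column names (my_table, col_a, col_b), the connection URL, and the naive CSV split are placeholders to adapt to your schema.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class JdbcBatchImport {
        public static void main(String[] args) throws Exception {
            int batchSize = 1000; // flush to the database every 1000 rows
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:postgresql://localhost:5432/mydb", "user", "password");
                 PreparedStatement ps = conn.prepareStatement(
                         "INSERT INTO my_table (col_a, col_b) VALUES (?, ?)");
                 BufferedReader reader = new BufferedReader(new FileReader("data.csv"))) {

                conn.setAutoCommit(false);
                String line;
                int count = 0;
                while ((line = reader.readLine()) != null) {
                    String[] fields = line.split(","); // naive split; use a CSV parser for quoted fields
                    ps.setString(1, fields[0]);
                    ps.setString(2, fields[1]);
                    ps.addBatch();
                    if (++count % batchSize == 0) {
                        // a single bad row here throws BatchUpdateException and aborts the run
                        ps.executeBatch();
                    }
                }
                ps.executeBatch(); // flush the remainder
                conn.commit();
            }
        }
    }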
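And a minimal StatelessSession sketch, assuming a hibernate.cfg.xml with your connection settings (and hibernate.jdbc.batch_size) is on the classpath; the CsvRow entity and table name are hypothetical.

    import java.io.BufferedReader;
    import java.io.FileReader;

    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import javax.persistence.Table;

    import org.hibernate.SessionFactory;
    import org.hibernate.StatelessSession;
    import org.hibernate.Transaction;
    import org.hibernate.cfg.Configuration;

    public class StatelessCsvImport {

        // Hypothetical entity mapped to the target table; adjust to your schema
        @Entity
        @Table(name = "my_table")
        public static class CsvRow {
            @Id @GeneratedValue Long id;
            String colA;
            String colB;
            CsvRow() {}
            CsvRow(String colA, String colB) { this.colA = colA; this.colB = colB; }
        }

        public static void main(String[] args) throws Exception {
            SessionFactory factory = new Configuration().configure()
                    .addAnnotatedClass(CsvRow.class)
                    .buildSessionFactory();
            StatelessSession session = factory.openStatelessSession();
            Transaction tx = session.beginTransaction();
            try (BufferedReader reader = new BufferedReader(new FileReader("data.csv"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] fields = line.split(",");
                    // insert() writes straight through: no first-level cache, no dirty checking
                    session.insert(new CsvRow(fields[0], fields[1]));
                }
                tx.commit();
            } finally {
                session.close();
                factory.close();
            }
        }
    }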

Aravind Yarram
  • Currently we are using a Postgres database. Are there tools like SQL*Loader for Postgres? – Antony Feb 15 '11 at 06:19

In my opinion, such cases (bulk import) should be addressed using database features:

In the case of Oracle: SQL*Loader (as suggested by @Pangea).

In the case of MS SQL Server: BCP (Bulk Copy).

If you are looking at a Java-based approach for this, then I echo @Pangea. In addition to that, you can break a batch insert down into sub-batches and run them concurrently for better performance.

Ex: If you have 10k records to insert, you can build batches of 200 records each and insert 5 batches concurrently.

In this case you need code to track each sub-batch; a minimal sketch follows.
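Here is one way to do that with an ExecutorService, as a sketch rather than production code: the table/columns, connection URL, and the in-memory row list are placeholders, and each worker opens its own connection (in practice you would use a connection pool).

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ConcurrentBatchImport {

        // One sub-batch = one task: insert its rows in a single JDBC batch
        static Callable<Integer> subBatch(List<String[]> rows) {
            return () -> {
                try (Connection conn = DriverManager.getConnection(
                             "jdbc:postgresql://localhost:5432/mydb", "user", "password");
                     PreparedStatement ps = conn.prepareStatement(
                             "INSERT INTO my_table (col_a, col_b) VALUES (?, ?)")) {
                    for (String[] row : rows) {
                        ps.setString(1, row[0]);
                        ps.setString(2, row[1]);
                        ps.addBatch();
                    }
                    return ps.executeBatch().length; // rows inserted by this sub-batch
                }
            };
        }

        public static void main(String[] args) throws Exception {
            List<String[]> allRows = new ArrayList<>(); // assume the CSV has been parsed into rows
            int subBatchSize = 200;
            ExecutorService pool = Executors.newFixedThreadPool(5); // 5 sub-batches in flight at once
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < allRows.size(); i += subBatchSize) {
                List<String[]> chunk = allRows.subList(i, Math.min(i + subBatchSize, allRows.size()));
                results.add(pool.submit(subBatch(chunk)));
            }
            for (Future<Integer> f : results) {
                f.get(); // surfaces any sub-batch failure here; track and retry as needed
            }
            pool.shutdown();
        }
    }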

Hope this helps!

Dhananjay