I have the official Postgres ODBC driver installed and am using IBM SPSS to load 4 million records from an MS SQL data source. I have the option set to bulk load via ODBC, but performance is very slow. Going MS SQL to MS SQL is fast, and going Postgres to Postgres is fast, but going MS SQL to Postgres takes about 2.5 hours to load the records.

It's almost as if it's not bulk loading at all. Judging by the output, it reads each batch of 10,000 records from the source very quickly, but the insert on the Postgres side takes forever. When I check the record count every few seconds, it climbs from 0 to 10,000 but takes minutes to get there, when it should take seconds.

Interestingly, I downloaded a third-party driver from Devart and the load went from 2.5 hours to 9 minutes. Still not super quick, but much better. Either the Postgres ODBC driver does not support bulk load (unlikely, since Postgres-to-Postgres loads so quickly) or some configuration option is at play in either the ODBC driver config or the SPSS config.
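To put the two timings side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (4 million records, 10,000-record batches, 2.5 hours versus 9 minutes) and assumes a uniform load rate, which is a simplification; the function and variable names are made up for illustration.

```python
# Rough throughput comparison from the timings quoted in the question.
# Assumes a uniform load rate over the whole run (a simplification).

TOTAL_ROWS = 4_000_000   # records to load
BATCH_ROWS = 10_000      # batch size reported in the SPSS output


def rows_per_second(rows: int, minutes: float) -> float:
    """Average throughput for a load of `rows` rows that took `minutes`."""
    return rows / (minutes * 60)


stock_rate = rows_per_second(TOTAL_ROWS, 150)   # official driver: 2.5 hours
devart_rate = rows_per_second(TOTAL_ROWS, 9)    # Devart driver: 9 minutes

# Average seconds spent per 10,000-row batch at each rate.
stock_batch_s = BATCH_ROWS / stock_rate
devart_batch_s = BATCH_ROWS / devart_rate

speedup = devart_rate / stock_rate

print(f"official driver: {stock_rate:.0f} rows/s, {stock_batch_s:.2f} s per batch")
print(f"Devart driver:   {devart_rate:.0f} rows/s, {devart_batch_s:.2f} s per batch")
print(f"speedup:         {speedup:.1f}x")
```

A few hundred rows per second is closer to what one would expect from row-at-a-time inserts than from a true bulk load, which fits the suspicion that bulk loading is not actually being used.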

Has anybody experienced this? I've been looking at options for the ODBC driver, but can't really see anything related to bulk loading.

raeldor

1 Answer

IBM SPSS Statistics uses the IBM SPSS Data Access Pack (SDAP). These are third-party drivers from Progress DataDirect. I can't speak to performance with other ODBC drivers, but if you are using the IBM SPSS Data Access Pack "IBM SPSS OEM 7.1 PostgreSQL Wire Protocol" ODBC driver, then there are resources for you.

The latest release of the IBM SPSS Data Access Pack (SDAP) is version 8.0. It is available from Passport Advantage (where you would have downloaded your IBM SPSS Statistics software) as "IBM SPSS Data Access Pack V8.0 Multiplatform English (CC0NQEN)".

Once installed, see the Help. On Windows it will be here: C:\ProgramData\Microsoft\Windows\Start Menu\Programs\IBM SPSS OEM Connect and ConnectXE for ODBC 8.0\

David_Dwyer