1

Approaches being considered include:

  • sqlldr
  • external tables
  • Ab Initio
  • Goldgengate.

Are any of these approaches able to meet the 5bil rows in 6 hours requirement?

incoming is text files in pipe delim format. 1 file per table. 10 tables with between 10 and 20 columns each.

Alan Jurgensen
  • 813
  • 11
  • 20
  • 1
    depends on the amount of columns – mechanical_meat Apr 24 '13 at 19:34
  • what is the format of the incoming records? (i mean text file, xml, binary, other?) – Randy Apr 24 '13 at 19:35
  • also - to double check - you need to load 230,000 records per second for 6 hours each night? – Randy Apr 24 '13 at 19:39
  • Yes the avg throughput comes out as such: – Alan Jurgensen Apr 24 '13 at 20:00
  • 5000000000/6/60/60 = 231481.481 /sec – Alan Jurgensen Apr 24 '13 at 20:00
  • can you even parse the source files that fast? – Randy Apr 24 '13 at 20:17
  • If the data is available before the window: Load the data into a separate Oracle instance as it is available then during your nightly window use data pump to move it over a database link. – Brian Apr 24 '13 at 22:33
  • 1
    External tables and some serious hardware should be able to do it, see [here](http://structureddata.org/2010/04/23/the-core-performance-fundamentals-of-oracle-data-warehousing-data-loading/) for an example. – Jon Heller Apr 25 '13 at 05:47
  • 1
    The first check would be to measure the maximum bandwidth of your i/o subsystem. You have to be able to both read and write roughly (five billion * average_row_size) bytes simultaneously in your six hour window. What else do you need to do? Add any indexes, for instance? Unless the hardware exceeds the simple i/o requirement, you're not going to be able to do this. Also, what is your version and what extras do you have licensed? Partitioning? Advanced Compression? – David Aldridge Apr 25 '13 at 10:12
  • There might be an issue loading more than 4.2 bil rows with sqlloader. http://stackoverflow.com/questions/7103260/sqlloader-stuck-after-loading-4-2-billion-records But using direct load, if your table does not have a lot of columns, it's blazing fast. – Wouter Sep 23 '13 at 10:17

0 Answers0