-1

I'm trying to read a large table from Oracle database and save it as local csv file in Python 3. Here's my code:

import cx_Oracle
import pandas as pd

user = 'me'
password = 'password'
dsn = 'dsn'
con = cx_Oracle.connect(user, password, dsn)

for chunk in pd.read_sql("select col_a, col_d, col_s from my_table", con, chunksize=10**4):
    chunk.to_csv(r"my_path\my_file.csv", index = false)

However, given the table has 200k+ rows, and I'm selecting a dozen columns out of 80+ columns, the performance of the code above is literally crawling.

Is there a faster way to read the table and save it as csv?

ad absurdum
  • 19,498
  • 5
  • 37
  • 60
ChangeMyName
  • 7,018
  • 14
  • 56
  • 93
  • 2
    why involve pandas at all? Almost certainly, the DBMS has an efficient way to export the results of a query to a CSV, involving python, especially pandas, isn't going to make things fast. – juanpa.arrivillaga Feb 19 '21 at 20:16
  • Presumably you shouldn't do file I/O in every iteration of the loop, only once at the end. – mkrieger1 Feb 19 '21 at 20:19
  • Don't have much experience with oracle, but this looks promising: https://oracle-base.com/articles/9i/generating-csv-files – juanpa.arrivillaga Feb 19 '21 at 20:20
  • 200k+ rows is actually not large. I query a table with 16 million+ rows using Python and pandas every day. The typical query completes in less than a second. If you have access to the table make sure it has an appropriate index for the fields you are querying. – Ethan Feb 19 '21 at 20:22
  • you could give [dask](https://stackoverflow.com/questions/57469457/how-to-load-subset-of-large-oracle-table-into-dask-dataframe) a try – RJ Adriaansen Feb 19 '21 at 21:12
  • For the DB side of the problem, make sure a big arraysize is being passed down to cx_Oracle, see https://cx-oracle.readthedocs.io/en/latest/user_guide/tuning.html#tuning-fetch-performance – Christopher Jones Feb 22 '21 at 02:16

1 Answers1

0

Pandas is a tool which does better functions than Excel.

When you work with LargeData, you need the database connection from Python.

The answer to your question, which one is your application. Search in https://pypi.org/ And install some pip package as ...SQL...

Depending on the package it will give the sentences of SQL. Sometimes you can write as SQL in different methods, and in others, you will have simplified methods.

In my case I used SQLAlchemy.

Regards.

Vahram Danielyan
  • 187
  • 3
  • 11