
I'm starting to learn DuckDB (on Windows), but I've run into a problem and can't find much information about it online.

I'm following this tutorial for beginners: https://marclamberti.com/blog/duckdb-getting-started-for-beginners/

Right at the beginning, I tried converting CSV files to Parquet with the following code:

import glob
PATH = 'stock_market_data/nasdaq'
for filename in glob.iglob(f'{PATH}/csv/*.csv'):
    dest = f'{PATH}/parquet/{filename.split("/")[-1][:-4]}.parquet'
    conn.execute(f"""
    COPY (SELECT * FROM read_csv('{filename}', header=True, dateformat='%d-%m-%Y', columns={{'Date': 'DATE', 'Low': 'DOUBLE', 'Open': 'DOUBLE', 'Volume': 'BIGINT', 'High': 'DOUBLE', 'Close': 'DOUBLE', 'AdjustedClose': 'DOUBLE'}}, filename=True))
    TO '{dest}' (FORMAT 'parquet')""")

Then I get the following error:


IOException                               Traceback (most recent call last)
Cell In[14], line 6
      4 for filename in glob.iglob(f'{PATH}/csv/*.csv'):
      5     dest = f'{PATH}/parquet/{filename.split("/")[-1][:-4]}.parquet'
----> 6     conn.execute(f"""COPY (SELECT * 
      7         FROM read_csv('{filename}', 
      8         header=True, 
      9         dateformat='%d-%m-%Y', 
     10         columns={{'Date': 'DATE', 
     11             'Low': 'DOUBLE', 
     12             'Open': 'DOUBLE', 
     13             'Volume': 'BIGINT', 
     14             'High': 'DOUBLE', 
     15             'Close': 'DOUBLE', 
     16             'AdjustedClose': 'DOUBLE'}}, 
     17         filename=True)) 
     18         TO '{dest}' (FORMAT 'parquet')""")

IOException: 

The error is just "IOException", with no further information.

I searched for this IOException in connection with DuckDB and found nothing, even on the project's Git page. Could someone help me or point me toward what might be causing this error?

Thanks in advance.

  • Why use some random tutorial from who knows where when you have the docs available? See [Data Ingestion](https://duckdb.org/docs/api/python/data_ingestion). I don't see that you are using the `duckdb` module. – Adrian Klaver Apr 03 '23 at 22:03
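
One possible cause, which nothing in the question confirms: on Windows, `glob` typically joins the matched file name with a backslash, so `filename.split("/")[-1]` would keep a `csv\` prefix and `dest` would point into a `parquet/csv` folder that may not exist. The sketch below shows that effect on a sample string and builds the destination with `pathlib` instead. The ticker `AAPL`, the helper names `parquet_dest` and `convert_all`, and the `mkdir` call are illustrative assumptions, not part of the tutorial. The `read_csv` and `COPY` options are copied unchanged from the question.

```python
from pathlib import Path, PureWindowsPath

PATH = Path("stock_market_data/nasdaq")

# Assumed example of what glob may return on Windows: the last
# separator is a backslash. "AAPL" is a made-up file name.
win_name = r"stock_market_data/nasdaq/csv\AAPL.csv"

# The expression from the question splits on "/" only, so the
# "csv\" prefix stays in the name.
naive = win_name.split("/")[-1][:-4]

# PureWindowsPath treats both separators as path separators;
# .stem is the file name without its extension.
stem = PureWindowsPath(win_name).stem


def parquet_dest(csv_file: Path, root: Path = PATH) -> Path:
    """Hypothetical helper: map <root>/csv/X.csv to <root>/parquet/X.parquet."""
    return root / "parquet" / f"{csv_file.stem}.parquet"


def convert_all(conn, root: Path = PATH) -> None:
    """Hypothetical helper: run the question's COPY statement for every CSV."""
    # Assumption: the output folder might not exist yet, so create it.
    (root / "parquet").mkdir(parents=True, exist_ok=True)
    for csv_file in (root / "csv").glob("*.csv"):
        dest = parquet_dest(csv_file, root)
        # as_posix() yields forward slashes, so no backslashes end up in the SQL.
        conn.execute(f"""
            COPY (SELECT * FROM read_csv('{csv_file.as_posix()}', header=True, dateformat='%d-%m-%Y', columns={{'Date': 'DATE', 'Low': 'DOUBLE', 'Open': 'DOUBLE', 'Volume': 'BIGINT', 'High': 'DOUBLE', 'Close': 'DOUBLE', 'AdjustedClose': 'DOUBLE'}}, filename=True))
            TO '{dest.as_posix()}' (FORMAT 'parquet')""")
```

To try it, create the connection explicitly with `import duckdb` and `conn = duckdb.connect()`, then call `convert_all(conn)`. If the error persists, the cause lies elsewhere, for example in the CSV contents or in file permissions.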
