Questions tagged [duckdb]

Issues related to the usage of DuckDB (www.duckdb.org)

180 questions
1
vote
1 answer

How can I use duckdb.read_json_auto in Python without creating a temporary file?

I have a simple function that inserts a Python dictionary into DuckDB. How can I insert it into my table without creating a temporary file? def save_to_duckdb(data): # Connect to the Duckdb database conn =…
Geo-7
  • 127
  • 9
1
vote
1 answer

Iterating on rows in pandas DataFrame to compute rolling sums and a calculation

I have a pandas DataFrame, and I'm trying to (in pandas or DuckDB SQL) do the following on each iteration, partitioned by CODE, DAY, and TIME: iterate on each row to calculate the sum total of the 2 previous TRANSACTIONS or TRANSACTIONS_FORECAST values…
AK91
  • 671
  • 2
  • 13
  • 35
1
vote
3 answers

Trying to do a docker build which fails at chromadb installation

I am trying to build a Docker image for my Python Flask project. There seems to be an issue with the packages that the chromadb build depends on: duckdb and hnswlib. Below are the contents of the Dockerfile. FROM…
Jason
  • 676
  • 1
  • 12
  • 34
1
vote
1 answer

How to read a csv file from google storage using duckdb

I'm using duckdb version 0.8.0. I have a CSV file located in Google Storage at gs://some_bucket/some_file.csv and want to load it using duckdb. In pandas I can do pd.read_csv("gs://some_bucket/some_file.csv"), but this doesn't seem to work in duckdb.…
baxx
  • 3,956
  • 6
  • 37
  • 75
1
vote
0 answers

Can you load a JSON object into a duckdb table with the Node.js API?

The duckdb Node.js API can load data from a JSON file. However, I don't see a way to load data from a JSON object, similar to the way duckdb Wasm ingestion works. Is there a way to do this without writing JSON to a file and then loading from the…
Andrew
  • 2,368
  • 1
  • 23
  • 30
1
vote
1 answer

DuckDB SQL Query ParserException: Error on executing SQL query with column name which includes # symbol

When I execute a query in DuckDB that reads a Parquet file from Azure Blob Storage, it raises a ParserException at the column names PatientDxICD-10Code#01 and PatientDxICD-10Code#02. The query and the ParserException are given below. …
Divya Nair
  • 11
  • 1
1
vote
4 answers

chromadb.errors.NoIndexException: Index not found, please create an instance before querying

What does this mean? How can I load the following index? tree langchain/ langchain/ ├── chroma-collections.parquet ├── chroma-embeddings.parquet └── index ├── id_to_uuid_cfe8c4e5-8134-4f3d-a120-0510e189004f.pkl ├──…
jmunsch
  • 22,771
  • 11
  • 93
  • 114
1
vote
1 answer

SQL: Unpack STRUCT to columns

I have a jsonl file I've read into duckdb that looks like this: ('conversation_id', 'BIGINT', 'YES', None, None, None) ('text', 'VARCHAR', 'YES', None, None, None) ('meta', 'STRUCT(case_id VARCHAR, start_times DOUBLE[], stop_times DOUBLE[],…
AutomaticStatic
  • 1,661
  • 3
  • 21
  • 42
1
vote
0 answers

DuckDB beginner needs help: IOException error

I'm starting to learn DuckDB (on Windows), and I'm running into problems that I can't find much information about online. I'm following this tutorial for beginners:…
1
vote
1 answer

Getting result from select query with dbt jinja

I'm working with a DuckDB database connected to dbt. My query executes and completes successfully, but the problem I face is that I want to get the result of this query. My SQL file looks like the following. {%- call…
david backx
  • 163
  • 1
  • 9
1
vote
2 answers

How to speed up processing of very large dataframe in python

I'm pretty new to working with very large dataframes (~550 million rows and 7 columns). I have raw data in the following format: df = Date|ID|Store|Brand|Category1|Category2|Age This dataframe is over 500 million rows and I need to pass it through a…
Kristina
  • 11
  • 2
1
vote
1 answer

How do I get a list of table-like objects visible to duckdb in a python session?

I like how duckdb lets me query DataFrames as if they were sql tables: df = pandas.read_parquet("my_data.parquet") con.query("select * from df limit 10").fetch_df() I also like how duckdb has metadata commands like SHOW TABLES;, like a real…
william_grisaitis
  • 5,170
  • 3
  • 33
  • 40
1
vote
1 answer

IMPORT and EXPORT in Duckdb due to change of version

I have been using duckdb and have a database, but I recently updated duckdb and can no longer use it. I get the following error: duckdb.IOException: IO Error: Trying to read a database file with version number 39, but we can only read version…
Arun Kumar
  • 13
  • 4
1
vote
2 answers

DuckDB Binder Error: Referenced column not found in FROM clause

I am working in DuckDB in a database that I read from json. Here is the json: [{ "account": "abcde", "data": [ { "name": "hey", "amount":1, "flow":"INFLOW" }, { "name":…
charelf
  • 3,103
  • 4
  • 29
  • 51
1
vote
1 answer

[SQL]: Efficient sampling from cartesian join

I have two tables. What I want is a random sample from all the possible pairings. Say the size of t1 is 100 and the size of t2 is 200, and I want a sample of 300 pairings. The naive way of doing this (run in the online duckdb shell) is: CREATE TABLE t1 as…
Nick Crews
  • 837
  • 10
  • 13
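
For the cartesian-sampling question above, one way to picture the problem outside of SQL is to sample flat indices instead of materializing every pairing. The sketch below is plain Python, not DuckDB, and is not taken from the question or its answer. The function name `sample_pairs` and its parameters are hypothetical, and it assumes rows can be addressed by position (0-based) and that pairings are drawn without replacement.

```python
import random


def sample_pairs(n1, n2, k, seed=None):
    """Return k distinct (i, j) index pairs drawn uniformly from an n1 x n2 grid.

    Hypothetical helper for illustration: i indexes rows of t1, j rows of t2.
    """
    rng = random.Random(seed)
    # Every integer in [0, n1 * n2) encodes exactly one pairing.
    # range() is lazy, so the full cartesian product is never built.
    flat = rng.sample(range(n1 * n2), k)
    # Decode each flat index back into (row of t1, row of t2).
    return [divmod(x, n2) for x in flat]


# Sizes from the question: 100 rows, 200 rows, 300 sampled pairings.
pairs = sample_pairs(100, 200, 300, seed=0)
```

The resulting index pairs could then be joined back to the two tables by row position. Whether this beats the naive cross join inside DuckDB depends on the table sizes and is not something the excerpt settles.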