Issues related to the usage of DuckDB (www.duckdb.org)
Questions tagged [duckdb]
180 questions
2
votes
1 answer
DuckDB R : Calculate mean and median for multiple columns
I have a DuckDB database and want to calculate the mean and median of multiple columns at once:
e.g.
#This works:
mtcars %>%
summarise(across(everything(), list(mean, median)))
#This doesn't
tbl(con,"mtcars")%>%
summarise(across(everything(),list(mean,…

HCAI
- 2,213
- 8
- 33
- 65
2
votes
1 answer
UnsatisfiedLinkError for DuckDB native code in Java
When trying to open a connection to DuckDB on an EC2 instance:
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux…

Mathilda B
- 31
- 3
2
votes
1 answer
How does DuckDB handle sparse tables?
We are evaluating embedding DuckDB in our applications. We deal with many tables whose columns are around 60-70% sparse most of the time. Does DuckDB fill them with default NULL values, or does it support sparsity internally?

user2286963
- 125
- 2
- 11
1
vote
1 answer
Filter based on a list column using arrow and duckdb
I'm using the R arrow package to interact with a duckdb table that contains a list column. My goal is to filter on the list column before collecting the results into memory. Can this be accomplished on a virtual duckdb table?
Example
library(arrow,…

davechilders
- 8,693
- 2
- 18
- 18
1
vote
1 answer
arrow::to_duckdb coerces int64 columns to doubles
arrow::to_duckdb() converts int64 columns to a double in the duckdb table. This happens if the .data being converted is an R data frame or a parquet file. How can I maintain the int64 data type?
Example
library(arrow, warn.conflicts =…

davechilders
- 8,693
- 2
- 18
- 18
1
vote
0 answers
SQLite Database File Invalidated from Query Being Interrupted (using DuckDB Python)
I connected to an SQLite DB file via the DuckDB Python DB API in read_only mode and ran a typical SELECT query, which was interrupted. I believe my Python process was closed, but I don't remember exactly.
When I queried the DB file again, I got the following…

GlutenFreeJesus
- 11
- 1
1
vote
1 answer
Fast upsert into duckdb
I have a dataset where I need to upsert data (on conflict replace some value columns). As this is the bottleneck of an app, I want this to be fairly optimized. But duckdb is really slow compared to sqlite in this instance. What am I doing wrong…

David
- 9,216
- 4
- 45
- 78
1
vote
1 answer
How to increase row output limit in DuckDB in Python?
I'm working with DuckDB in Python (in a Jupyter Notebook). How can I force DuckDB to print all rows in the output rather than truncating rows? I've already increased output limits in the Jupyter Notebook.
This would be the equivalent of setting…

bbgatch
- 61
- 5
1
vote
2 answers
In duckdb-wasm, how to use a table after c.insertArrowTable?
I have an Arrow table that I'm trying to query with duckdb-wasm, but I'm getting an error that the table doesn't exist. I have this:
const dlURL="http://localhost:7071/api/getdata"
const arrowTable = await tableFromIPC(fetch(dlURL))
const c =…

Dean MacGregor
- 11,847
- 9
- 34
- 72
1
vote
1 answer
Does DuckDB support multi-threading when performing joins?
Does DuckDB support multi-threaded joins? I've configured DuckDB to run on 48 threads, but when executing a simple join query, only one thread is actively working.
Here is an example using the CLI API:
# setting up database relations
CREATE TABLE R…

cmq
- 11
- 4
1
vote
1 answer
DuckDB Not implemented Error: Writing to HTTP files not implemented
Using DuckDB, I am trying to write a data frame (from VS Code) to a Parquet file in an Azure storage account. I get the error: Not implemented Error: Writing to HTTP files not implemented.
However, while forming the data frame (which I am…

Debottam
- 13
- 3
1
vote
0 answers
How do you specify column compression algorithm in duckdb?
I've read DuckDB Lightweight Compression and understand that DuckDB is designed to choose the best compression strategy automatically, but I would like to know if it is possible to give hints in CREATE TABLE or ALTER TABLE statements to explicitly set…

Danny G
- 581
- 4
- 16
1
vote
1 answer
Can DuckDB be used as Document Database?
As far as I know, DuckDB is a columnar database and can process and store sparse data efficiently.
So, would it be possible to use it as a "tuple space" or "document database"? I don't expect to get top performance from DuckDB in such a use case, good…

Alex Craft
- 13,598
- 11
- 69
- 133
1
vote
1 answer
In DuckDB, how do I SELECT rows with a certain value in an array?
I've got a table with a field my_array VARCHAR[]. I'd like to run a SELECT query that returns rows where the value ('My Term') I'm searching for is in "my_array" one or more times.
These (and a bunch more I tried) don't work:
SELECT * FROM my_table…

Jason Norwood-Young
- 393
- 3
- 10
1
vote
2 answers
DuckDB slower than Polars in single table over + groupby context
For the following toy example, which involves both window calculations and groupby aggregations, DuckDB performs nearly 3x slower than Polars in Python. Both give exactly the same results.
Is this kind of benchmarking result expected, because…

lebesgue
- 837
- 4
- 13