Issues related to the usage of DuckDB (www.duckdb.org)
Questions tagged [duckdb]
180 questions
2
votes
1 answer
Does DuckDB create a copy of an R data frame when I register it?
I am trying to learn about using DuckDB in R. From my reading of the docs and what people say online, it sounds as if no copy is made when I register a data frame as a virtual table. Rather, a pointer is created that points to the data frame.
If I…

mac_in_texas
- 21
- 2
2
votes
1 answer
DuckDB - Rank correlation is much slower than regular correlation
The following two code sections differ only in that the second one first computes ranks, yet the second section is much slower than the first (~5x).
Although the second section involves a few more extra…

lebesgue
- 837
- 4
- 13
2
votes
1 answer
SQL/DuckDB: how to calculate Spearman rank correlation by groups?
I want to calculate Spearman (rank) correlation in a groupby context using DuckDB/SQL syntax. I tried the following, but it failed.
import duckdb
import pandas as pd
df = pd.DataFrame(
{
"a": [1, 1, 2, 2, 6, 1, 3, 6, 3],
"b": [4,…

Keptain
- 147
- 7
2
votes
2 answers
Polars is much slower than DuckDB in conditional join + groupby/agg context
The following example involves a conditional self-join followed by a groupby/aggregate operation. In this case, DuckDB performs much better than Polars (~10x on a 32-core machine).
My questions are:
What…

lebesgue
- 837
- 4
- 13
2
votes
1 answer
How to alter data constraint in DuckDB (R)
I am trying to alter a NOT NULL constraint to a NULL constraint in DuckDB (R API) and can't get it to stick. Here is an example of the problem.
drv <- duckdb()
con <- dbConnect(drv)
dbExecute(con, "CREATE TABLE db(a varchar(1) NOT NULL, b varchar(1)…

matto
- 77
- 7
2
votes
1 answer
How to show user schema in a Parquet file using DuckDB?
I am trying to use DuckDB to show the user-created schema that I have written into a Parquet file. I can demonstrate in Python (using the code example at Get schema of parquet file in Python) that the schema is as I desire, but cannot seem to find…

rbmales
- 143
- 1
- 8
2
votes
0 answers
Unsupported result column Struct()[] for DuckDB 0.7.1 from_json
I am trying to load a large set of nested JSON files into a table; each file is a single record and there are ~25k files. However, when I declare the schema, it errors out if a declared data type is a struct. For…

Mitchell Hamann
- 313
- 4
- 18
2
votes
1 answer
How many threads is DuckDB using?
Using DuckDB from within R, e.g.
library(duckdb)
dbname <- "sparsemat.duckdb"
con2 <- dbConnect(duckdb(), dbname)
dbExecute(con2, "PRAGMA memory_limit='1GB';")
how can I find out how many threads the (separate process) is using? I am aware…

Karsten W.
- 17,826
- 11
- 69
- 103
2
votes
1 answer
Querying last row of sorted column where value is less than specific amount from parquet file
I have a large parquet file where the data in one of the columns is sorted. A very simplified example is below.
    X       Y
0   1     Red
1   5    Blue
2   8   Green
3  12  Purple
4  15    Blue
5  17  Purple
I am interested in querying the last value…

jd0
- 23
- 3
2
votes
1 answer
Syntax for Duckdb > Python SQL with Parameter\Variable
I am working on a proof of concept using Python and DuckDB.
I want to use a variable/parameter inside the DuckDB SELECT statement.
For example,
y = 2
dk.query("SELECT * FROM DF WHERE x > y").to_df()
How can y be properly referenced?
I was…

Kent Culpepper
- 21
- 2
2
votes
1 answer
problem with reading partitioned parquet files created by Snowflake with pandas or arrow
ArrowInvalid: Unable to merge: Field X has incompatible types: string vs dictionary
ArrowInvalid: Unable to merge: Field X has incompatible types: decimal vs int32
I am trying to write the result of a…

Ehsan Fathi
- 598
- 5
- 21
2
votes
1 answer
Add columns to a table or records without duplicates in Duckdb
I have the following code:
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler, PatternMatchingEventHandler
import duckdb
path = "landing/persistent/"
global con
con =…

Norhther
- 545
- 3
- 15
- 35
2
votes
1 answer
Does DuckDB support triggers?
I suspect the answer is no, but I just wanted to check if anyone has a way to implement triggers in DuckDB?
I have a SQLite database that relies heavily on views with INSTEAD OF INSERT/ UPDATE/ DELETE triggers to mask the underlying table structure…

David
- 21
- 1
2
votes
1 answer
DuckDB not saving huge database
We are trying to embed DuckDB in our project, but DuckDB doesn't seem to be able to save the database after closing the connection.
Information:
Database size: 16 GB
Amount of tables: 3
I searched for information about data not persisting and found nothing…

xonturis
- 98
- 1
- 5
2
votes
0 answers
How to determine cause of "RuntimeError: Resource temporarily unavailable" error in Python notebook
In a hosted Python notebook, I'm using the duckdb library and running this code:
duckdb.connect(database=":memory:", read_only=False)
This returns the following error sometimes:
Traceback (most recent call last):
File…

JKillian
- 18,061
- 8
- 41
- 74