Issues related to the usage of DuckDB (www.duckdb.org)
Questions tagged [duckdb]
180 questions
0
votes
1 answer
duckdb - aggregate string with a given separator
The standard aggregator makes coma separated list:
$ SELECT list_string_agg([1, 2, 'sdsd'])
'1,2,sdsd'
How can I make a smicolumn separated list or '/'-separated? Like '1;2;sdsd' or '1/2/sdsd'.

Dmitry Petrov
- 1,490
- 1
- 19
- 34
0
votes
1 answer
Create sequence if not exist in DuckDB
It seems like DuckDB does not support this:
CREATE SEQUENCE seq_personid START 1 IF NOT EXIST;
Postres has this feature https://www.postgresql.org/docs/current/sql-createsequence.html
What would be the best workaround if DuckDB does not support…

Dmitry Petrov
- 1,490
- 1
- 19
- 34
0
votes
1 answer
Best way of bulk inserting a list of records in DuckDB using DuckDB.NET
Context
I'm working with the awesome DuckDB.NET library and C# for
this example. I'm specifically working with the ADO.NET provider.
Problem
My database contains the following table:
CREATE TABLE tbl01 (
Id INTEGER,
TextListTest TEXT[],
…

johngreen283
- 47
- 4
0
votes
1 answer
Open DuckDB database using Python and DBeaver
I created a DuckDB file using the following Python code:
import duckdb
con = duckdb.connect(database=':memory:')
con = duckdb.connect(database='db.duckdb', read_only=False)
con.execute("CREATE TABLE items(item VARCHAR, value DECIMAL(10,2), count…

Ohumeronen
- 1,769
- 2
- 14
- 28
0
votes
2 answers
Is there any easier way to convert a string that looks like a list in DuckDB than regexp_split_to_array or string_split?
I am attempting to query a Pandas Dataframe with DuckDB that I materialize with read_sql_query.
Some of this data is stored in a JSON format and in the target column each value has a list of items - ["Value1", "Value2", "Valueetc"] that from the…

Constantine Kokkinos
- 15
- 5
0
votes
1 answer
Insert into duckdb table using dplyr
I'm working with duckdb in R and would like to insert a record using dplyr. I have included an example below of what I'm trying to achieve.
First, I load my libraries, create a toy dataset, then load it into a database.
# Load…

Dan
- 11,370
- 4
- 43
- 68
0
votes
0 answers
Query to find the average time (in minute) it takes users to make payment of their orders
As seen below, I have a table called trans. I'm looking for the average time (in minutes) it takes users to pay for their orders in the table.
trans
here is the code I tried:
import duckdb
con =…

Duta
- 1
0
votes
1 answer
How should I fix this code with RuntimeError: Catalog Error: Schema with name a does not exist!? (SQL)
I have 2 tables as shown below:
I want to find the Employee's Full name with Grad and Salary which order by City. Here is my code:
import duckdb
con = duckdb.connect(database=':memory:')
con.execute("""
select
a.CONCAT(FirstName , ' '…

Akbar Amelia
- 27
- 6
0
votes
1 answer
Query parquet file with DuckDB throws Runtime Error: "Payload value bigger than allowed. Corrupted file?"
I am using DuckDB to query a parquet file. The query is like:
SELECT * FROM 'yellowcab.parquet' limit 5;
DuckDB throws Runtime Error: Payload value bigger than allowed. Corrupted file?. And it doesn't seem to be an issue in the parquet file because…

Haojin
- 304
- 3
- 11
0
votes
2 answers
create a polars dataframe containing unique values from a set of CSVs
I have +3000 CSVs with +10 columns. What I need is to get all unique values from just two of these. I am able to read unique values in polars:
import polars as pl
df1 = pl.read_csv("test1.biobank.tsv.gz", sep='\t', dtype={"#chrom": pl.Utf8},…

darked89
- 332
- 1
- 2
- 17
0
votes
1 answer
Extract rows with duplicate values in two or more fields but different values in another field - with SQL
This is a follow-up from the thread Extract rows with duplicate values in two or more fields but different values in another field
and the goal is to do the same thing for bigger-than-RAM data with SQL with DuckDB. The dupKeyEx() function code in…

San
- 518
- 5
- 14
0
votes
2 answers
Measure time performance of ResultSet from SQL queries
I have some issues regarding the understanding of "ResultSet". If I want to measure the performance time it take to execute the query, do I need to iterate through the Resultset --> while(rs.next()), since the actual result set includes already has…

dave8s
- 13
- 2
0
votes
0 answers
unexpected error in dbplyr dispatch of tidyr::expand() when using multiple columns
The following example from the docs of tidyr::expand() works as expected on a local, in-memory data.frame. The same example works on a tbl_dbi object when called on a single column, but displays an SQL parsing error when applied to multiple…

cboettig
- 12,377
- 13
- 70
- 113
0
votes
1 answer
Insert Python DuckDB table into SQL statement
I am trying to use a registered virtual table as a table in a SQL statement using a connection to another database. I can't just turn the column into a string and use that, I need the table/dataframe itself to work in the statement and join with…

J. Alexander
- 53
- 1
- 8
0
votes
1 answer
How do I insert Infinity values into REAL/FLOAT4 Columns in DuckDB?
Say I have a table called table_1 with a single column called col_1 that is of type REAL/FLOAT4 in DuckDB. How do I insert special float values like -Infinity, NaN, and Infinity into col_1? I have tried inserting those values many different ways…

Hawk
- 3
- 3