Questions tagged [duckdb]

Issues related to the usage of DuckDB (www.duckdb.org)

180 questions
0
votes
1 answer

duckdb - aggregate string with a given separator

The standard aggregator makes coma separated list: $ SELECT list_string_agg([1, 2, 'sdsd']) '1,2,sdsd' How can I make a smicolumn separated list or '/'-separated? Like '1;2;sdsd' or '1/2/sdsd'.
Dmitry Petrov
  • 1,490
  • 1
  • 19
  • 34
0
votes
1 answer

Create sequence if not exist in DuckDB

It seems like DuckDB does not support this: CREATE SEQUENCE seq_personid START 1 IF NOT EXIST; Postres has this feature https://www.postgresql.org/docs/current/sql-createsequence.html What would be the best workaround if DuckDB does not support…
Dmitry Petrov
  • 1,490
  • 1
  • 19
  • 34
0
votes
1 answer

Best way of bulk inserting a list of records in DuckDB using DuckDB.NET

Context I'm working with the awesome DuckDB.NET library and C# for this example. I'm specifically working with the ADO.NET provider. Problem My database contains the following table: CREATE TABLE tbl01 ( Id INTEGER, TextListTest TEXT[], …
0
votes
1 answer

Open DuckDB database using Python and DBeaver

I created a DuckDB file using the following Python code: import duckdb con = duckdb.connect(database=':memory:') con = duckdb.connect(database='db.duckdb', read_only=False) con.execute("CREATE TABLE items(item VARCHAR, value DECIMAL(10,2), count…
Ohumeronen
  • 1,769
  • 2
  • 14
  • 28
0
votes
2 answers

Is there any easier way to convert a string that looks like a list in DuckDB than regexp_split_to_array or string_split?

I am attempting to query a Pandas Dataframe with DuckDB that I materialize with read_sql_query. Some of this data is stored in a JSON format and in the target column each value has a list of items - ["Value1", "Value2", "Valueetc"] that from the…
0
votes
1 answer

Insert into duckdb table using dplyr

I'm working with duckdb in R and would like to insert a record using dplyr. I have included an example below of what I'm trying to achieve. First, I load my libraries, create a toy dataset, then load it into a database. # Load…
Dan
  • 11,370
  • 4
  • 43
  • 68
0
votes
0 answers

Query to find the average time (in minute) it takes users to make payment of their orders

As seen below, I have a table called trans. I'm looking for the average time (in minutes) it takes users to pay for their orders in the table. trans here is the code I tried: import duckdb con =…
Duta
  • 1
0
votes
1 answer

How should I fix this code with RuntimeError: Catalog Error: Schema with name a does not exist!? (SQL)

I have 2 tables as shown below: I want to find the Employee's Full name with Grad and Salary which order by City. Here is my code: import duckdb con = duckdb.connect(database=':memory:') con.execute(""" select a.CONCAT(FirstName , ' '…
0
votes
1 answer

Query parquet file with DuckDB throws Runtime Error: "Payload value bigger than allowed. Corrupted file?"

I am using DuckDB to query a parquet file. The query is like: SELECT * FROM 'yellowcab.parquet' limit 5; DuckDB throws Runtime Error: Payload value bigger than allowed. Corrupted file?. And it doesn't seem to be an issue in the parquet file because…
Haojin
  • 304
  • 3
  • 11
0
votes
2 answers

create a polars dataframe containing unique values from a set of CSVs

I have +3000 CSVs with +10 columns. What I need is to get all unique values from just two of these. I am able to read unique values in polars: import polars as pl df1 = pl.read_csv("test1.biobank.tsv.gz", sep='\t', dtype={"#chrom": pl.Utf8},…
darked89
  • 332
  • 1
  • 2
  • 17
0
votes
1 answer

Extract rows with duplicate values in two or more fields but different values in another field - with SQL

This is a follow-up from the thread Extract rows with duplicate values in two or more fields but different values in another field and the goal is to do the same thing for bigger-than-RAM data with SQL with DuckDB. The dupKeyEx() function code in…
San
  • 518
  • 5
  • 14
0
votes
2 answers

Measure time performance of ResultSet from SQL queries

I have some issues regarding the understanding of "ResultSet". If I want to measure the performance time it take to execute the query, do I need to iterate through the Resultset --> while(rs.next()), since the actual result set includes already has…
dave8s
  • 13
  • 2
0
votes
0 answers

unexpected error in dbplyr dispatch of tidyr::expand() when using multiple columns

The following example from the docs of tidyr::expand() works as expected on a local, in-memory data.frame. The same example works on a tbl_dbi object when called on a single column, but displays an SQL parsing error when applied to multiple…
cboettig
  • 12,377
  • 13
  • 70
  • 113
0
votes
1 answer

Insert Python DuckDB table into SQL statement

I am trying to use a registered virtual table as a table in a SQL statement using a connection to another database. I can't just turn the column into a string and use that, I need the table/dataframe itself to work in the statement and join with…
J. Alexander
  • 53
  • 1
  • 8
0
votes
1 answer

How do I insert Infinity values into REAL/FLOAT4 Columns in DuckDB?

Say I have a table called table_1 with a single column called col_1 that is of type REAL/FLOAT4 in DuckDB. How do I insert special float values like -Infinity, NaN, and Infinity into col_1? I have tried inserting those values many different ways…
Hawk
  • 3
  • 3
1 2 3
11
12