Issues related to the usage of DuckDB (www.duckdb.org)
Questions tagged [duckdb]
180 questions
1
vote
1 answer
R: DuckDB DBconnect is very slow - Why?
I have a *.csv file containing columns of numbers and strings (13 GB on disk), which I imported into a new duckdb (or sqlite) database and saved so I can access it later in R. But reconnecting duplicates it and is very slow. Is this wrong?
From…

HCAI
- 2,213
- 8
- 33
- 65
1
vote
1 answer
PInvoke struct with nested struct array
I'm trying to PInvoke a method which has a struct parameter with a nested struct array pointer. The C declaration looks like this:
duckdb_state duckdb_query(duckdb_connection connection, const char *query, duckdb_result *out_result);
typedef struct…

Giorgi
- 30,270
- 13
- 89
- 125
0
votes
0 answers
How to view DuckDB database in DBeaver?
I created a DuckDB database in Python and I'm getting:
con = duckdb.connect('../data/proposal.db')
con.sql("SELECT COUNT(*) FROM proposals")
>>>
┌──────────────┐
│ count_star() │
│ int64 │
├──────────────┤
│ 200000…

ruslaniv
- 458
- 1
- 6
- 14
0
votes
1 answer
Not seeing file-level pushdown predicate filtering querying hive-partitioned table in S3
I am using DuckDB in DuckDB-WASM. I am creating a view on top of a hive-partitioned table in S3 with SQL like:
create or replace view my_view as
select
Part1 as part_1
, Part2 as part_2
, Column1 as column_1
, Column2 as column_2
from…

Dude0001
- 3,019
- 2
- 23
- 38
0
votes
0 answers
Write a Dataframe as Parquet file in S3 Bucket with DuckDB-Python API
I have a DuckDB DataFrame with 5 GB of data that I would like to write to an S3 bucket as a Parquet file. I can see the DuckDB commands for this, but I am not able to find the Python API for the same. Any help is appreciated.

Sandeep540
- 897
- 3
- 13
- 38
0
votes
1 answer
correlated subqueries in duckdb
I'm writing this correlated subquery in DuckDB and I can't figure out why it's not working. Can someone explain why? Many thanks!
select
o.__policy,
o.__timestamp,
q.__filename as lastquotefile
from b_n_f b
left join q_n_f q
on b.__policy =…

smaillis
- 298
- 3
- 12
0
votes
2 answers
Unable to access AWS S3 parquet file from AWS Lambda using duckdb
I have a parquet file stored in AWS S3. Assume the location is s3://bucket/file.parquet. I defined a function in AWS Lambda to access this parquet file by using the code below.
import os
def lambda_handler(event, context):
import…

pass-by-ref
- 1,288
- 2
- 12
- 20
0
votes
2 answers
How to count total unique user doing transaction each month
Hi, there is an order_table that contains the following fields:
order_id, user_id, item_id, gmv, order_time.
I have already written a query to find the months from the transactions:
%%sql
SELECT
DISTINCT bulan AS bulan_transaksi
FROM (
SELECT
…

Bin Ski.
- 849
- 1
- 8
- 10
0
votes
1 answer
How to dynamically write a csv in duckdb?
I am running the same analysis across multiple directories with the same file structure. I just change the file_search_path to the right directory and it works great. The issue I'm having is how to dynamically save my csv files with different…

yake84
- 3,004
- 2
- 19
- 35
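Since the question is cut off, the following is only a sketch of one common approach: build the `COPY ... TO` statement as a string per directory, because the target path in `COPY` is written as a string literal. The helper name `copy_to_csv_sql`, the `summary` table, and the paths are hypothetical.

```python
from pathlib import PurePosixPath


def copy_to_csv_sql(query, directory, filename="result.csv"):
    """Return a DuckDB COPY statement that writes `query` to directory/filename."""
    target = str(PurePosixPath(directory) / filename)
    # Double any single quotes so the path stays a valid SQL string literal.
    target = target.replace("'", "''")
    return f"COPY ({query}) TO '{target}' (FORMAT CSV, HEADER)"


# Hypothetical directories that share the same file structure.
statements = [
    copy_to_csv_sql("SELECT * FROM summary", directory)
    for directory in ["/data/run_01", "/data/run_02"]
]

for sql in statements:
    print(sql)
```

Each generated string could then be passed to the connection's `execute` method after pointing `file_search_path` at the matching directory.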
0
votes
1 answer
Writing .parquet from duckdb prefixes column names with "PARGO_PREFIX_"
DuckDB is changing my column names as I write out to .parquet file, and I can't figure out why.
In an in-memory DuckDB instance (on Ubuntu 23.04) I run:
create table mytable (_id int, str varchar, num int);
insert into mytable (_id, str, num)…

Jeff Breadner
- 1,366
- 9
- 19
0
votes
2 answers
Replace white space from column name with underscore in DuckDB Python client API
I have a DuckDB table whose column names have white spaces, and I'd like to just specify a blanket rule that says "for all columns with spaces, replace it with an underscore". I know how to do this by converting the table to a Polars DataFrame, but…

prrao
- 2,656
- 5
- 34
- 39
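One possible way to express such a blanket rule without leaving DuckDB is to generate one `ALTER TABLE ... RENAME COLUMN` statement per affected column. The sketch below only builds the SQL strings; the helper name `rename_statements` and the sample table and column names are hypothetical.

```python
def quote_identifier(name):
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def rename_statements(table, columns):
    """Build one ALTER TABLE statement per column whose name contains a space."""
    return [
        f"ALTER TABLE {quote_identifier(table)} "
        f"RENAME COLUMN {quote_identifier(col)} "
        f"TO {quote_identifier(col.replace(' ', '_'))}"
        for col in columns
        if " " in col
    ]


# Hypothetical column names; "amount" has no space and is left alone.
statements = rename_statements("my_table", ["user id", "amount", "order date"])

for sql in statements:
    print(sql)
```

With an open connection `con`, the column list could come from `con.table("my_table").columns`, and each statement could be run with `con.execute(sql)`.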
0
votes
1 answer
Subquery returning multiple columns in duckDB
I would like to group by first_name and for each first_name get the lowest age.
My query works fine when I run it in an online SQL compiler, but when I try to use DuckDB in Python I get an error saying that I am trying to return multiple columns, but this is exactly what…

Kucharsky
- 201
- 3
- 16
0
votes
1 answer
How can I select or alias a duckdb relation column which has an aggregate function in its column name using the Python-API?
The DuckDB Python API lets you compose complex queries by building them up from chained functions on a relation. For example, to do a group by, one can do a simple select, and then use the aggregate function on the select relation like this:
rel =…

tomanizer
- 851
- 6
- 16
0
votes
1 answer
read_json_auto in DuckDB without involving files
I'm looking for a way to build up a DuckDB table akin to read_json_auto with the following constraints:
Must work in-memory only. I want to avoid having to load a file from FS
Must be cross-platform compatible
Is there a way to do…

Bogey
- 4,926
- 4
- 32
- 57
0
votes
1 answer
Linking two containers in a single task definition in AWS fargate
Hi, I am trying to deploy two containers: one for DuckDB, which gets data from my S3 bucket, and a Streamlit container that displays the frontend with a text box and dashboards on the data collected from S3 (a text box to run some…