Questions tagged [sqldf]

SQLDF is an R package for running SQL statements on R data frames.

SQLDF is an R package for running SQL statements on R data frames.

The user simply specifies an SQL statement in R using data frame names in place of table names and a database with appropriate table layouts/schema is automatically created, the data frames are automatically loaded into the database, the specified SQL statement is executed, the result is read back into R and the database is deleted all automatically behind the scenes.

sqldf supports:

  • the SQLite backend database (by default),
  • the H2 java database,
  • the PostgreSQL database and
  • sqldf 0.4-0 onwards also supports MySQL.

Resources:

684 questions
7
votes
2 answers

Getting no such table error using pandas and sqldf

I am getting a sqlite3 error. OperationalError: no such table: Bills I first call my dataframes using pandas and then call those dataframes in my query which works fine import pandas as pd from pandasql import sqldf Bills =…
David Sung
  • 519
  • 1
  • 6
  • 14
7
votes
5 answers

Skip metadata when Importing dataset in R

My question involves how to skip metadata in the beginning of a file when importing data into R. My data is in .txt format where the first lines are metadata describing the data and these need to be filtered out. Below is a minimal example of the…
amo
  • 3,030
  • 4
  • 25
  • 42
7
votes
1 answer

sqldf: query data by range of dates

I am reading from a huge text file that has '%d/%m/%Y' date format. I want to use read.csv.sql of sqldf to read and filter the data by date at the same time. This is to save memory usage and run time by skipping many dates that I am not interested…
biocyberman
  • 5,675
  • 8
  • 38
  • 50
7
votes
2 answers

Merging two dataframes on a date range in R

In R I want to merge two dataframes on a range of dates, taking all rows from the second dataframe which fall on and between two columns of dates from the first dataframe. I couldn't find a strictly R function or version of the merge function that…
Misc
  • 645
  • 1
  • 7
  • 20
7
votes
1 answer

Call R functions from sqldf queries

Is there a way to call R functions from sqldf queries? E.g. sqldf("select paste('Hello', 'World')") Or, is there a way to define custom functions or stored procedures within the SQLite engine behind sqldf? (I am using sqldf with plain old…
Michael Malak
  • 628
  • 8
  • 19
7
votes
1 answer

Reading huge csv files into R with sqldf works but sqlite file takes twice the space it should and needs "vacuuming"

Reading around, I found out that the best way to read a larger-than-memory csv file is to use read.csv.sql from package sqldf. This function will read the data directly into a sqlite database, and consequently execute a sql statement. I noticed the…
nassimhddd
  • 8,340
  • 1
  • 29
  • 44
7
votes
3 answers

Fast way to select rows within table in R?

I am looking for a fast way to extract a large number of rows from an even larger table. The top of my table is as follows: > head(dbsnp) snp gene distance rs5 rs5 KRIT1 1 rs6 rs6 CYP51A1 1 rs7 rs7 LOC401387 …
Gordon Freeman
  • 125
  • 2
  • 5
6
votes
1 answer

sqldf can't find the data with error "no such table"

I've been using sqldf in my R-scripts until now when I got the following error: library(sqldf) data(mtcars) out <- sqldf("SELECT * FROM mtcars") > Error in rsqlite_send_query(conn@ptr, statement) : no such table: mtcars This hasn't been a problem…
user134489
  • 355
  • 1
  • 3
  • 10
6
votes
0 answers

Getting warning messages in R after updating sqldf

I just updated all R packages including sqldf today. Then I kept getting warning message as follows: Warning message: Quoted identifiers should have class SQL, use DBI::SQL() if the caller performs the quoting. What is the problem here? Is this…
LaTeXFan
  • 1,136
  • 4
  • 14
  • 36
6
votes
2 answers

sqldf : create table from data frame error: "no such table". and two tables created instead of one

I've recently upgraded R, RSQLite, and sqldf (versions below). Normally: sqldf('create table foo as select * from bar', db = 'test.db') should create a table called 'foo' in an attached sqlite database, using data frame 'bar' if it exists to load…
Nathan Siemers
  • 423
  • 4
  • 8
6
votes
3 answers

parsing quotes out of "NA" strings

My dataframe has some variables that contain missing values as strings like "NA". What is the most efficient way to parse all columns in a dataframe that contain these and convert them into real NAs that are catched by functions like is.na()? I am…
jgozal
  • 1,480
  • 6
  • 22
  • 43
6
votes
1 answer

Unable to use rank() over functions in R using sqldf

arm<-as.data.frame(matrix(c(1,1,1,2,2,6,7,4,9,10),ncol=2)) colnames(arm)<-c("a","b") This is a dataset I created in R. Now I want to rank the column b and group by column a. The following piece of code is throwing this error, no matter what…
Bill Stoyanov
  • 79
  • 1
  • 3
6
votes
1 answer

Apply CASE WHEN in sqldf statement for manipulating multiple columns

I have a dataframe datwe with 37 columns. I am interested in converting the integer values(1,2,99) in columns 23 to 35 to character values('Yes','No','NA'). datwe$COL23 <- sqldf("SELECT CASE COL23 WHEN 1 THEN 'Yes' …
Prradep
  • 5,506
  • 5
  • 43
  • 84
6
votes
2 answers

R intersect data.frame on multiple criteria

I am trying to populate a binary vector based on the intersection of two data.frames on multiple criteria. I have the code working but I feel that it is memory excessive just to get the binary vector. When I apply my code to my full data (40mm+…
Brad
  • 813
  • 1
  • 10
  • 20
6
votes
2 answers

Difference between PROC SQL and sqldf

I am in the midst of re-writing a SAS program I normally use into R. Many statements in the SAS program are written with proc sql, so for ease of programming I am replicating those statements with sqldf in R. All of the sql code works the same with…
user1445246
  • 303
  • 4
  • 14
1
2
3
45 46