Questions tagged [sqldf]

SQLDF is an R package for running SQL statements on R data frames.

SQLDF is an R package for running SQL statements on R data frames.

The user simply specifies an SQL statement in R using data frame names in place of table names and a database with appropriate table layouts/schema is automatically created, the data frames are automatically loaded into the database, the specified SQL statement is executed, the result is read back into R and the database is deleted all automatically behind the scenes.

sqldf supports:

  • the SQLite backend database (by default),
  • the H2 java database,
  • the PostgreSQL database and
  • sqldf 0.4-0 onwards also supports MySQL.

Resources:

684 questions
2
votes
1 answer

case when in sqldf / R

If I have DF A B Col 1 1 A 2 2 B 1 2 C 2 1 D 1 3 E 2 3 F I try to use sqldf as follows Test <- sqldf(" SELECT A, case when A = '1' and B = '1' then Col else…
Rtab
  • 123
  • 10
2
votes
3 answers

How to query a dataframe using a column of other dataframe in R

I have 2 dataframes in R and I want to do a query using the dataframe "y" like parameter to dataframe "x". I have this code: x <- c('The book is on the table','I hear birds outside','The electricity came back') x <- data.frame(x) colnames(x) <-…
Robson Brandão
  • 186
  • 2
  • 9
2
votes
2 answers

Make new feature using 2 tables

table1 <- data.frame(user_id=c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2), product_id = c(14, 24, 38, 40, 66, 2, 19, 30, 71, 98, 7, 16), first_order = c(1, 2, 1, 4, 5, 3, 2, 4, 2, 4, 2, 3), last_order =…
Ashtray K
  • 145
  • 10
2
votes
3 answers

Loading data with RSQLite which has quoted values

I am trying to load a large-ish csv file into a SQL lite database using the RSQLite package (I have also tried the sqldf package). The file contains all UK postcodes and a variety of lookup values for them. I wanted to avoid loading it into R and…
Tumbledown
  • 1,887
  • 5
  • 21
  • 33
2
votes
2 answers

SQLite removes ties in group by

I am expecting the following to yield more than 3 rows, since there are ties of min(a.[Sepal.Width]) within each of a.[Species], however only 3 rows are returned: sqldf(' select a.[Species], min( a.[Petal.Width]) from iris a group by a.[Species]…
Rasmus Larsen
  • 5,721
  • 8
  • 47
  • 79
2
votes
1 answer

order by factor in sqldf

As far as I can see, it is not possible to order by a factor in sqldf: levels( iris$Species ) <- c("virginica", "versicolor", "setosa") levels(iris$Species) > sqldf(' select distinct iris.[Species] from iris order by iris.[Species] ' ) …
Rasmus Larsen
  • 5,721
  • 8
  • 47
  • 79
2
votes
1 answer

R sqldf not recognizing posixct

I need to filter data by date using sqldf package. My table, "episodes" has a field "created_at, which class is POSIXct. episodes<-data.frame(created_at=seq(from = as.POSIXct('2016-10-01 01:00:00',tz="GMT"), length.out = 100, by = "days") ) >…
GabyLP
  • 3,649
  • 7
  • 45
  • 66
2
votes
2 answers

How to select a specific column with sqldf if no column name is given

I have a large file (data.txt, 35 GB) which has 3 columns. Some example part of the file would look like the following: ... ... ... 5 701565 8679.56 8 1.16201e+006 3193.18 1 1.16173e+006 4457.85 14 1.16173e+006 4457.85 9 …
Fabi
  • 71
  • 1
  • 8
2
votes
1 answer

How to fill in missing dates in range by group

I have a data.frame of groups and dates. How do I fill in all the missing dates in the min-max date range for each group? Ideally I would do this in dplyr. But ultimately, I'd just like to do this efficiently with as few lines of (readable) code as…
lowndrul
  • 3,715
  • 7
  • 36
  • 54
2
votes
3 answers

Updating data.table by inserting new rows that are different from old rows

I have two data.table(dt1 & dt2). dt1 is past product data and dt2 is present product data. I want to create a third data.table that inserts new rows from dt2 into dt1 only when product characteristics(Level or Color) are different or Product itself…
gibbz00
  • 1,947
  • 1
  • 19
  • 31
2
votes
1 answer

Weird sql statement in sqldf and R code

I have to maintain some R code and the guy that wrote it isn't here anymore. I've found some weird sql statements in his code and i am not sure what they mean. This is one of them: sqldf(paste("select ", i," i , * from simTypeFile union all select *…
Subimago
  • 63
  • 5
2
votes
0 answers

How to read in csv files with quotation marks using read.csv.sql

When converting data to csv Excel normally encloses categorical variables which contain commas in quotation marks. For example: \"MÉ******, ****¿. When using the read.csv.sql function, I'm having huge issues with handling the commas, and the…
2
votes
1 answer

R sqldf renaming a field in a select statement

Edited the question to be clearer as requested. Note that the inputs are provided in reproducible form in the Note at the end. I am using sqldf to join two datasets in R The code returns a dataset with the with the original column names, ignoring…
Wael Hussein
  • 135
  • 1
  • 12
2
votes
2 answers

Joining multiple tables with mix data

My data: data1 <- data.frame(from = c(1, 2, 13, 4), to = c(4, 3, 9, 1), values = c(12, 56, 67, 78)) data2 <- data.frame(place = c("NY", "London", "Brest", "Nantes"), id = c(1, 2, 3, 4))…
Wilcar
  • 2,349
  • 2
  • 21
  • 48
2
votes
0 answers

sqldf corrupts utf-8 Values

Running a query using sqldf outputs garbled characters when the query results include characters such as: é,š. When I checked the encoding of those fields in the original df, they are specifically UTF-8, but after the query with sqldf, they become…
dm-guy
  • 566
  • 3
  • 10