Questions tagged [sqldf]

SQLDF is an R package for running SQL statements on R data frames.

SQLDF is an R package for running SQL statements on R data frames.

The user simply specifies an SQL statement in R using data frame names in place of table names and a database with appropriate table layouts/schema is automatically created, the data frames are automatically loaded into the database, the specified SQL statement is executed, the result is read back into R and the database is deleted all automatically behind the scenes.

sqldf supports:

  • the SQLite backend database (by default),
  • the H2 java database,
  • the PostgreSQL database and
  • sqldf 0.4-0 onwards also supports MySQL.

Resources:

684 questions
4
votes
1 answer

SQL querying dataframes inside list

Given the dataframes df1 <- data.frame(CustomerId=c(1:6),Product=c(rep("Toaster",3),rep("Radio",3))) df2 <- data.frame(CustomerId=c(2,4,6),State=c(rep("Alabama",2),rep("Ohio",1))) are stored in a list dflist <- c(df1,df2) how do I run sqldf…
dmvianna
  • 15,088
  • 18
  • 77
  • 106
3
votes
1 answer

Export a SQLite table to Apache parquet without creating a dataframe

I have multiple huge CSV files that I have to export based on Apache Parquet format and split them into smaller files based on multiple criteria/keys (= column values). As I understand Apache arrow is the R package allowing to work with Apache…
user17911
  • 1,073
  • 1
  • 8
  • 18
3
votes
0 answers

Is there an easy way to pre-filter data into vroom?

In R, I have switched to using vroom due to its speed at reading in large delimited files, but I cannot find a simple way to pre-filter large datasets as I could do with say the sqldf package or through using SQLite and dplyr as described here The…
kam
  • 31
  • 1
3
votes
2 answers

R delete rows from data table using sqldf

I am wondering if R does not support using sqldf to delete rows from a data table. My data looks like this and I am trying to delete from a data table using a delete statement. There is no underlying database just a data.table. But hwen I enter the…
Jim Ryan
  • 129
  • 2
  • 11
3
votes
4 answers

How to get sum of column from sqldf output in R?

I would like to sum a single column of data that was output from an sqldf function in R. I have a csv. file that contains groupings of sites with a uniqueID and their associated areas. For example: occurrenceID …
Zsimek
  • 33
  • 5
3
votes
2 answers

Using sqldf with r variable containing underscore in its name

This code > A <- data.frame(col1 = c(1,2,3),col2 = c("red","blue","green")) > color_num <- 2 > fn$sqldf("select * from A where col1 >= '$color_num'") yields the error Error in eval(parse(text = paste(..., sep = "")), env) : object 'color' not…
Rookatu
  • 1,487
  • 3
  • 21
  • 50
3
votes
2 answers

R - sqldf() returning dataframe with zero rows

I am trying an efficient means of reading in select data from very large csv files using the method described here by @JD_Long. The method relies on sqldf() a function from a library by the same name. The problem I am having is that the function…
Conner M.
  • 1,954
  • 3
  • 19
  • 29
3
votes
5 answers

Select entire group if one selection has the given value

datetime label option_title option_value lead difference 1 2016-07-22 GE 3 - Commercial Review 3 2 -1 2 2017-02-20 …
rfortin
  • 184
  • 8
3
votes
1 answer

Organize data with maximum and minimum values in r

I have a table like this: which is generated by following code: id <- c("1","2","1","2","1","1") status <- c("open","open","closed","closed","open","closed") date <- c("11-10-2017 15:10","10-10-2017 12:10","12-10-2017 22:10","13-10-2017…
Meilun HE
  • 31
  • 3
3
votes
2 answers

r sqldf escape double quotes

I wish to run the following sql statement within sqldf(): select columnA, "new_column_value" as columnB, "column.C" from mytable where columnA in ('123','456') but when I run within sqldf : sqldf('select columnA, "new_column_value" as columnB,…
brucezepplin
  • 9,202
  • 26
  • 76
  • 129
3
votes
2 answers

Efficient cross join with aggregation and filter

As per the title, I am looking to do a cross join with a table which performs an aggregation function and filters on a couple of variables within the table. I have similar data to the…
SlyGrogger
  • 317
  • 5
  • 16
3
votes
1 answer

r language - sqldf package not seeing any of my datafiles

I have done a fresh install of package sqldf on my system, but whenever i run any sql queries I get Error in rsqlite_send_query(conn@ptr, statement) : no such table regardless of the name or table I have looked through all guides on setting…
3
votes
1 answer

Extract all rows containing first value for each unique value of another column

I am looking for something similar to this Select only the first rows for each unique value of a column in R but I need to keep ALL rows containing the first values of year per ID. In ither words, I need to subset the dataset on the first year…
Lisarv
  • 87
  • 1
  • 9
3
votes
4 answers

Count how many orders were shared between customers

I have a table that has two columns Order | CustomerID 1. A | C1 2. B | C1 3. C | C1 4. D | C2 5. B | C3 6. C | C3 7. D | C4 Its a long table. I want an output that shows C1 | C3 | 2 #Customer C1 and Customer C3 share 2 orders (i.e.…
user2162611
  • 146
  • 3
  • 13
3
votes
3 answers

Select max amount spent in single order

I am very new to R and sqldf and can't seem to solve a basic problem. I have a file with transactions where each row represents a product purchased. The file looks like this: customer_id,order_number,order_date, amount, product_name 1, 202,…