Questions tagged [sqldf]

SQLDF is an R package for running SQL statements on R data frames.

SQLDF is an R package for running SQL statements on R data frames.

The user simply specifies an SQL statement in R using data frame names in place of table names and a database with appropriate table layouts/schema is automatically created, the data frames are automatically loaded into the database, the specified SQL statement is executed, the result is read back into R and the database is deleted all automatically behind the scenes.

sqldf supports:

  • the SQLite backend database (by default),
  • the H2 java database,
  • the PostgreSQL database and
  • sqldf 0.4-0 onwards also supports MySQL.

Resources:

684 questions
3
votes
1 answer

Speeding up sqldf in R

I have a program in R that i have run for about a day now and its only reached about 10 percent completion. The main source of slowness comes from having to make thousands of sqldf(SELECT ...) calls from a data set of length ~ 1 million using the R…
mt88
  • 2,855
  • 8
  • 24
  • 42
3
votes
1 answer

Using INSERT statement in sqldf within R

I am trying to use sqldf to insert rows into an existing (but empty) dataframe and I am getting a strange error. Here is my code: set.seed(1) library(sqldf) # define empty dataframe d1 = data.frame(min_x=numeric(), max_x=numeric()) # create a…
Karthik
  • 719
  • 2
  • 11
  • 18
3
votes
2 answers

LEFT OUTER JOIN in sqldf with WHERE and HAVING clauses

I have two data frames in R and I would like to conditionally merge them on id and day. The merge is that the right variables merged to the left variables be as new/fresh/recent as possible, but must be at least three days old. But, if there isn't a…
Richard Herron
  • 9,760
  • 12
  • 69
  • 116
3
votes
1 answer

sqldf and maintainability of R code base

If one is building a substantial, organization-wide code base in R, is it acceptable practice to rely on the sqldf package as the default approach for data munging tasks? Or is best practice to rely on operations with R specific syntax where…
Ben Rollert
  • 1,564
  • 1
  • 13
  • 21
3
votes
0 answers

row.names in read.csv vs read.csv.sql (package sqldf)

the description for the row.names argument in read.csv.sql simply says "As in ‘read.csv’" However when I try to read in a simple csv file with the first column as the row names the behavior in read.csv.sql is not what I expect. d <-…
3
votes
1 answer

DATEPART() in sqldf

Is it possible to use the SQL command typical to SQL Server 2008 DATEPART() in sqldf? I am scanning the documentation but not finding anything related to it, I am unfamiliar with SQLite, so if I should go that way and read then I will I want to do…
MCP_infiltrator
  • 3,961
  • 10
  • 45
  • 82
3
votes
1 answer

Query on multiple tables using dbGetQuery of RMySQL package

With the help of dbConnect, multiple connections were established with SQL DBs (say, DB1 and DB2). How can I write a query that involves tables from DB1 and DB2? Does dbGetQuery allow querying one only one DB? Can sqldf package be leveraged after…
Anindita
  • 31
  • 3
3
votes
3 answers

Check that connection is valid

I'm using RPostgreSQL and sqldf inside my function like this: MyFunction <- function(Connection) { options(sqldf.RPostgreSQL.user = Connection[1], sqldf.RPostgreSQL.password = Connection[2], …
Tomas Greif
  • 21,685
  • 23
  • 106
  • 155
3
votes
3 answers

How to handle column names not supported by sqldf in R

I've a data frame where some of the column names are of the format . format. For ex: Company.1 when i'm using that column in a sqldf function it throws an error data=sqldf(select Company.1 from test) Error in sqliteExecStatement(con, statement,…
Prasun Velayudhan
  • 537
  • 1
  • 7
  • 19
3
votes
4 answers

How can I perform full outer joins of large data sets in R?

I am trying to do data analysis in R on a group of medium sized datasets. One of the analyses I need to do requires me to do a full outer join amongst around 24-48 files, each of with has about 60 columns and up to 450,000 lines. So I've been…
Drew75
  • 277
  • 1
  • 3
  • 11
3
votes
3 answers

Unable to install sqldf on Linux

I'm running R version 2.14.1 on Linux. When I try to install sqldf with install.packages(sqldf, dependencies=TRUE) I get the following errors: (these errors result in running the command from the terminal as with the use of the Rcmdr UI) open log…
Max van der Heijden
  • 1,095
  • 1
  • 8
  • 16
3
votes
0 answers

R/sqldf/mclapply, How can I use sqldf and mclapply together?

Hi I am trying to use sqldf to fetch data from my database. Since sqldf will always load tcltk, I can not use mclapply function. How can I do with that? Thanks. Here is an example. options(gsubfn.engine =…
user1589
  • 151
  • 1
  • 5
2
votes
2 answers

stating quote as "quote" in sqldf

I have a csv file which sep="\t", quote=TRUE, so the data is quoted like "2011-01-11" I used the following script to import the csv file into R temp <- sqldf("select * from dummy limit 10",file.format=list(header=TRUE,sep="\t",quote="\"")) But it…
lokheart
  • 23,743
  • 39
  • 98
  • 169
2
votes
2 answers

How can I rbind with missing dataframes?

I'm doing campaign finance data and am searching for employees of x corporation who have donated x amount of money between 1989 and 2010. Using sqldf I was able to parse out this information. The data is >4gb where my ram is <2gb so I have broken…
Tyler
  • 1,050
  • 2
  • 14
  • 24
2
votes
1 answer

Does the R sqldf() function support a way to show the columns of a table (i.e. describe)

Does the R function sqldf() support a way to show table columns similar to what the sql command "desc my_table" would do. It does not seem to support "desc my_table" or "describe my_table" and I'm not seeing a way to do this in the documentation at…
John
  • 3,458
  • 4
  • 33
  • 54