Questions tagged [sqldf]

SQLDF is an R package for running SQL statements on R data frames.

SQLDF is an R package for running SQL statements on R data frames.

The user simply specifies an SQL statement in R using data frame names in place of table names and a database with appropriate table layouts/schema is automatically created, the data frames are automatically loaded into the database, the specified SQL statement is executed, the result is read back into R and the database is deleted all automatically behind the scenes.

sqldf supports:

  • the SQLite backend database (by default),
  • the H2 java database,
  • the PostgreSQL database and
  • sqldf 0.4-0 onwards also supports MySQL.

Resources:

684 questions
2
votes
2 answers

Any other process faster than sqldf in r

I have 2 dataframes df1 has 700,000 datapoints ID1: Categorical Variable has 400+ categories Date1: 1 year data MST1: 2 level variable 1/2 Confirmation Number: unique Identifier sample data: ID1 ExtractDate1 MktSeg1 ConfirmationNo 145 …
Toros91
  • 303
  • 2
  • 11
2
votes
0 answers

check Primary key in sqldf package

While creating a database using the sqldf package, I created a table named Company Master. I want to link it to different tables but before doing that i wanted to check if the format is correct it or not. In MySQL it tells immediately if the format…
mrigank shekhar
  • 544
  • 3
  • 15
2
votes
1 answer

Strange warning message: In FUN(X[[i]], ...) : NAs introduced by coercion

I found a strange behaviour of sqldf - it issues "NAs introduced by coercion" warning when a variable exists with the same name as a column and different data type. The following code: x <- structure(list(euring = c(12380, 12430), species =…
Tomas
  • 57,621
  • 49
  • 238
  • 373
2
votes
0 answers

Warning: Error in aggregate.data.frame: arguments must have same length Stack trace (innermost first)

I have an r assignment. I am making an interactive plot for a sports team and when I try to calculate the average of the scored goals I keep getting the same error (Error in aggregate.data.frame: arguments must have same length Stack trace…
John
  • 21
  • 6
2
votes
2 answers

RSQLite Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : namespace ‘DBI’ 0.4-1 is being loaded, but >= 0.8 is required

When I load sqldf package, I got error message below. How can I fix this? install.packages("sqldf", dep = TRUE) library(sqldf) RSQLite Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : namespace ‘DBI’ 0.4-1 is being…
melik
  • 1,268
  • 3
  • 21
  • 42
2
votes
2 answers

generate sequence (and starting over in case of a recurrence) and add new column with highest number per sequence, within group, in R

I am looking for a way to generate a sequence for a column with names of cities grouped by an ID. What is crucial is that when a name of a city is repeated (within the group) a new sequence has to start. A new sequence should also start in case of a…
Thijs
  • 21
  • 3
2
votes
3 answers

filter by date range in sqldf

I'm trying to filter a dataframe based on a date range using sqldf, like the example code below. I have data like the example data below. The datedf dataframe that gets returned by sqldf has no records. There are records in the SHV dataframe in…
modLmakur
  • 531
  • 2
  • 8
  • 24
2
votes
1 answer

Get max for each row between two dates?

learning sql here and I ran into a challenge. I have the following table: tbl <- data.frame( id_name = c("a", "a", "b", "c", "d", "f", "b", "c", "d", "f"), value = c(1, -1, 1, 1, 1, 1, -1, -1, -1, -1), score = c(1, 0, 1, 2, 3, 4, 3, 2, 1,…
2
votes
2 answers

Putting sqldf inside r function

is there any way to put sqldf queries inside user defined functions? I have gone through this : http://r.789695.n4.nabble.com/Passing-Multiple-Variable-Into-SQLDF-Statement-as-parameters-of-function-td4636147.html , R call variable inside sqldf.…
Sayak
  • 183
  • 1
  • 11
2
votes
1 answer

R sqldf - index inside / outside a loop

I'm using the sqldf function to repeatedly join a subset of a table against itself. The repeat process is happening inside a for loop. I've read that adding an index can improve performance of these joins here. My question is - if I'm repeatedly…
screechOwl
  • 27,310
  • 61
  • 158
  • 267
2
votes
2 answers

Retrieve all rows with same minimum value for a column with sqldf

I have to retrieve IDs for employees who have completed the minimum number of jobs. There are multiple employees who have completed 1 job. My current sqldf query retrieves only 1 row of data, while there are multiple employee IDs who have completed…
pyeR_biz
  • 986
  • 12
  • 36
2
votes
2 answers

sqldf and POSIXct

Having issues wit sqldf and & POSIXct. Will appreciate any help Sys.setenv(TZ = "America/Los_Angeles") user <- read.csv("user_ori.csv",stringsAsFactors=FALSE) user$created_on <- as.POSIXct(user$created_on,format="%Y-%m-%d…
JohnGagliano
  • 54
  • 1
  • 6
2
votes
3 answers

Dplyr Filter Multiple Like Conditions

I am trying to do a filter in dplyr where a column is like certain observations. I can use sqldf as Test <- sqldf("select * from database Where SOURCE LIKE '%ALPHA%' OR SOURCE LIKE '%BETA%' OR SOURCE LIKE…
Drthm1456
  • 409
  • 9
  • 17
2
votes
1 answer

Why two numeric divide, the result is zero? using SQLDF in R

I am using SQLDF in R. I count two different values from column 2 and column 3. I want to calculate column 2 / column 3, but the result is always zero. Why? Here is my code: prouterInboundin4hours<-sqldf( " select `1st Segment Delivered Success…
Shuo Liu
  • 53
  • 2
  • 7
2
votes
0 answers

Using R, Error in rsqlite_send_query(conn@ptr, statement) : too many SQL variables

I uses sqldfpackage to make SQLite database, my matrix dimension is 2880x1951. I write the table on the SQLite database, unfortunately it asks Error in rsqlite_send_query(conn@ptr, statement) : too many SQL variables. I read from SQlite website if…
Honstel
  • 29
  • 4