Questions tagged [sqldf]

SQLDF is an R package for running SQL statements on R data frames.

SQLDF is an R package for running SQL statements on R data frames.

The user simply specifies an SQL statement in R using data frame names in place of table names and a database with appropriate table layouts/schema is automatically created, the data frames are automatically loaded into the database, the specified SQL statement is executed, the result is read back into R and the database is deleted all automatically behind the scenes.

sqldf supports:

  • the SQLite backend database (by default),
  • the H2 java database,
  • the PostgreSQL database and
  • sqldf 0.4-0 onwards also supports MySQL.

Resources:

684 questions
0
votes
2 answers

Multiple cumulative sums

Hopefully the title is explicit enough. I have a table looking like that : classes id value a 1 10 a 2 15 a 3 12 b 1 5 b 2 9 b 3 7 c 1 6 c 2 14 c 3 6 and here is what I would like…
François M.
  • 4,027
  • 11
  • 30
  • 81
0
votes
1 answer

Can't import sqldf in one python file while it works in another. Both files are in different folders. How can I set this right?

In one python file when I try from pandasql import sqldf it works. The path for that is C:\Users\AmitSingh\Anaconda2\python.exe "C:/Users/AmitSingh/PycharmProjects/HelloPython.py/exercise 2" In another file when I use the same command it gives me…
Amit Singh Parihar
  • 527
  • 3
  • 14
  • 23
0
votes
0 answers

iGraph Weighted Network Plot

I haven't found a question that covers my situation exactly. I have a list of web traffic nodes that I want to plot and keep the plot readable. Of the thousands of transactions in a day there are only about 152 different page couplets. I want to…
TimL
  • 49
  • 5
0
votes
1 answer

I want to use a loop with the R package sqldf

Hi I have an array in R called "Puestos" and I want to pass all the values in the array into a sqldf query, I tried this: library(sqldf) for (i in Puestos){ statement = sprintf("select mes, puesto, sum(numero) as numero,sum(faltas) as…
Angel
  • 23
  • 4
0
votes
2 answers

keeping leading zeros with sqldf

I am a total SQL ignoramus so I apologize if this is very simple.. I have data that contains an ID column consisting of numbers, and in many cases contains leading zeros. I would like to import the data using sqldf, but in doing so I lose the…
statsNoob
  • 1,325
  • 5
  • 18
  • 36
0
votes
1 answer

Using sqldf on data with footer

I have some data that looks something like this: myDat <- data.frame(V1=rnorm(1000), V2=sample(LETTERS, 1000, replace=TRUE), V3=sample(1:1000, 1000)) write.table(myDat, file="myData.txt", sep="|",…
statsNoob
  • 1,325
  • 5
  • 18
  • 36
0
votes
3 answers

Issue with group by statement in sqldf

I have a dataset with two columns(character) - text, source , I want to create a dataset such that if a particular value in text column is present less than 5 times. I am using below code but it is giving only first occurrence instead I need all the…
0
votes
1 answer

Regarding sqldf package/regexp function

I am using sqldf package and sql analyze one table generated by a classification model. But when I use the code: table<-sqldf(" SELECT a, b, c, d, e, f, CASE WHEN (REGEXP_LIKE(t, '\b(2nd time|3rd time|4th time)\b')) = TRUE THEN 1 ELSE 0 END AS…
WayToNinja
  • 285
  • 4
  • 14
0
votes
1 answer

How to use row_number and partition function in sqldf

Update I can run below sql query in netezza database, but it goes wrong in sqldf package in R > sqldf("SELECT TEXT, + VEH_MAKE_NM, + NEW_USED_CD, + PRODUCT, + OVERALL_SUBV_IND, + AS_OF_DATE, + CATEGORY, + ROW_NUMBER() OVER(PARTITION BY TEXT,…
WayToNinja
  • 285
  • 4
  • 14
0
votes
2 answers

Scraping or processing results of SQL API calls

I ma trying to download data and do some analysis from the following Open Data webpage http://data.ci.newark.nj.us/dataset/new-jersey-education-indicators/resource/d7b23f97-cba5-4c15-997c-37a696395d66 They have given some examples like this query…
BigDataScientist
  • 1,045
  • 5
  • 17
  • 37
0
votes
1 answer

Problems with sqldf: cannot select on date

The problem is similar with this one (R, issue with sqldf: cannot make condition on date) but with different error term: Suppose I have a table: OrderDate Sales 2005-02-28 12 2005-02-28 234 ... In the original R DataFrame, the data type of…
Surah Li
  • 573
  • 1
  • 4
  • 6
0
votes
1 answer

R sqldf package Month function

I'm trying to use native sql functions within sqldf such as month. I am not sure why this doesn't work. There is monthly data in the dataframe which I am pulling from so I don't understand. If I subset it in dbGetQuery, then it works since it is…
0
votes
3 answers

Count of a value in a new column in R

I have a dataframe with Transaction ID and Product Name as columns. I'm trying to create a 3rd column which gives me the count of Transaction ID's. The final dataframe should look as shown below. TID Product Orders 100 …
red
  • 53
  • 5
0
votes
2 answers

Return rows with both NA and a value at once

I have a df ID <- c(101,102,103,104) Status <- c('P','F_Avg','F_Sig',NA) df <- data.frame(ID,Status) I am trying to filter the failed ones and return both Pass and NA but I am not able to do so. I know it's a basic question but please bear with…
Sharath
  • 2,225
  • 3
  • 24
  • 37
0
votes
0 answers

MySQL in R: using sqldf and RLIKE

I am trying to run a huge boolean sqldf in for a local dataframe in R like so: sqldf("UPDATE df SET STORE_FLAG = 1 WHERE CONTENTS RLIKE '(([[:<:]]store[[:>:]]))'", drv='SQLite') Query abbreviated above. The actual query is 20 words long with near…
lmcshane
  • 1,074
  • 4
  • 14
  • 27