1

I have a list of IDs that are not associated with any actual data. I have a SQL database that has a table for each of these IDs, and those tables have data that I would like to combine together into one large data frame based on the list of IDs that I have. I figured a for loop would be needed for this, but I haven't been able to get it to work properly.

For example I have a list of IDs" 1,2,3,4,5

I have a SQL database with tables for each of these, and they also have other data associated with the IDs. Each ID has multiple rows and columns.

I would like my end product to be the combination of those rows and columns for the list of IDs to be in a single data frame in r. How could I do this? What is the most efficient way to do so?

#Example data set
library(lubridate)
date <- rep_len(seq(dmy("26-12-2010"), dmy("20-12-2011"), by = "days"), 500)
ID <- rep(seq(1, 5), 100)

df <- data.frame(date = date,
                 x = runif(length(date), min = 60000, max = 80000),
                 y = runif(length(date), min = 800000, max = 900000),
                 ID)

for (i in 1: length(ID)){
  ID[i] <- dbReadTable(mydb, ID[i])
}

Thank you so much for your time.

John Huang
  • 845
  • 4
  • 15
  • You mean that you have a table in your SQL database named `1`? – r2evans Apr 26 '21 at 19:03
  • Yes, I have a table for each ID in the database and I would like to read them into R and combine them into a single data frame. – John Huang Apr 26 '21 at 19:07
  • 1
    `IDs <- lapply(ID, function(i) dbReadTable(mydb, i))` should give you a list of tables. Is that what you're after? – r2evans Apr 26 '21 at 19:12

1 Answers1

2

I'll expand on my comment to finish the question.

IDs <- lapply(setNames(nm=ID), function(i) dbReadTable(mydb, i)) 

and then one of:

## base R
IDs <- Map(function(x, nm) transform(x, id = nm), IDs, names(IDs))
DF <- do.call(rbind, IDs)

## dplyr
DF <- dplyr::bind_rows(IDs, .id = "id")

## data.table
DF <- data.table::rbindlist(IDs, idcol = "id")

The addition of the "id" column is to easily differentiate the rows based on the source ID. If the table already includes that, then you can omit the Map (base) and .id/idcol arguments.

(This assumes, btw, that all tables have the same exact structure: same column names and same data types.)

r2evans
  • 141,215
  • 6
  • 77
  • 149