I have a function whose objective is to fetch daily data for each value in a column of a data.frame. The range is a complete month, but it could be any other range.
My `df` has a column `unit_id`, so I need my function to take the first id in column `unit_id` and fetch the data for every single date of March.
| unit | unit_id |
|:-----:|----------|
| AE | 123 |
| AD | 456 |
| AN | 789 |
But right now, my function cycles through the ids in the `unit_id` column. Since I have 3 ids, on the 4th day the function uses the 1st id again, on the 5th day it uses the 2nd id, and so on. This repeats until the last day of the month.
I need it to use each id for every day of the month.
code:
    library(googleAnalyticsR)  # provides google_analytics()

    my_dates <- seq(as.Date("2020-03-01"), as.Date("2020-03-31"), by = 1)

    my_fetch <- function(unit, unit_id, d) {
      df <- google_analytics(unit_id,
                             date_range = c(d, d),
                             metrics = c("totalEvents"),
                             dimensions = c("ga:date", "ga:eventCategory", "ga:eventAction", "ga:eventLabel"),
                             anti_sample = TRUE)
      df$unidad_de_negocio <- unit
      filename <- paste0(unit, "-", "total-events", "-", d, ".csv")
      path <- "D:\\america\\costos_protv\\total_events"
      write.csv(df, file.path(path, filename), row.names = FALSE)
      print(filename)
      rm(df)
      gc()
    }

    monthly_fetches <- mapply(my_fetch, df$unit,
                              df$unit_id,
                              my_dates, SIMPLIFY = FALSE)
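The recycling described above can be reproduced without touching the API. This is only a minimal sketch: `fake_fetch` is a hypothetical stand-in for `my_fetch` that returns a label instead of calling `google_analytics()`, and the date range is shortened to six days. It shows that `mapply()` pairs its arguments position by position and recycles the shorter ones, and then one possible way (a `Map()` over the units with an inner `lapply()` over the dates) to visit every id/date combination instead.

```r
# Toy data mirroring the table in the question
df <- data.frame(unit = c("AE", "AD", "AN"),
                 unit_id = c(123, 456, 789),
                 stringsAsFactors = FALSE)
my_dates <- seq(as.Date("2020-03-01"), as.Date("2020-03-06"), by = 1)

# Hypothetical stand-in for my_fetch(): no API call, just a label
fake_fetch <- function(unit, unit_id, d) paste(unit, unit_id, format(d))

# mapply() walks the three vectors in parallel and recycles the shorter
# ones, so the 4th date is paired with the 1st id again
recycled <- mapply(fake_fetch, df$unit, df$unit_id, my_dates,
                   USE.NAMES = FALSE)

# Nesting the date loop inside the unit loop visits every combination:
# one list element per unit, each holding one result per date
all_pairs <- Map(function(u, id) {
  lapply(my_dates, function(d) fake_fetch(u, id, d))
}, df$unit, df$unit_id)
```

With this layout `recycled` has one entry per date (6 in total), while `all_pairs` has one entry per unit, each with one result per date (3 x 6 in total).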
Variation 2: By monthly ranges
Thank you, Akrun. Your answer works.
I've been trying to edit it to use it in this other, similar scenario:
1.- Monthly starts and ends: Now each iteration isn't a single date, but has a start and an end. I've called this data frame `monthly_dates`:
| starts | ends |
|:-----------:|------------|
| 2020-02-01 | 2020-02-29 |
| 2020-03-01 | 2020-03-31 |
I've tried to adapt the solution, but it is not working. Could you take a look and tell me why? Thank you.
    monthly_fetches <- Map(function(x, y)
      lapply(monthly_dates, function(d1, d2) my_fetch(x, y, monthly_dates$starts, monthly_dates$ends)))
Main function adapted to use 2 dates (start "d1" and end "d2"):
    my_fetch <- function(udn, udn_id, d1, d2) {
      df <- google_analytics(udn_id,
                             date_range = c(d1, d2),
                             metrics = c("totalEvents"),
                             dimensions = c("ga:month"),
                             anti_sample = TRUE)
      df$udn <- udn
      df$udn_id <- udn_id
      df
    }
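For context on the attempt above: the outer `Map()` receives no vectors to iterate over, and `lapply()` on a data frame walks over its columns rather than its rows, so each call would get whole columns instead of one start/end pair. A minimal sketch of one way to pair each start with its end is an inner `Map()` over the two columns. Here `fake_fetch` is a hypothetical stand-in for the two-date `my_fetch`, so nothing is requested from the API, and the toy `df` and `monthly_dates` copy the tables in the question.

```r
df <- data.frame(unit = c("AE", "AD", "AN"),
                 unit_id = c(123, 456, 789),
                 stringsAsFactors = FALSE)
monthly_dates <- data.frame(
  starts = as.Date(c("2020-02-01", "2020-03-01")),
  ends   = as.Date(c("2020-02-29", "2020-03-31"))
)

# Hypothetical stand-in for my_fetch(udn, udn_id, d1, d2)
fake_fetch <- function(udn, udn_id, d1, d2) {
  data.frame(udn = udn, udn_id = udn_id,
             start = format(d1), end = format(d2),
             stringsAsFactors = FALSE)
}

# Outer Map(): one element per unit.
# Inner Map(): walks starts and ends in parallel, one call per month.
monthly_fetches <- Map(function(x, y) {
  Map(function(d1, d2) fake_fetch(x, y, d1, d2),
      monthly_dates$starts, monthly_dates$ends)
}, df$unit, df$unit_id)
```

The result is a list named by unit, where each element is a list with one data frame per monthly range.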
**Code to make the monthly date ranges:**

    make_date_ranges <- function(start, end){
      starts <- seq(from = start,
                    to = Sys.Date()-1,
                    by = "1 month")
      ends <- c((seq(from = add_months(start, 1),
                     to = end,
                     by = "1 month"))-1,
                (Sys.Date()-1))
      data.frame(starts, ends)
    }

    ## usage
    monthly_dates <- make_date_ranges(as.Date("2020-02-01"), Sys.Date())
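Since `add_months()` is not part of base R, the helper above depends on whichever package or script defines it. As a hedged alternative, the sketch below builds the same kind of start/end table with base R only. `make_month_ranges` is a hypothetical name, it assumes `start` is the first day of a month, and it caps the last range at `end` rather than at `Sys.Date() - 1`.

```r
make_month_ranges <- function(start, end) {
  # First day of every month from start up to end
  starts <- seq(from = start, to = end, by = "1 month")
  # First day of each following month; subtracting 1 gives the month end
  next_starts <- seq(from = start, by = "1 month",
                     length.out = length(starts) + 1)[-1]
  # Cap the final range so it never runs past end
  ends <- pmin(next_starts - 1, end)
  data.frame(starts, ends)
}

## usage with fixed dates so the result is reproducible
ranges <- make_month_ranges(as.Date("2020-02-01"), as.Date("2020-04-10"))
```

Passing `Sys.Date() - 1` as `end` would mimic the behaviour of the original helper, where the last range stops at yesterday.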
Update 1:
dput(monthly_fetches[1])
list(AE = list(structure(list(month = "02", totalEvents = 19670334,
udn = "AE", udn_id = 74415341), row.names = 1L, totals = list(
list(totalEvents = "19670334")), minimums = list(list(totalEvents = "19670334")), maximums = list(
list(totalEvents = "19670334")), isDataGolden = TRUE, rowCount = 1L, class = "data.frame"),
structure(list(month = "03", totalEvents = 19765253, udn = "AE",
udn_id = 74415341), row.names = 1L, totals = list(list(
totalEvents = "19765253")), minimums = list(list(totalEvents = "19765253")), maximums = list(
list(totalEvents = "19765253")), isDataGolden = TRUE, rowCount = 1L, class = "data.frame"),
structure(list(month = "04", totalEvents = 1319087, udn = "AE",
udn_id = 74415341), row.names = 1L, totals = list(list(
totalEvents = "1319087")), minimums = list(list(totalEvents = "1319087")), maximums = list(
list(totalEvents = "1319087")), isDataGolden = TRUE, rowCount = 1L, class = "data.frame")))
Update 2:
dput(monthly_fetches[[1]])
list(structure(list(month = "02", totalEvents = 19670334, udn = "AE",
udn_id = 74415341), row.names = 1L, totals = list(list(totalEvents = "19670334")), minimums = list(
list(totalEvents = "19670334")), maximums = list(list(totalEvents = "19670334")), isDataGolden = TRUE, rowCount = 1L, class = "data.frame"),
structure(list(month = "03", totalEvents = 19765253, udn = "AE",
udn_id = 74415341), row.names = 1L, totals = list(list(
totalEvents = "19765253")), minimums = list(list(totalEvents = "19765253")), maximums = list(
list(totalEvents = "19765253")), isDataGolden = TRUE, rowCount = 1L, class = "data.frame"),
structure(list(month = "04", totalEvents = 1319087, udn = "AE",
udn_id = 74415341), row.names = 1L, totals = list(list(
totalEvents = "1319087")), minimums = list(list(totalEvents = "1319087")), maximums = list(
list(totalEvents = "1319087")), isDataGolden = TRUE, rowCount = 1L, class = "data.frame"))