
I am sorry, but the functional-programming "loops" in purrr have left me scratching my head.

I know that to apply my own non-vectorised functions I can use map_chr(), and I used it together with mutate to produce 2 new columns. What I do not understand is whether map_chr takes the whole column on every call and produces a list each time, or takes one value at a time and places the single computed output in the new variable. In other words: for every value in the SHASUM column, does map_chr return just one value, or a list of values from which the correct one is automagically picked? I am sorry the question is so fuzzy, but I found this hard to understand without knowing what goes on inside pipes and mutate.

My example code below.

Is this a valid/correct use of map_chr() (and more generally map functions from purrr) or is there something better that I should have done?

library(tidyverse)
library(lubridate)
library(urlshorteneR)

longLinkBase <- "https://lime.survey.server/334443?newtest=Y?&QID="

initData <- structure(list(SHASUM = c("4db194d", "44fc459", "eb81eb4", "3c37606", "1165fc2", "fd4f56b"), StartDate = c(44172L, 44172L, 44172L, 44172L, 44172L, 44172L)), row.names = c(NA, 6L), class = "data.frame")
# convert Excel date serial number into proper date 
initData$StartDate <- as.Date(initData$StartDate, origin = "1899-12-30")

getlongLink <- function(x,y){
  # combine long link base with the (subject) code 
  z <- URLencode(paste0(y,x))
  return(z)
}

getShortLink <- function(x){
  Sys.sleep(2)
  z <- isgd_LinksShorten(x)
  return(z)
}

# 3 Lines below are my question, really:
initData <- initData %>% 
  mutate(longLink = map_chr(SHASUM,getlongLink,y=longLinkBase))  %>% 
  mutate(shortLink = map_chr(longLink,getShortLink))

### Write out data as CSV file
write.csv(initData,file=paste0("./output/","shortLinks_",format(Sys.time(),"%Y-%m-%d_%H-%M_%b"),".csv"),na="")
r0berts

1 Answer


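map_chr is a hidden loop: it calls your function once for each element of the input and collects the results in a character vector. The benefit of using it is that the code can be piped/chained together, which helps readability.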

map_chr(SHASUM, getlongLink, y = longLinkBase) is the same as doing:

getlongLink(initData$SHASUM[1], longLinkBase)
getlongLink(initData$SHASUM[2], longLinkBase)
getlongLink(initData$SHASUM[3], longLinkBase)
.....
.....

The difference is that you don't 'see' these individual calls; they are performed under the hood by map_chr. Each call returns a single value, and map_chr combines those values into one character vector, which mutate stores as the new column.
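To make the "hidden loop" concrete, here is a minimal sketch that compares map_chr with an explicit for loop. The names shasum, base and make_link are hypothetical stand-ins for the question's column, link base and helper function (the URL is a placeholder), so that the example runs without calling any web service.

```r
library(purrr)

# Hypothetical stand-ins for the question's SHASUM column and link base
shasum <- c("4db194d", "44fc459", "eb81eb4")
base <- "https://example.org/?id="

# Simplified helper: joins the base and one code (no URL encoding here)
make_link <- function(x, y) paste0(y, x)

# map_chr calls make_link once per element and returns a character vector
via_map <- map_chr(shasum, make_link, y = base)

# The same work written as an explicit loop
via_loop <- character(length(shasum))
for (i in seq_along(shasum)) {
  via_loop[i] <- make_link(shasum[i], base)
}

# Both approaches produce the same values
all(via_map == via_loop)

# Each call receives a single element, never the whole column:
# the length seen inside every call is 1
seen_lengths <- map_int(shasum, length)
seen_lengths
```

So the function never sees the whole column at once, and nothing is "picked" from a list: one input value goes in, one output value comes out, and the outputs are stacked in order.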

I think your code is good overall. I can suggest only one small improvement: you can combine the two mutate calls into a single one.

initData <- initData %>% 
  mutate(longLink = map_chr(SHASUM,getlongLink,y=longLinkBase),
         shortLink = map_chr(longLink,getShortLink))
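This works because mutate evaluates its arguments in order, so a later column can use one created earlier in the same call. A small sketch with a made-up data frame (df and its column names are purely illustrative, not from the question):

```r
library(dplyr)

# Made-up data, only to illustrate column dependencies within one mutate
df <- data.frame(x = c(1, 2, 3))

result <- df %>%
  mutate(
    doubled  = x * 2,        # uses an existing column
    plus_one = doubled + 1,  # uses a column created one line above
    ratio    = plus_one / x  # uses both a new and an original column
  )

result
```

The order of the arguments matters: referring to plus_one before it is defined in the call would fail, because the column does not exist yet at that point.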
Ronak Shah
  • Thanks Ronak, I did not know that it was possible to do it in one `mutate` call; theoretically - can you put as many columns (at least some depending on previous ones) as needed in one call? – r0berts May 16 '21 at 17:50
  • Yes. As you can see here, `shortLink` is built from `longLink`, which is created earlier in the same `mutate` call. You can add as many columns as you like here. – Ronak Shah May 17 '21 at 00:04