
I'm trying to write a function that downloads and loads files from the IMDb datasets page, available here: https://datasets.imdbws.com/

The problem is that the function downloads the file but does not load it into any object.

I first wrote the code as separate steps, which works fine.

url <- "https://datasets.imdbws.com/name.basics.tsv.gz"
tmp <- tempfile()
download.file(url, tmp)

name_basics <- readr::read_tsv(
  file = gzfile(tmp),
  col_names = TRUE, 
  quote = "",
  na = "\\N",
  progress = FALSE
)

The file is downloaded and loaded into name_basics. But when I wrapped the same steps in a function, no data was loaded. What have I done wrong?

Function code

imdbTSVfiles <- function(fileName){
  url <- paste0("https://datasets.imdbws.com/",fileName,".tsv.gz")
  tmp <- tempfile()
  download.file(url, tmp)

  name <- readr::read_tsv(
      file = gzfile(tmp),
      col_names = TRUE,
      quote = "",
      na = "\\N")
}

imdbTSVfiles("name.basics")

Expected result: the file with the given name is downloaded and loaded.

Supek
  • It is downloaded, but `name` is a local variable to the function. You should `return(name)` to get it as a result. Then you can assign the value of the function: `result <- imdbTSVfiles('name.basics')` – jake2389 Sep 05 '19 at 19:54

1 Answer


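You can store the data in a dynamically named variable, which is easily done with assign(). Note that with envir = .GlobalEnv the variable is created in the global environment as a side effect of the call.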

imdbTSVfiles <- function(fileName){
  url <- paste0("https://datasets.imdbws.com/",fileName,".tsv.gz")
  tmp <- tempfile()
  download.file(url, tmp)

  assign(fileName,
         readr::read_tsv(
           file = gzfile(tmp),
           col_names = TRUE,
           quote = "",
           na = "\\N"),
  envir = .GlobalEnv)
}

imdbTSVfiles("name.basics")

This should store the data in a global variable named name.basics.
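
As the comment on the question suggests, an alternative is to have the function return the data and let the caller choose the variable name. In R a function returns the value of its last evaluated expression, but an assignment such as `name <- ...` returns that value invisibly, which is why the original call printed nothing. Below is a minimal sketch of the return-based variant, not a definitive implementation. The names `read_imdb_tsv` and `imdb_tsv` and the `base_url` argument are hypothetical, introduced only for illustration; the `read_tsv` arguments are the ones from the question.

```r
# Read an IMDb-style gzipped TSV from a local path and return the result.
# (Hypothetical helper; arguments copied from the question.)
read_imdb_tsv <- function(path) {
  readr::read_tsv(
    file = gzfile(path),
    col_names = TRUE,
    quote = "",
    na = "\\N",
    progress = FALSE
  )
}

# Download one dataset and return it instead of assigning it globally.
# (Hypothetical name; base_url is an assumed convenience argument.)
imdb_tsv <- function(fileName, base_url = "https://datasets.imdbws.com/") {
  url <- paste0(base_url, fileName, ".tsv.gz")
  tmp <- tempfile()
  download.file(url, tmp, mode = "wb")  # binary mode for the .gz file
  # The last expression is a plain call, so its value is returned visibly.
  read_imdb_tsv(tmp)
}
```

With this version the caller decides where the data goes, for example `name_basics <- imdb_tsv("name.basics")`. Several files can be kept together in a named list with something like `lapply(setNames(files, files), imdb_tsv)`, where `files` is a character vector of file names, which avoids creating global variables from inside a function.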

Rushabh Patel
  • Great tip! I was able to fully automate download and load of the files. You're da MAN! – Supek Sep 05 '19 at 20:24