I want to read several CSV files from the web and combine the data into one data frame. From what I have seen, this would be very easy if the files were on my computer, but I don't always want to download them first.

The example:

  "https://www.football-data.co.uk/mmz4281/1819/F1.csv",
  "https://www.football-data.co.uk/mmz4281/1718/F1.csv",
  "https://www.football-data.co.uk/mmz4281/1617/F1.csv",
  "https://www.football-data.co.uk/mmz4281/1516/F1.csv",
  "https://www.football-data.co.uk/mmz4281/1415/F1.csv",
  "https://www.football-data.co.uk/mmz4281/1314/F1.csv",
  "https://www.football-data.co.uk/mmz4281/1213/F1.csv",
  "https://www.football-data.co.uk/mmz4281/1112/F1.csv",
  "https://www.football-data.co.uk/mmz4281/1011/F1.csv"

These are the CSV files. Maybe it's possible with a function or a loop, but I don't know how.

Maybe you can help me.

Greetings

pontilicious

1 Answer

Reading files from the web is just as easy as reading them from your file system: you can pass a URL instead of a file path to readr::read_csv() (you tagged your question with readr, so I assume you want to use that).

Assuming your files are in a vector:

files <- c("https://www.football-data.co.uk/mmz4281/1819/F1.csv",
"https://www.football-data.co.uk/mmz4281/1718/F1.csv",
"https://www.football-data.co.uk/mmz4281/1617/F1.csv",
"https://www.football-data.co.uk/mmz4281/1516/F1.csv",
"https://www.football-data.co.uk/mmz4281/1415/F1.csv",
"https://www.football-data.co.uk/mmz4281/1314/F1.csv",
"https://www.football-data.co.uk/mmz4281/1213/F1.csv",
"https://www.football-data.co.uk/mmz4281/1112/F1.csv",
"https://www.football-data.co.uk/mmz4281/1011/F1.csv")

You can use readr::read_csv to read a single file, and purrr::map_dfr to read all of them and combine the results into one data frame:

df <- purrr::map_dfr(files, readr::read_csv)

This iterates over the elements of files, applies readr::read_csv to each one, and binds the results into a single data frame row-wise (hence the dfr suffix).
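Once the files are bound together, it is no longer visible which row came from which season. A minimal sketch of one way to keep track of that, using the `.id` argument of `map_dfr`: if the input vector is named, `.id` adds a column holding the name each row came from. The two local files and the season labels below are hypothetical stand-ins so the sketch runs offline; `read_csv` accepts a URL in the same position, so the `files` vector above could be named and passed instead. It assumes readr, purrr, and dplyr are installed.

```r
library(readr)
library(purrr)

# Stand-in data: two small local CSV files (hypothetical contents).
paths <- file.path(tempdir(), c("season_1819.csv", "season_1718.csv"))
write_csv(data.frame(HomeTeam = c("Lyon", "Nice"), FTHG = c(2, 0)), paths[1])
write_csv(data.frame(HomeTeam = "Metz", FTHG = 1), paths[2])

# Name each element; the names become the values of the .id column.
named_files <- set_names(paths, c("1819", "1718"))

# Read every file and bind the rows; `season` records the source of each row.
df <- map_dfr(named_files, read_csv, .id = "season")
```

In purrr 1.0.0 and later, `map_dfr` is marked as superseded; `map(named_files, read_csv) |> list_rbind(names_to = "season")` should give the same result there.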

Bas
  • I have another question about this. Is it possible to use `purrr::map_dfr(files, readr::read_csv)` to read only certain columns from each CSV and merge them into one data frame? I have the feeling you have to read all the CSVs first and then edit the data frame to get the desired columns. The problem is that some columns cannot be merged because they contain different data types, for example "character" in one file and "double" in another. I want to work around this. – pontilicious Sep 11 '20 at 15:02
  • In `read_csv` you can specify the column types via the `col_types` argument. See [here](https://readr.tidyverse.org/reference/read_delim.html) and [here](https://readr.tidyverse.org/articles/readr.html) for example. That should fix your issue. – Bas Sep 12 '20 at 08:58
  • Thanks, but for me it would be best to import only specific columns. So I created a vector with the names of the columns, `col.names`, but when I try to import via `data <- map_dfr(files, read.csv(cols_only(col.names)))` there is an error: ```Error in switch(x, `_` = , `-` = col_skip(), `?` = col_guess(), c = col_character(), : EXPR must be a length 1 vector``` My head explodes :-( – pontilicious Sep 12 '20 at 14:01
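  • You would need to pass the extra argument to `read_csv` through `map_dfr`, i.e. `map_dfr(files, read_csv, col_types = cols_only(...))`. Note that `cols_only()` expects named column specifications (one per column you want to keep), not a character vector of names, and that it belongs in the `col_types` argument; `read_csv` has no `cols_only` argument. – Bas Sep 13 '20 at 11:02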
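The column selection discussed in these comments can be sketched as follows. This is an illustration under assumptions, not the exact code from the thread: the two local files stand in for the URLs so it runs offline, and the column names (`HomeTeam`, `FTHG`, `Odds`) are hypothetical examples. `cols_only()` keeps only the columns it names and parses each with the given type, so a column that is numeric in one file and text in another is simply skipped and cannot break the row-binding.

```r
library(readr)
library(purrr)

# Stand-in files: `Odds` parses as a number in one file and as text in the other.
f1 <- file.path(tempdir(), "season_a.csv")
f2 <- file.path(tempdir(), "season_b.csv")
write_csv(data.frame(HomeTeam = "Lyon", FTHG = 2, Odds = 1.5), f1)
write_csv(data.frame(HomeTeam = "Metz", FTHG = 1, Odds = "n/a"), f2)

# Keep two columns with explicit types; every other column is dropped.
wanted <- cols_only(HomeTeam = col_character(), FTHG = col_integer())
df <- map_dfr(c(f1, f2), read_csv, col_types = wanted)

# Starting from a character vector of names (like `col.names` in the comments),
# one way to build the specification is to read each listed column as character.
col.names <- c("HomeTeam", "FTHG")
spec <- do.call(
  cols_only,
  lapply(setNames(col.names, col.names), function(nm) col_character())
)
df_chr <- map_dfr(c(f1, f2), read_csv, col_types = spec)
```

Reading everything as character is the most forgiving choice when types differ between files; individual columns can be converted afterwards, for example with `readr::type_convert()`.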