
I would like to get a list of all the files available at this address: http://www1.ncdc.noaa.gov/pub/data/cmb/drought/weekly-palmers/2005/ (publicly available data from the NOAA).

It would be some sort of "list.files" for a specific URL. I started looking at RCurl, but all I could get was the HTML code of the page.

Sergey K.
user1752610

1 Answer


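In this case the directory listing is served as an HTML table, so you can simply use readHTMLTable from the XML package:

library(XML)  # provides readHTMLTable
# The "Name" column of the first table on the page holds the file names
file.list <- readHTMLTable("http://www1.ncdc.noaa.gov/pub/data/cmb/drought/weekly-palmers/2005/",
                           skip.rows=1:2)[[1]]$Name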

Then to create a list of paths:

path.list <- paste0("http://www1.ncdc.noaa.gov/pub/data/cmb/drought/weekly-palmers/2005/",
                    file.list[!is.na(file.list)])
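
If a listing is not laid out as a table with a Name column, a more general option is to collect the link targets from the page source instead. The following is only a minimal sketch under a few assumptions: `list_url_files` is a hypothetical helper (not part of the XML package), the sample HTML and its file names are made up so the example runs offline, and the filter assumes files are linked with relative hrefs while sort links start with `?` and directories end with `/`.

```r
library(XML)

# Hypothetical helper: return full paths for the file links found in the
# HTML source of a directory listing.
list_url_files <- function(html, base_url) {
  doc <- htmlParse(html, asText = TRUE)
  # Collect the href attribute of every <a> element
  hrefs <- as.character(xpathSApply(doc, "//a/@href"))
  # Drop sort links ("?C=N;O=D"), absolute paths and sub-directories
  keep <- !grepl("^[?/]|/$", hrefs)
  paste0(base_url, hrefs[keep])
}

# Made-up sample listing (file names are illustrative only)
sample_html <- '<html><body><table>
<tr><th><a href="?C=N;O=D">Name</a></th></tr>
<tr><td><a href="/pub/data/">Parent Directory</a></td></tr>
<tr><td><a href="palmer010105.txt">palmer010105.txt</a></td></tr>
<tr><td><a href="palmer010805.txt">palmer010805.txt</a></td></tr>
</table></body></html>'

list_url_files(sample_html, "http://example.org/2005/")
```

For a live page, the HTML string could come from RCurl, for example `html <- RCurl::getURL(url)`, which is the same HTML source the question mentions getting back.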
plannapus
  • This doesn't seem to work here: `https://vip.arizona.edu/vipdata/V4/DATAPOOL/PHENOLOGY/`. Calling: `readHTMLTable("https://vip.arizona.edu/vipdata/V4/DATAPOOL/PHENOLOGY/", skip.rows=1:2)[[1]]$Name -> file.list` returns: `Error in XML::readHTMLTable("https://vip.arizona.edu/vipdata/V4/DATAPOOL/PHENOLOGY/", : subscript out of bounds In addition: Warning message: XML content does not seem to be XML: 'https://vip.arizona.edu/vipdata/V4/DATAPOOL/PHENOLOGY/' ` – colin Sep 21 '18 at 15:13