27

This is a newbie question in R. I am downloading yahoo finance monthly stock price data using R where the ticker names are read from a text file. I am using a loop to read the ticker names to download the data and putting them in a list. My problem is some ticker names may not be correct thus my code stops when it encounters this case. I want the following.

  1. skip the ticker name if it is not correct.
  2. Each element in the list is a dataframe. I want the ticker names to be appended to variable names in element dataframes.
  3. I need an efficient way to create a dataframe that has the closing prices as variables.

Here is the sample code for the simplified version of my problem.

library(tseries)  
tckk <- c("MSFT", "C", "VIA/B", "MMM") # ticker names defined  
numtk <- length(tckk);  
ustart <- "2000-12-30";
uend <- "2007-12-30" # start and end date  
all_dat <- list(); # empty list to fill in the data  
for(i in 1:numtk)  
{  
  all_dat[[i]] <- xxx <- get.hist.quote(instrument = tckk[i], start=ustart, end=uend, quote = c("Open", "High", "Low", "Close"), provider = "yahoo", compression = "m")  
}   

The code stops at the third entry but I want to skip this ticker and move on to "MMM". I have heard about Trycatch() function but do not know how to use it.

As per question 2, I want the variable names for the first element of the list to be "MSFTopen", "MSFThigh", "MSFTlow", and "MSFTclose". Is there a better to way to do it apart from using a combination of loop and paste() function.

Finally, for question 3, I need a dataframe with three columns corresponding to closing prices. Again, I am trying to avoid a loop here.

Thank you.

Peter Hall
  • 53,120
  • 14
  • 139
  • 204
user227290
  • 1,509
  • 3
  • 18
  • 26

8 Answers8

22

Your best bet is to use quantmod and store the results as a time series (in this case, it will be xts):

library(quantmod)
library(plyr)
symbols <- c("MSFT","C","VIA/B","MMM")

#1
l_ply(symbols, function(sym) try(getSymbols(sym))) 
symbols <- symbols[symbols %in% ls()]

#2
sym.list <- llply(symbols, get) 

#3
data <- xts()
for(i in seq_along(symbols)) {
    symbol <- symbols[i]
    data <- merge(data, get(symbol)[,paste(symbol, "Close", sep=".")])
}
Contango
  • 76,540
  • 58
  • 260
  • 305
Shane
  • 98,550
  • 35
  • 224
  • 217
  • Thank you for the code. It works. However, originally I am reading the ticker names from a file that has more than 1000 tickers. So hard coding with the merge() may not help me. Moreover, I do want to put them in a list so that I can use plyr library to do other stuff to each list element. – user227290 Aug 18 '10 at 00:15
  • Ok, well it's easy to do both of those things. Just loop over the symbols to merge them (rather than hard coding them). And store then in a list with plyr: `llply(symbol.names, get)`. – Shane Aug 18 '10 at 00:25
  • 6
    One note: I much prefer `auto.assign=FALSE` in `getSymbols()` as it avoids the insanity of filling the environment with all those symbols when all I want is the merged `data.frame`. – Dirk Eddelbuettel Aug 18 '10 at 01:20
  • 1
    You could further clean up the merge with: `merge(data, Cl(get(symbol)))` – Joshua Ulrich Aug 18 '10 at 02:54
  • @Shane I've just added "library(plyr)", as this is required (like quantmod) on a fresh installation of R. – Contango Apr 10 '13 at 15:02
  • With the latest version of quantmod, the line "l_ply(symbols, function(sym) try(getSymbols(sym)))" fails sometimes (perhaps due to internal multithreading breaking the HTTP fetch?), replacing with "getSymbols(symbols)" will fix any unstability. – Contango Apr 10 '13 at 15:28
  • Building on the suggestion from @Joshua Ulrich, if you prefer the adjusted close, use "merge(data, Ad(get(symbol)))". – Contango Apr 10 '13 at 15:32
  • `l_ply(symbols, function(sym) try(getSymbols(sym)))` Error in download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, : cannot open URL 'http://ichart.finance.yahoo.com/table.csv?s=VIA/B&a=0&b=01&c=2007&d=6&e=11&f=2016&g=d&q=q&y=0&z=VIA/B&x=.csv' In addition: Warning message: In download.file(paste(yahoo.URL, "s=", Symbols.name, "&a=", from.m, : cannot open URL 'http://ichart.finance.yahoo.com/table.csv?s=VIA/B&a=0&b=01&c=2007&d=6&e=11&f=2016&g=d&q=q&y=0&z=VIA/B&x=.csv': HTTP status was '404 Not Found' – Manoj Kumar Jul 11 '16 at 06:11
20

This also a little late...If you want to grab data with just R's base functions without dealing with any add-on packages, just use the function read.csv(URL), where the URL is a string pointing to the right place at Yahoo. The data will be pulled in as a dataframe, and you will need to convert the 'Date' from a string to a Date type in order for any plots to look nice. Simple code snippet is below.

URL <- "http://ichart.finance.yahoo.com/table.csv?s=SPY"
dat <- read.csv(URL)
dat$Date <- as.Date(dat$Date, "%Y-%m-%d")

Using R's base functions may give you more control over the data manipulation.

Contango
  • 76,540
  • 58
  • 260
  • 305
stotastic
  • 796
  • 1
  • 5
  • 18
  • +1 This is a good tip. May be I am wrong but I guess it will have less flexibility to change the start date, end date, and data frequency. The URL will get more messy with that. – user227290 Aug 18 '10 at 19:37
  • A few thoughts on this, you can stick all this in a utility function that constructs the URL, requests the data, and returns the data frame (there are a bunch more parameters you can send to yahoo in the url like the start date)...also since the data is in a data.frame, its pretty easy to filter and manipulate the data you don't want. Also, the 'merge' command is VERY useful in combining data frames of multiple tickers. – stotastic Aug 18 '10 at 21:11
  • @stotastic There are already several functions that do what you describe (`quantmod::getSymbols`, `TTR::getYahooData`, `tseries::get.hist.quote`, `fImport::yahooImport`, etc). – Joshua Ulrich Aug 19 '10 at 11:45
  • 1
    All nice functions that add a simplifying layer of abstraction, but I think its important to know the basics of how these functions pull in the data... just using a 'back box' can be dangerous – stotastic Aug 19 '10 at 14:11
  • Yes, it's helpful to understand how the functions work. However, you don't need to write a similar function to understand how they're pulling data. You can simply look at the source code... which isn't really a "black box". – Joshua Ulrich Aug 19 '10 at 18:48
  • 3
    I guess it is a matter of taste. – stotastic Aug 19 '10 at 19:57
  • @stotasic I hope you don't mind, but I've just edited your answer to change "%m/%d/%Y" to "%Y-%m-%d", so the example code works with the latest version of the Yahoo API. And I've upvoted as its a great answer. – Contango Apr 10 '13 at 14:54
9

I'm a little late to the party, but I think this will be very helpful to other late comers.

The stockSymbols function in TTR fetches instrument symbols from nasdaq.com, and adjusts the symbols to be compatible with Yahoo! Finance. It currently returns ~6,500 symbols for AMEX, NYSE, and NASDAQ. You could also take a look at the code in stockSymbols that adjusts tickers to be compatible with Yahoo! Finance to possibly adjust some of the tickers in your file.

NOTE: stockSymbols in the version of TTR on CRAN is broken due to a change on nasdaq.com, but it is fixed in the R-forge version of TTR.

Joshua Ulrich
  • 173,410
  • 32
  • 338
  • 418
  • +1 for this great tip. Just wanted to know if there is a quick way to extract ticker names for all companies registered in NYSE and save it in a column in a flat file. I tried using it but didn't know how to extract the ticker names from the output object of stockSymbols function call. Thanks. – user227290 Aug 18 '10 at 19:33
  • 1
    `write.csv(stockSymbols("NYSE")$Symbol,"NYSE_symbols.txt",row.names=FALSE,col.names=FALSE)` – Joshua Ulrich Aug 18 '10 at 19:48
  • Thank you for the tip. I did try the stockSymbols("NYSE")$Symbol command earlier but it outputs strange things, not the ticker names. I guess it must be due to CRAN version of TTR (I used install.packages() command to in stall TTR) – user227290 Aug 18 '10 at 19:59
  • 1
    Yes, you probably have the CRAN version. You can use `install.packages` to install the R-forge version: `install.packages("TTR",repos="http://r-forge.r-project.org")`. – Joshua Ulrich Aug 18 '10 at 20:06
  • @Joshua Ulrich I installed the `R-forge` version and then got: `> write.csv( stockSymbols("NYSE")$Symbol, file="NYSE_symbols.csv", row.names=FALSE, col.names=FALSE ) ###### Fetching NYSE symbols... ###### Warning messages: ###### 1: In write.csv(stockSymbols("NYSE")$Symbol, file = "NYSE_symbols.csv", : attempt to set 'col.names' ignored ###### 2: In read.table(file = file, header = header, sep = sep, quote = quote, : cols = 3 != length(data) = 4` – isomorphismes Jul 01 '11 at 11:04
  • 1
    @Lao Tzu: please see `?write.csv`. The "CSV Files" section says `write.csv` and `write.csv2` are "deliberately inflexible" and "Attempts to change ‘append’, ‘col.names’, ‘sep’, ‘dec’ or ‘qmethod’ are ignored, with a warning." – Joshua Ulrich Jul 01 '11 at 12:26
  • @Joshua Ulrich Thank you. Then I don't understand your comment above of `Aug 18 '10 at 19:48`. – isomorphismes Jul 01 '11 at 22:48
  • 1
    @Lao Tzu: I probably wrote that code from memory (i.e. didn't test it) and intended to use `write.table` instead of `write.csv`. `write.table` will respect the `col.names=FALSE` argument and there's only one column so the delimiter doesn't matter. – Joshua Ulrich Jul 01 '11 at 23:35
3

I do it like this, because I need to have the historic pricelist and a daily update file in order to run other packages:

library(fImport)

fecha1<-"03/01/2009"
fecha2<-"02/02/2010"

Sys.time()

y <- format(Sys.time(), "%y")    
m <- format(Sys.time(), "%m")    
d <- format(Sys.time(), "%d")
fecha3 <- paste(c(m,"/",d,"/","20",y), collapse="")

write.table(yahooSeries("GCI", from=fecha1, to=fecha2), file = "GCI.txt", sep="\t", quote = FALSE, eol="\r\n", row.names = TRUE)
write.table(yahooSeries("GCI", from=fecha2, to=fecha3), file = "GCIupdate.txt", sep="\t", quote = FALSE, eol="\r\n", row.names = TRUE)

GCI <- read.table("GCI.txt") 
GCI1 <- read.table("GCIupdate.txt")
GCI <- rbind(GCI1, GCI)
GCI <- unique(GCI)

write.table(GCI, file = "GCI.txt", sep="\t", quote = FALSE, eol="\r\n", row.names = TRUE)
Peter Hall
  • 53,120
  • 14
  • 139
  • 204
1

Slightly modified from the above solutions... (thanks Shane and Stotastic)

 symbols <- c("MSFT", "C", "MMM")

 # 1. retrieve data

 for(i in seq_along(symbols)) {
   URL <- paste0("http://ichart.finance.yahoo.com/table.csv?s=", symbols[i])
   dat <- read.csv(URL)
   dat$Date <- as.Date(dat$Date, "%Y-%m-%d")
   assign(paste0(symbols[i]," _data"), dat)
   dat <- NULL
 }
Peter Hall
  • 53,120
  • 14
  • 139
  • 204
Manoj Kumar
  • 5,273
  • 1
  • 26
  • 33
1

If your ultimate goal is to get the data.frame of three columns of closing prices, then the new package tidyquant may be better suited for this.

library(tidyquant)

symbols <- c("MSFT", "C", "VIA/B", "MMM")

# Download data in tidy format. 
# Will remove VIA/B and warn you.
data <- tq_get(symbols)

# Ticker symbols as column names for closing prices
data %>% 
    select(.symbol, date, close) %>% 
    spread(key = .symbol, value = close)

This will scale to any number of stocks, so the file of 1000 tickers should work just fine!

Davis Vaughan
  • 2,780
  • 9
  • 19
0

Unfortunately, URL "ichart.finance.yahoo.com" is dead and not working now. As I know, Yahoo closed it and it seems it will not be opened.

Several days ago I found nice alternative (https://eodhistoricaldata.com/) with an API very similar to Yahoo Finance.

Basically, for R-script described above you just need to change this part:

URL <- paste0("ichart.finance.yahoo.com/table.csv?s=", symbols[i])

to this:

URL <- paste0("eodhistoricaldata.com/api/table.csv?s=", symbols[i])

Then add an API key and it will work in the same way as before. I saved a lot of time for my R-scripts on it.

Peter Hall
  • 53,120
  • 14
  • 139
  • 204
0

Maybe give the BatchGetSymbols library a try. What I like about it over quantmod is that you can specify a time period for your data.

library(BatchGetSymbols)

# set dates
first.date <- Sys.Date() - 60
last.date <- Sys.Date()
freq.data <- 'daily'
# set tickers
tickers <- c('FB','MMM','PETR4.SA','abcdef')

l.out <- BatchGetSymbols(tickers = tickers, 
                         first.date = first.date,
                         last.date = last.date, 
                         freq.data = freq.data,
                         cache.folder = file.path(tempdir(), 
                                                  'BGS_Cache') ) # cache in tempdir()
J du Preez
  • 145
  • 9