0

I'm struggling to download the stock price in R.

Here are my questions.

  1. I used getMasterIndex to download the MasterIndex of all SEC filings and filter 10-k reports (I only need 10-k reports for analysis)
  2. I need to use the company.name from the MasterIndex to download the stock price for each company at the previous day of filing date (which means I need to get the stock price for companies at different dates). But I failed by using the getSymbol function since it requires tickers instead of company names. So I need to first transfer the company.name to tickers? and then download the stock price based on the corresponding (filing date-1)?

But is there any way to directly download the stock price from the company.name of MasterIndex? I attached the previous code and the getsymbol code below. Please help me out of this problem. Much thanks!

master_indexes <- list.files("Master Indexes/",pattern="Rda")
all_indexes <- data.frame()

for(master_index in master_indexes){
  load(paste0("Master Indexes/",master_index))
  this_index <- year.master 
  all_indexes <- bind_rows(all_indexes,this_index)
  print(master_index)
}

head(all_indexes) 
company_filing_info <- all_indexes %>% 
  filter(form.type =="10-K") 

companies <- company_filing_info$company.name

library(tidyquant)

for (i in companies) {
  getSymbols(companies, 
           src = "yahoo",
           auto.assign = TRUE)
}

  • Are you able to obtain the tickers somehow? – Chamkrai Jun 28 '22 at 22:26
  • Hi Tom, I tried to obtain tickers and matched with the original company names. But it always fails so I wonder is there any other way to get stock prices? Thanks! – Tiffany Guo Jun 29 '22 at 08:59
  • I do not know of any other methods, sorry. But I have a dataframe with company names and tickers on it, `https://drive.google.com/file/d/1TRX6Q2rwze55qbrg9cuDGS34Mt1IXSEg/view?usp=sharing`. Perhaps you can merge it with this one? If not, can you share `dput(head(df))` of yours – Chamkrai Jun 29 '22 at 12:17
  • Hi Tom, I tried this code but it keeps reporting error. I posted a new question about this. If you have time, could you please check it because I think you might know how to solve this error. Thank you so much for providing the stock tickers file!!! – Tiffany Guo Jul 21 '22 at 10:02

1 Answers1

0

A possible solution. The issue is that the way edgar output company names is different from others for some reason.

My example data with SEC Filings in 2022

edgar::getMasterIndex(2022, "myemail-2022@gmail.com")
load("Master Indexes/2022master.Rda")

df <- year.master %>%  
  as_tibble() %>% 
  filter(form.type == "10-K") %>%  
  mutate(company.name = company.name %>% 
           str_squish())

# A tibble: 6,906 x 6
   cik     company.name         form.type date.filed edgar.link quarter
   <chr>   <chr>                <chr>     <chr>      <chr>        <int>
 1 1000209 MEDALLION FINANCIAL~ 10-K      2022-03-14 edgar/dat~       1
 2 1000228 HENRY SCHEIN INC     10-K      2022-02-15 edgar/dat~       1
 3 1000229 CORE LABORATORIES N~ 10-K      2022-02-10 edgar/dat~       1
 4 1000298 IMPAC MORTGAGE HOLD~ 10-K      2022-03-11 edgar/dat~       1
 5 1000623 SCHWEITZER MAUDUIT ~ 10-K      2022-03-01 edgar/dat~       1
 6 1000683 BLONDER TONGUE LABO~ 10-K      2022-03-31 edgar/dat~       1
 7 1000694 NOVAVAX INC          10-K      2022-03-01 edgar/dat~       1
 8 1000697 WATERS CORP DE       10-K      2022-02-24 edgar/dat~       1
 9 1000753 INSPERITY INC        10-K      2022-02-11 edgar/dat~       1
10 1001082 DISH Network CORP    10-K      2022-02-24 edgar/dat~       1
# ... with 6,896 more rows

Now, download the .Rdata from my Drive will give you a data frame of stocks, which has company names and their respective tickers

# A tibble: 6,255 x 4
   symbol company_name                industry               market_cap
   <chr>  <chr>                       <chr>                  <chr>     
 1 A      Agilent Technologies        Life Sciences Tools &~ 34.97B    
 2 AA     Alcoa                       Metals & Mining        9.08B     
 3 AAC    Ares Acquisition            Blank Check / SPAC     1.23B     
 4 AACG   ATA Creativity Global       Diversified Consumer ~ 35.23M    
 5 AACI   Armada Acquisition I        Blank Check / SPAC     204.51M   
 6 AADI   Aadi Bioscience             Biotechnology          249.63M   
 7 AAIC   Arlington Asset Investment  Mortgage Real Estate ~ 96.11M    
 8 AAL    American Airlines Group     Airlines               8.77B     
 9 AAMC   Altisource Asset Management Real Estate Managemen~ 22.16M    
10 AAME   Atlantic American           Insurance              55.09M    
# ... with 6,245 more rows

Merge stocks with df

stocks %>%  
  merge(., df, by.x = "company_name", by.y = "company.name") %>% 
  as_tibble()

# A tibble: 101 x 9
   company_name   symbol industry market_cap cik   form.type date.filed
   <chr>          <chr>  <chr>    <chr>      <chr> <chr>     <chr>     
 1 American Home~ AMH    Equity ~ 14.18B     1562~ 10-K      2022-02-25
 2 Ameris Bancorp ABCB   Banks    2.86B      3515~ 10-K      2022-02-28
 3 Aon plc        AON    Insuran~ 56.42B     3152~ 10-K      2022-02-18
 4 Apollo Strate~ APGB   Shell C~ 848.70M    1838~ 10-K      2022-03-17
 5 Bank of Marin~ BMRC   Banks    515.27M    1403~ 10-K      2022-03-15
 6 Beam Global    BEEM   Electri~ 149.82M    1398~ 10-K      2022-03-31
 7 Blue Foundry ~ BLFY   Banks-R~ 347.40M    1846~ 10-K      2022-03-14
 8 California Ba~ CALB   Banks    158.12M    1752~ 10-K      2022-03-23
 9 CC Neuberger ~ PRPB   Shell C~ 1.08B      1812~ 10-K      2022-03-01
10 CC Neuberger ~ PRPC   Blank C~ 544.28M    1821~ 10-K      2022-03-31
# ... with 91 more rows, and 2 more variables: edgar.link <chr>,
#   quarter <int>

library(tidyquant)

final_df <- merged_df %>%  
  mutate(date.filed = date.filed %>% as.Date(), 
         price_before_filing = map2(.x = symbol, 
                                    .y = date.filed, 
                                    ~ tq_get(.x, from = as.Date(.y) - 1) %>% 
                                      slice(1) %>% 
                                      pull(adjusted)) %>% 
           as.numeric()) %>% 
  select(symbol, company_name, price_before_filing, everything())

# A tibble: 101 x 10
   symbol company_name       price_before_fi~ industry market_cap cik  
   <chr>  <chr>                         <dbl> <chr>    <chr>      <chr>
 1 AMH    American Homes 4 ~            37.4  Equity ~ 14.18B     1562~
 2 ABCB   Ameris Bancorp                49.3  Banks    2.86B      3515~
 3 AON    Aon plc                      281.   Insuran~ 56.42B     3152~
 4 APGB   Apollo Strategic ~             9.75 Shell C~ 848.70M    1838~
 5 BMRC   Bank of Marin Ban~            35.1  Banks    515.27M    1403~
 6 BEEM   Beam Global                   18.4  Electri~ 149.82M    1398~
 7 BLFY   Blue Foundry Banc~            13.6  Banks-R~ 347.40M    1846~
 8 CALB   California BanCorp            22.4  Banks    158.12M    1752~
 9 PRPB   CC Neuberger Prin~             9.89 Shell C~ 1.08B      1812~
10 PRPC   CC Neuberger Prin~             9.85 Blank C~ 544.28M    1821~
# ... with 91 more rows, and 4 more variables: form.type <chr>,
#   date.filed <date>, edgar.link <chr>, quarter <int>
Chamkrai
  • 5,912
  • 1
  • 4
  • 14