I apologize if this question was poorly worded, but after hours of searching the web I feel confident in saying this question has not been answered previously. I will do my best to describe in detail exactly what this problem entails.
Data-set summary:
The data being used is financial data (Open, High, Low, Close) that was retrieved with Python code and stored in individual CSV files. The files were then read into R and stored using lapply. To keep things simple, all I am focusing on currently is daily percentage change, or (Close/shift(Close))-1. For purposes of this problem, I have removed all NAs as well as incomplete tickers from the data.
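For reference, the percentage-change calculation can be sketched in R roughly as follows. The prices here are made-up illustration values, not my real data, and the variable names are hypothetical; the original calculation was done in Python with shift().

```r
# Hypothetical closing prices for one ticker (illustration only)
close <- c(100, 102, 101, 105)

# Daily percentage change: today's close divided by the previous close, minus 1.
# Dropping the first element gives "today"; dropping the last gives "yesterday".
pct_chg <- close[-1] / close[-length(close)] - 1
```

The result has one fewer element than the price vector, because the first day has no previous close to compare against.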
I have a data frame (converted from a list) of 98 columns (the tickers), spanning 1000 rows (the days). The values within the data frame/matrix are the daily percentage changes for each ticker, on each day.
Objective:
I want to know how to apply lm() to each column in turn by dynamically referencing the column name as the response, using ALL other columns (~ .) as predictors.
Sample data set:
aapl_pct_chg <- c(.02, .03, .01, -.05, -.01)
tmus_pct_chg <- c(-.01, -.02, .05, .01, -.03)
akam_pct_chg <- c(.1, -.2, .3, -.03, -.07)
intc_pct_chg <- c(.01, .03, .02, .01, .1)
de_pct_chg <- c(-.01, -.05, .05, .1, -.03)
df <- as.data.frame(cbind(aapl_pct_chg, tmus_pct_chg, akam_pct_chg, intc_pct_chg, de_pct_chg))
names(df) <- c("AAPL", "TMUS", "AKAM", "INTC", "DE")
It is simple enough to do the following:
lm_aapl <- lm(AAPL ~ ., data=df)
But I have been unable to find a way to DYNAMICALLY reference the column name without running into errors. What I mean by this is that, ideally, I could run one formula that will capture the lm() model on each column, using every other column.
There are some answered questions that have HELPED (and I apologize, I am unorganized and have tried this in 500 different ways), but none that have solved it. The closest I have come is a formula that does what I want, but it will include AAPL's values when predicting AAPL -- which leads to a good model but not what I want.
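To make the goal concrete, here is a minimal sketch of the kind of result I am after, using the sample data above. It assumes that building each formula with reformulate() and looping with lapply() is an acceptable approach; the name models is hypothetical, and this is an illustration of the intent rather than a confirmed solution for the full 98-column data.

```r
# Sample data from the question, built directly as a data frame
df <- data.frame(
  AAPL = c(.02, .03, .01, -.05, -.01),
  TMUS = c(-.01, -.02, .05, .01, -.03),
  AKAM = c(.1, -.2, .3, -.03, -.07),
  INTC = c(.01, .03, .02, .01, .1),
  DE   = c(-.01, -.05, .05, .1, -.03)
)

# Fit one model per column. For each ticker, the formula is `<ticker> ~ .`,
# so lm() expands `.` to every column of df except the response itself.
models <- lapply(setNames(names(df), names(df)), function(ticker) {
  lm(reformulate(".", response = ticker), data = df)
})

# Inspect the predictors used for AAPL; AAPL itself should not be among them.
names(coef(models$AAPL))
```

With only 5 rows and 4 predictors plus an intercept, each sample model is saturated, so the fit itself is not meaningful; the point is only that the response column is excluded from its own predictors.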