You have several issues here that fall squarely under "data wrangling". The biggest one is imputing actual values into your missing-value fields.
Luckily, the `xts` time series library contains functions to do this, as well as a function to plot multiple time series, which is your ultimate goal.
However, before you can use those functions, you will have to do some work transforming your data into an `xts` object.

First, recreate your data above using the method of @aelwan.
```{r, tidy=TRUE}
df <- read.table(text = "
CompanyA NA 1000 NA NA NA 1000
CompanyB 600 NA NA NA 600 NA
CompanyC NA 5000 NA 5000 NA NA",
header = FALSE)
colnames(df) <- c("CompanyName", "2001-01", "2001-02", "2001-03", "2001-04", "2001-05", "2001-06")
df
```

```
  CompanyName 2001-01 2001-02 2001-03 2001-04 2001-05 2001-06
1    CompanyA      NA    1000      NA      NA      NA    1000
2    CompanyB     600      NA      NA      NA     600      NA
3    CompanyC      NA    5000      NA    5000      NA      NA
```

Your data appears to be in wide format, so I would suggest transposing it to long format, with one row per date and one column per company. This takes a few steps if you want to retain important information such as the column and row names, as well as the class of your data (numeric).

First, transpose the data frame.

```{r}
df_t <- t(df)
```

Now save the first row, which contains the company names after the transpose.

```{r}
company_names <- df_t[1, ]
```

The transpose results in an object of class `matrix`. Drop the first row and convert `df_t` to class `data.frame`.

```{r}
df_t <- data.frame(df_t[-1, ], stringsAsFactors = FALSE)
```

Add the company names stored in `company_names` back as the column names.

```{r}
colnames(df_t) <- company_names
```

Because the company names were part of the transposed matrix, every value was coerced to character, so convert all columns back to class numeric with the `sapply` function.

```{r}
df_long <- data.frame(sapply(df_t, FUN = as.numeric), row.names = rownames(df_t))
```
```{r}
# print the long form results
df_long
```

```
        CompanyA CompanyB CompanyC
2001-01       NA      600       NA
2001-02     1000       NA     5000
2001-03       NA       NA       NA
2001-04       NA       NA     5000
2001-05       NA      600       NA
2001-06     1000       NA       NA
```

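If the round trip through character values feels heavy, the same reshape can be sketched more compactly by dropping the name column before transposing, so the values stay numeric throughout. This is only an alternative sketch, not a replacement for the steps above. The data frame is rebuilt by hand so the block runs on its own, and `df_long_alt` is a hypothetical name.

```r
# Rebuild the example data by hand (same values as in the question).
df <- data.frame(
  CompanyName = c("CompanyA", "CompanyB", "CompanyC"),
  `2001-01` = c(NA, 600, NA),
  `2001-02` = c(1000, NA, 5000),
  `2001-03` = rep(NA_real_, 3),
  `2001-04` = c(NA, NA, 5000),
  `2001-05` = c(NA, 600, NA),
  `2001-06` = c(1000, NA, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Drop the name column first, so as.matrix() yields a numeric matrix,
# then transpose: dates become rows, companies become columns.
m <- t(as.matrix(df[-1]))
colnames(m) <- df$CompanyName

# df_long_alt is a hypothetical name for the reshaped result.
df_long_alt <- as.data.frame(m)
str(df_long_alt)
```

Because the company names never enter the matrix, no `as.numeric` conversion is needed afterwards.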
Now convert your new `df_long` data.frame into a time-indexed `xts` object to access the time series functions you need.

```{r}
library(xts)
# convert rownames "2001-01, 2001-02, ..." to yearmon format
# (as.yearmon comes from zoo, which is attached along with xts)
rownames(df_long) <- as.yearmon(rownames(df_long), "%Y-%m")
# pass the dates as an index to the xts via the `order.by` argument.
df_xts <- xts(df_long, order.by = as.yearmon(rownames(df_long)))
```
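As a quick sanity check, it can help to confirm that the index really is a `yearmon` index before going further. The sketch below assumes a two-row stand-in for `df_long` with its original "2001-01"-style row names, and builds the index in one step instead of rewriting the row names first. The small data frame and the name `idx` are hypothetical.

```r
library(xts)  # also attaches zoo, which provides as.yearmon()

# Hypothetical two-row stand-in for df_long.
df_long <- data.frame(
  CompanyA = c(NA, 1000),
  CompanyB = c(600, NA),
  row.names = c("2001-01", "2001-02")
)

# Parse the row names straight into a yearmon index.
idx <- as.yearmon(rownames(df_long), "%Y-%m")
df_xts <- xts(df_long, order.by = idx)

# Inspect the index class and the resulting object.
class(index(df_xts))
df_xts
```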
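Finally, we can use the "Last Observation Carried Forward" function, `na.locf`, which becomes available once `xts` is loaded (it is a `zoo` generic with an `xts` method), to fill in the missing values.

```{r}
df_locf <- na.locf(df_xts)
df_locf
```

```
         CompanyA CompanyB CompanyC
Jan 2001       NA      600       NA
Feb 2001     1000      600     5000
Mar 2001     1000      600     5000
Apr 2001     1000      600     5000
May 2001     1000      600     5000
Jun 2001     1000      600     5000
```

Note that values before a company's first observation stay `NA`, since there is nothing to carry forward yet.
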
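If the leading `NA` values in January are a problem for you, one possible approach is a second pass with `fromLast = TRUE`, which carries the next observation backward. Whether backfilling is appropriate depends on what your data means, so treat this as a sketch on a small hypothetical series rather than a recommendation. The names `x`, `filled`, and `filled_both` are made up for the example.

```r
library(xts)

# Hypothetical three-month series with a leading NA in CompanyA.
idx <- as.yearmon(c("2001-01", "2001-02", "2001-03"), "%Y-%m")
x <- xts(
  cbind(CompanyA = c(NA, 1000, NA), CompanyB = c(600, NA, NA)),
  order.by = idx
)

# First pass: carry the last observation forward.
filled <- na.locf(x)

# Second pass: carry the next observation backward to fill leading NAs.
filled_both <- na.locf(filled, fromLast = TRUE)
filled_both
```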
Calling the `plot` function on an object of class `xts` produces a multivariate time series plot with no extra work.

```{r}
# The plot function works.
plot(df_locf)
```
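Since the companies sit on quite different scales (600 versus 5000), a single shared axis can flatten the smaller series. As a sketch, `plot.xts` accepts `multi.panel` and `yaxis.same` arguments that draw one panel per column, each with its own y-axis. The data below is a hypothetical stand-in with made-up values, not your filled series.

```r
library(xts)

# Hypothetical stand-in data on two different scales.
idx <- as.yearmon(c("2001-01", "2001-02", "2001-03"), "%Y-%m")
x <- xts(
  cbind(CompanyB = c(600, 600, 650), CompanyC = c(5000, 5000, 5200)),
  order.by = idx
)

# One panel per company, each with an independent y-axis.
plot(x, multi.panel = TRUE, yaxis.same = FALSE, main = "Carried-forward values")
```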
