I am new to R, so please forgive me. I have a tibble called ycd with columns that I want to plot against time, and one column of the tibble holds the corresponding dates. Some columns have NAs up to a certain date because the data is not available. I don't want to na.fill them with zeros; I just want the line for each column to start where data is present and stay empty where it is not. Here is the output of str(ycd):

> str(ycd)
spec_tbl_df [7,808 x 13] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ Date : chr [1:7808] "1/2/1990" "1/3/1990" "1/4/1990" "1/5/1990" ...
 $ 1 Mo : chr [1:7808] "N/A" "N/A" "N/A" "N/A" ...
 $ 2 Mo : chr [1:7808] "N/A" "N/A" "N/A" "N/A" ...
 $ 3 Mo : num [1:7808] 7.83 7.89 7.84 7.79 7.79 7.8 7.75 7.8 7.74 7.89 ...
 $ 6 Mo : num [1:7808] 7.89 7.94 7.9 7.85 7.88 7.82 7.78 7.8 7.81 7.99 ...
 $ 1 Yr : num [1:7808] 7.81 7.85 7.82 7.79 7.81 7.78 7.77 7.77 7.76 7.92 ...
 $ 2 Yr : num [1:7808] 7.87 7.94 7.92 7.9 7.9 7.91 7.91 7.91 7.93 8.1 ...
 $ 3 Yr : num [1:7808] 7.9 7.96 7.93 7.94 7.95 7.94 7.95 7.95 7.98 8.13 ...
 $ 5 Yr : num [1:7808] 7.87 7.92 7.91 7.92 7.92 7.92 7.92 7.94 7.99 8.11 ...
 $ 7 Yr : num [1:7808] 7.98 8.04 8.02 8.03 8.05 8.05 8 8.01 8.07 8.18 ...
 $ 10 Yr: num [1:7808] 7.94 7.99 7.98 7.99 8.02 8.02 8.03 8.04 8.1 8.2 ...
 $ 20 Yr: chr [1:7808] "N/A" "N/A" "N/A" "N/A" ...
 $ 30 Yr: num [1:7808] 8 8.04 8.04 8.06 8.09 8.1 8.11 8.11 8.17 8.25 ...
 - attr(*, "problems")= tibble [1,006 x 5] (S3: tbl_df/tbl/data.frame)
  ..$ row     : int [1:1006] 3035 3036 3037 3038 3039 3040 3041 3042 3043 3044 ...
  ..$ col     : chr [1:1006] "30 Yr" "30 Yr" "30 Yr" "30 Yr" ...
  ..$ expected: chr [1:1006] "a double" "a double" "a double" "a double" ...
  ..$ actual  : chr [1:1006] "N/A" "N/A" "N/A" "N/A" ...
  ..$ file    : chr [1:1006] "'YeildCurve.csv'" "'YeildCurve.csv'" "'YeildCurve.csv'" "'YeildCurve.csv'" ...
 - attr(*, "spec")=
  .. cols(
  ..   Date = col_character(),
  ..   `1 Mo` = col_character(),
  ..   `2 Mo` = col_character(),
  ..   `3 Mo` = col_double(),
  ..   `6 Mo` = col_double(),
  ..   `1 Yr` = col_double(),
  ..   `2 Yr` = col_double(),
  ..   `3 Yr` = col_double(),
  ..   `5 Yr` = col_double(),
  ..   `7 Yr` = col_double(),
  ..   `10 Yr` = col_double(),
  ..   `20 Yr` = col_character(),
  ..   `30 Yr` = col_double()
  .. )

I tried the following, but it did not work.


     ggplot(data=ycd) + geom_point(aes(x=ycd$Date,y=ycd$`1 Mo`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`2 Mo`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`3 Mo`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`6 Mo`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`1 Yr`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`2 Yr`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`3 Yr`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`5 Yr`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`7 Yr`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`10 Yr`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`20 Yr`,size=10)) + 
      geom_point(aes(x=ycd$Date,y=ycd$`30 Yr`,size=10)) + 
      geom_smooth(method = "lm", se=FALSE, color="black")

Any advice? I was thinking about just converting it to a data.frame, but that feels like the easy way out. I really want to learn tibbles!

The data is the table found at https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yield, copied and pasted into an Excel document and read using read_csv().

AaronSzcz
  • 1
    Please post your data in a format we can run code with, paste the output from `dput(ycd)`. – Ricardo Semião e Castro Mar 21 '21 at 22:45
  • 1
    Hi Ricardo, unfortunately my data is much too large. However, I did include the link. All I did was copy and paste that data into an Excel document and call read_csv. I apologize if this question seems low-effort; I assure you that I have put effort into it. – AaronSzcz Mar 21 '21 at 22:54
  • 1
    As a tip, ggplot works much more easily with "long" data than with "wide" data. The typical approach for this problem would be to reshape your data (using `tidyr::pivot_longer` or many alternatives) and then feed that into ggplot like `ggplot(ycd_long, aes(Date, value, color = name)) + geom_point(size = 10) + geom_smooth(aes(group = 1), se = FALSE, color = "black")`. Much shorter for the same result. – Jon Spring Mar 22 '21 at 00:42

2 Answers

Using DF, shown reproducibly in the Note at the end, run the following. Omit facet=NULL if you want a separate panel for each series.

library(zoo)
library(ggplot2)

z <- read.zoo(DF, format = "%m/%d/%Y")
autoplot(z, facet = NULL) + ggtitle("My Series")

Note

DF <- data.frame(Date = c("1/2/1990", "1/3/1990", "1/4/1990", "1/5/1990"),
 a = c(NA, 1:3), b = c(3:1, NA))
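
The str(ycd) output in the question suggests that some columns were read as character because they contain the literal string "N/A". As a rough sketch (DF2 and z2 are made-up names, and the values are toy data, not the Treasury figures), such columns could be converted to numeric before calling read.zoo:

```r
library(zoo)

# Toy data: column a holds "N/A" strings, so it is character rather than numeric
DF2 <- data.frame(Date = c("1/2/1990", "1/3/1990", "1/4/1990", "1/5/1990"),
                  a = c("N/A", "1", "2", "3"),
                  b = c(3, 2, 1, NA))

# Turn "N/A" into a real NA and coerce every non-date column to numeric
DF2[-1] <- lapply(DF2[-1], function(x) as.numeric(replace(x, x %in% "N/A", NA)))

# The first column is used as the index and parsed with the given date format
z2 <- read.zoo(DF2, format = "%m/%d/%Y")
```

z2 can then be passed to autoplot() in the same way as z above.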
G. Grothendieck

As @Jon Spring pointed out, ggplot works better with "long" data, so we apply the tidyr::pivot_longer function:

ycd = tidyr::pivot_longer(ycd, -1, values_to="InterestRates", names_to="Duration")

Here the -1 excludes the first column (the date) from the pivot, so it is kept as an identifier column.
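
One caveat, judging from the str(ycd) output in the question: columns such as `1 Mo` are character (they hold the string "N/A") while others are numeric, and pivot_longer cannot combine character and double columns into a single value column. Date is also stored as text. A minimal sketch of cleaning the types first, using a small made-up tibble (ycd_toy and ycd_long are illustrative names, and the values are invented):

```r
library(dplyr)
library(tidyr)

# Toy stand-in for ycd: one character column with "N/A", one numeric column
ycd_toy <- tibble(
  Date   = c("1/2/1990", "1/3/1990", "1/4/1990"),
  `1 Mo` = c("N/A", "N/A", "7.70"),
  `3 Mo` = c(7.83, 7.89, 7.84)
)

ycd_long <- ycd_toy %>%
  mutate(
    # Parse the month/day/year strings into real dates
    Date = as.Date(Date, format = "%m/%d/%Y"),
    # Replace "N/A" with NA and make every rate column numeric
    across(-Date, ~ as.numeric(na_if(as.character(.x), "N/A")))
  ) %>%
  pivot_longer(-Date, names_to = "Duration", values_to = "InterestRates")
```

Another option would be to re-read the file with "N/A" listed in the na argument of read_csv(), so that the columns are parsed as numbers from the start.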

If I understood correctly, you want to draw points for every group but only one geom_smooth for the whole data; is that correct? If so, specifying color="black" in geom_smooth overrides color=Duration. If you want one geom_smooth per Duration instead, just remove color="black".

ggplot(ycd, aes(x=Date, y=InterestRates, color=Duration)) +
  geom_point(size=10) +
  geom_smooth(method="lm", se=FALSE, color="black")

You can also use different aesthetics (arguments you put inside aes()) for differentiating the points, for example shape=....
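
Since the question asks for lines that only start where data exists, geom_line may be closer to the goal than points. The sketch below assumes long-format data with a real Date column and numeric rates that are NA where no data is available; the data frame and its values are made up for illustration:

```r
library(ggplot2)

# Toy long-format data: "1 Mo" has no values on the first two dates
ycd_long <- data.frame(
  Date = rep(as.Date(c("1990-01-02", "1990-01-03", "1990-01-04", "1990-01-05")),
             times = 2),
  Duration = rep(c("1 Mo", "3 Mo"), each = 4),
  InterestRates = c(NA, NA, 7.70, 7.72, 7.83, 7.89, 7.84, 7.79)
)

# One line per Duration; na.rm = TRUE is there to silence the
# missing-values warning, no zeros are filled in
p <- ggplot(ycd_long, aes(x = Date, y = InterestRates, color = Duration)) +
  geom_line(na.rm = TRUE)

print(p)
```

In this toy example the "1 Mo" line simply begins at its first non-missing date, while "3 Mo" spans the full range.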