I am new to R. I am working with a data series and want to plot multiple lines on one plot, with x = midpoint, using ggplot. I understand that when working with data that contains NA values I should use na.rm=T. I know this kind of data is really annoying to work with, but I am curious to know if there are any quick fixes to the issues I'm having.
My dataset contains:
- a midpoint column (x)
- columns for z_chlA, z_d13C, z_d15N (y)
NOTE: the z_chlA column has no NA values, but z_d15N and z_d13C have many.
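In case it helps anyone reproduce this, here is a rough stand-in for the data. The values are entirely made up and the NA pattern (every other row missing) is only my guess at what matters; the column names follow the list above.

```r
# Hypothetical stand-in for proxydata; every value here is invented.
set.seed(1)
proxydata <- data.frame(
  midpoint = seq(0.5, 29.5, by = 1),
  z_chlA   = rnorm(30),
  z_d15N   = rnorm(30),
  z_d13C   = rnorm(30)
)

# Blank out alternating rows so that no two neighbouring rows
# are both observed in the sparse columns (assumed NA pattern).
proxydata$z_d15N[seq(2, 30, by = 2)] <- NA
proxydata$z_d13C[seq(1, 30, by = 2)] <- NA
```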
The code I am working with:
midpoint <- proxydata$midpoint
z_chlA <- as.numeric(proxydata$z_chlA)
z_d15N <- as.numeric(proxydata$z_d15N, na.rm=TRUE)
z_d13C <- as.numeric(proxydata$z_d13C.corrected, na.rm=TRUE)
ggplot(data=proxydata, x=midpoint, group=names) +
geom_line(aes(x=midpoint, y=z_chlA)) +
geom_line(aes(x=midpoint, y=as.numeric(z_d15N))) +
geom_line(aes(x=midpoint, y=z_d13C))+
scale_x_continuous(breaks=seq(0, 30, by= 2)) + ylim(c(-2, 3))
When I run it without na.rm=TRUE in the geom_line functions, I get the warning message:
Warning messages:
1: Removed 49 row(s) containing missing values (geom_path).
2: Removed 53 row(s) containing missing values (geom_path).
3: Removed 19 row(s) containing missing values (geom_path).
The code makes the plot, but it does not draw any of the series that have NA values in them; those lines are missing from the final plot.
I tried adding na.rm=TRUE in the first part, where I define the variables z_chlA, z_d15N, z_d13C, and I also tried adding it to the geom_line function. That removed the warning messages, but the lines still did not appear. You can see I also tried using as.numeric(), which did not help. It seems that it is not plotting any of the series that have NA values, because the chlA column has no NAs and plots fine. When I use geom_point in place of geom_line, it plots all the variables appropriately, but I would like lines, not points. If I use geom_point + geom_line, it still does not plot the line. I tried playing around with group=names in the ggplot() function, but this did not work for me either, so I am still doing something wrong.
I can just work with a dataset that doesn't have NA values, but I am curious to know if there is a way around this otherwise.
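One idea I have seen suggested for sparse series, which I have not confirmed on my own data, is to drop the NA rows for each series before passing it to geom_line, so the line connects the remaining observations instead of breaking at every gap. A rough sketch on made-up data (the data frame toy, its values, and the colour are hypothetical, and I am assuming ggplot2 is installed):

```r
library(ggplot2)

# Hypothetical data: z_d15N is observed only on every other row.
toy <- data.frame(
  midpoint = 1:10,
  z_chlA   = c(0.2, 0.5, -0.1, 0.8, 1.1, 0.4, -0.3, 0.0, 0.6, 0.9),
  z_d15N   = c(1.0, NA, 0.4, NA, -0.2, NA, 0.7, NA, 1.3, NA)
)

# Keep only the rows where z_d15N was actually measured.
d15N_obs <- toy[!is.na(toy$z_d15N), ]

p <- ggplot(toy, aes(x = midpoint)) +
  geom_line(aes(y = z_chlA)) +
  # This layer gets its own NA-free data, so consecutive
  # observations are joined across the gaps.
  geom_line(data = d15N_obs, aes(y = z_d15N), colour = "red")

print(p)
```

If this is the right direction, I would presumably need one filtered data frame per sparse column, but I do not know whether that is the idiomatic way to do it.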