
I am trying to remove a range of the x-axis from a ggplot. My x variable encodes years and weeks:

202045: year 2020 week 45

202053: last week of 2020 (a year has 52 or 53 weeks, never more)

 summary(df$year_week)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 202045  202047  202050  202054  202052  202101

Unfortunately my data "jump" from the last week of 2020 to the first week of 2021, and the x-axis displays "ghost" weeks in between. Example:

library(ggplot2)

# 10 year-week codes; data.frame() recycles them to match the 200 case values
year_week=rep(c(202045,202046,202047,202048,202049,202050,202051,202052,202053,202101),times=1)
cases=rnorm(200, 44, 33)
df=data.frame(year_week, cases)

ggplot(df, aes(x=year_week, y=cases))+
geom_line()+
theme(axis.text.x = element_text(angle = 45,  
    hjust = 0.85, size=9))+
scale_x_continuous(limits=c(202045, 202101))

graph1

I tried setting those weeks to NA and removing them, but the result is the same:

df$year_week[df$year_week>202053 & df$year_week<202101]= NA
df$cases[df$year_week>202053 & df$year_week<202101]= NA

ggplot(na.omit(df), aes(x=year_week, y=cases))+
geom_line()+
theme(axis.text.x = element_text(angle = 45,  
    hjust = 0.85, size=9))+
scale_x_continuous(limits=c(202045, 202101))

library(dplyr)  # provides %>% and filter()
df %>%
filter(!is.na(cases)) %>%
ggplot(aes(x=year_week, y=cases))+
geom_line()+
theme(axis.text.x = element_text(angle = 45,  
    hjust = 0.85, size=9))+
scale_x_continuous(limits=c(202045, 202101))

My expected graph is the following (weeks 60 or 80 do not exist in any year):

Graph expected

2 Answers

0

You can make two separate plots, one for the weeks before 2021 and another starting with 2021, and put them next to each other with a small margin using a facet. I think that achieves your goal without potentially confusing your audience with an arbitrary jump in the x-axis labels.

Maybe something like this:

library(dplyr)
library(ggplot2)

df %>% 
  mutate(
    period = case_when(
      year_week < 202101 ~ "Before 2021",
      year_week >= 202101 ~ "From 2021"
    ),
    period = factor(
      period, 
      levels = c("Before 2021", "From 2021"), 
      ordered = TRUE
    )
  ) %>% 
  ggplot() +
  geom_line(
    aes(
      year_week,
      cases
    )
  ) +
  facet_wrap(
    ~period,
    ncol = 2, 
    scales = "free_x"
  )+
  theme(axis.text.x = element_text(angle = 45,  
                                   hjust = 0.85, size=9))

Another issue, not directly related to your question, is that you are plotting multiple y-values for each x-axis value, which results in unsightly vertical lines connected by harsh diagonal lines when using geom_line.
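
The overplotting noted above could be avoided by collapsing the data to one value per week before drawing the line. A minimal sketch, assuming the weekly mean is the summary you want (the post does not say which summary is appropriate); `weekly` and the `set.seed()` call are illustrative additions, not part of the original code:

```r
# Rebuild the example data: 10 week codes x 20 simulated values each
set.seed(42)  # assumption: fixed seed only to make the sketch reproducible
df <- data.frame(
  year_week = rep(c(202045:202053, 202101), times = 20),
  cases     = rnorm(200, 44, 33)
)

# Collapse to one row per week (the mean is an assumed choice of summary)
weekly <- aggregate(cases ~ year_week, data = df, FUN = mean)

# Plot only if ggplot2 is installed; geom_line() now gets one y per x
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(weekly, aes(x = year_week, y = cases)) +
    geom_line()
  print(p)
}
```

This only removes the vertical strokes; with a numeric x the gap between 2020 and 2021 is still there.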

  • Thanks. A year has at most 53 weeks, so a jump in the x-axis labels cannot confuse my audience. Sorry, maybe my y data was not the best... – Rodrigo Badilla Jan 10 '21 at 18:24
  • Yes, you have 20 values for each week, which does not work well with geom_line. If you like my answer solving your x-axis issue, go ahead and give it a green checkmark. – Kris Williams Jan 10 '21 at 18:28
  • I will wait for other, better solutions. Thanks for your time. – Rodrigo Badilla Jan 10 '21 at 18:30
0

The issue is that your year_week variable is numeric. Because the weeks stop at 52 (or 53), e.g. 202052, the numeric axis contains a gap of 48 = 202101 - 202052 - 1 unused values before the first week of the next year. You can prevent that by converting year_week to a character using as.character. Or you can do some formatting, e.g. split the year and week and put a hyphen, space, ... in between, like I do in my code:

Note: When converting to a character you have to set the group aesthetic (e.g. group = 1); otherwise geom_line groups by each discrete x value and does not connect the weeks.

year_week=rep(c(202045,202046,202047,202048,202049,202050,202051,202052,202053,202101),times=1)
cases=rnorm(200, 44, 33)
df=data.frame(year_week, cases)

df$year_week <- paste(substr(df$year_week, 1, 4), substr(df$year_week, 5, 6), sep = "-")

library(ggplot2)
ggplot(df, aes(x=year_week, y=cases, group = 1))+
  geom_line()+
  theme(axis.text.x = element_text(angle = 45,  
                                   hjust = 0.85, size=9))
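
To see why the numeric axis leaves a hole and why a discrete axis does not, the week codes can be inspected directly. A small sketch in base R; `codes`, `labels`, and `x_discrete` are names made up for this illustration, and the explicit factor levels are an extra precaution rather than part of the code above:

```r
codes <- c(202045:202053, 202101)

# Consecutive differences on the numeric scale: the last step is 48, not 1
gaps <- diff(codes)
print(gaps)

# Same "YYYY-WW" labels as above, built from the character form of the codes
labels <- paste(substr(as.character(codes), 1, 4),
                substr(as.character(codes), 5, 6), sep = "-")

# A factor with levels in order of appearance keeps the chronological order
x_discrete <- factor(labels, levels = unique(labels))

# On a discrete axis each level occupies one evenly spaced position
print(as.integer(x_discrete))
```

Zero-padded "YYYY-WW" strings also sort chronologically on their own, so plain character values work here; fixing the levels is only a safeguard if the labels are ever formatted differently.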

stefan