
I'd like to plot panel data (included below via dput()). I want every quarter in the data to be shown as its own tick on the x-axis, instead of the automatically chosen breaks that ggplot uses by default. So I use:

    ggplot(data = industry_risk_exposure_covid, aes(x=Quarter, y=Risk_exposure)) +
      geom_line(aes(colour=industry)) + 
      scale_x_continuous(labels = as.character(Quarter), breaks=Quarter)

However, this runs into problems on the x-axis: because this is panel data, the Quarter values repeat (once per industry). How do I solve this?

    structure(list(Quarter = c(2019.3, 2019.4, 2020.1, 2020.2, 2020.3, 
    2020.4, 2021.1, 2019.3, 2019.4, 2020.1, 2020.2, 2020.3, 2020.4, 
    2021.1, 2019.3, 2019.4, 2020.1, 2020.2, 2020.3, 2020.4, 2021.1
    ), Risk_exposure = c(0.931366310178586, 0.790048218420605, 0.926209779967134, 
    0.948074080058149, 0.977557012231547, 0.798212439797712, 1.37986128229538, 
    0.643837908999298, 0.589151809903988, 0.560347370890284, 0.610139698052225, 
    0.594840529383872, 0.625698251450734, 0.647346199698159, 0.672295261900964, 
    0.661018891645603, 1.12339028625562, 0.882948576808631, 0.706404636299307, 
    0.929349206317779, 0.574016070848228), industry = c("Retail", 
    "Retail", "Retail", "Retail", "Retail", "Retail", "Retail", "Consumer services", 
    "Consumer services", "Consumer services", "Consumer services", 
    "Consumer services", "Consumer services", "Consumer services", 
    "Food beverage Tabacco", "Food beverage Tabacco", "Food beverage Tabacco", 
    "Food beverage Tabacco", "Food beverage Tabacco", "Food beverage Tabacco", 
    "Food beverage Tabacco")), class = "data.frame", row.names = c(NA, 
    -21L))
Phil
  • Change your Quarter variable to a factor: `yourdata$Quarter <- forcats::fct_inorder(as.character(yourdata$Quarter))` and then run your ggplot2 code without the `scale_x_continuous()` – Phil Dec 18 '22 at 14:08
  • (1) It appears you have an object `Quarter` in your environment, since the two references to it in `scale_x_continuous` do not use NSE to find it in the data. While often this is not a big problem, it can be if your `industry*` object has been updated since you used `Quarter` to create it. (2) *"Each individual date"*, can you explain what you mean here? Your data does not have any dates, so do you mean that (say) `"2020.1"` should be shown as `"2020-01-01"`? – r2evans Dec 18 '22 at 15:05
  • @Phil It works for the x_axis however sadly the graph isn't displayed anymore. That is the graph is empty. – bernard202210 Dec 18 '22 at 19:05
  • @r2evans I mean on the x-axis just the quarterly data as in the dataframe, 2019.3, 2020.1, 2021.2 etc.. – bernard202210 Dec 18 '22 at 19:13
  • `ggplot(data = industry_risk_exposure_covid, aes(x=Quarter, y=Risk_exposure)) + geom_line(aes(colour=industry)) + scale_color_manual(values = c("darkred", "steelblue", "green")) + geom_vline(xintercept = 2020.1, linetype="dashed", color = "red") + theme(axis.text.x = element_text(angle = 30, vjust = 0.5))` is the code for the ggplot I used. – bernard202210 Dec 18 '22 at 19:15
  • Long code does not do well in comments for many reasons, and your question should always contain the complete and up-to-date code. Please edit your question to add that code there. – r2evans Dec 18 '22 at 19:16

1 Answer


Here are two options:

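  1. Change Quarter to be character. This requires some preprocessing as well as adding group= to the geom_line: on a discrete x-axis, ggplot2 otherwise puts each Quarter/industry combination in its own group, so every group has a single point and no lines are drawn (which is why the plot came out empty).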

    transform(industry_risk_exposure_covid, Quarter = as.character(Quarter)) |>
      ggplot(aes(x=Quarter, y=Risk_exposure)) +
      geom_line(aes(colour=industry, group=industry)) +
      scale_color_manual(values = c("darkred", "steelblue", "green")) +
      geom_vline(xintercept = "2020.1", linetype="dashed", color = "red") + # x is discrete now, so use the string
      theme(axis.text.x = element_text(angle = 30, vjust = 0.5))
    

    ggplot2 with x-axis changed to strings

    The ordering of categorical axis labels is controlled either by lexicographic (alphabetic) ordering of the strings or by the levels within factors. We have strings, and fortunately these particular strings sort chronologically when sorted alphabetically, so we don't need to mess with factors (though they would work equally well).

  2. Extension: break out year and quarter:

    transform(industry_risk_exposure_covid, Quarter = as.character(Quarter)) |>
      ggplot(aes(x=Quarter, y=Risk_exposure)) +
      geom_line(aes(colour=industry, group=industry)) +
      scale_color_manual(values = c("darkred", "steelblue", "green")) +
      geom_vline(xintercept = "2020.1", linetype="dashed", color = "red") + # x is discrete now, so use the string
      # theme(axis.text.x = element_text(angle = 30, vjust = 0.5)) +
      scale_x_discrete(labels = function(z) ifelse(grepl("1$", z), sub("\\.", "\n", z), sub(".*\\.", "\n", z)))
    

    ggplot2 with multi-level labels

    Note that I commented out the angle for the labels; that's because I inferred the angling was only there because of the number and spacing of the labels. It's not required, and it's not a problem either way.

    The ordering of them is still safe, since we don't actually modify the Quarter values until we generate labels, at which point their order is pre-determined.

    There is the option to left-align them (so that the "2" of "2020" is over the quarter number, instead of "2020" centered) by using

    + theme(axis.text.x = element_text(hjust = 0.1))
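Both options lean on two small facts that can be checked without drawing anything. The sketch below uses base R only; `quarter_labels` is just a hypothetical name given here to the anonymous function passed to `scale_x_discrete(labels = ...)` in option 2, and the `quarters` vector is one industry's worth of the Quarter values from the question.

```r
# Quarter values as they appear in the question's data (one industry's worth).
quarters <- c(2019.3, 2019.4, 2020.1, 2020.2, 2020.3, 2020.4, 2021.1)
quarter_chr <- as.character(quarters)

# Alphabetical order of these strings matches chronological order,
# which is why option 1 gets away without factor levels.
sorted_ok <- identical(sort(quarter_chr), quarter_chr)

# Hypothetical name for the labelling function used in option 2:
# first quarters keep the year on the top line, other quarters
# get an empty top line and only the quarter number below it.
quarter_labels <- function(z) {
  ifelse(grepl("1$", z), sub("\\.", "\n", z), sub(".*\\.", "\n", z))
}

labels <- quarter_labels(quarter_chr)
print(sorted_ok)
print(labels)
```

If the values did not happen to sort this way (for example month names), converting to a factor with `factor(x, levels = unique(x))` would pin the order to the order of first appearance instead.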
    
r2evans