So, the issue is as follows: I have a dataset which contains

  1. A Condition factor variable with (for this example) 3 levels that need to be plotted on a y axis,
  2. A Group factor variable with three levels to be plotted on the x, and
  3. A value for each group at every condition (example data below).

The three levels on the x axis indicate conditions, and I would like to display the observations at each level on the y axis as a violin plot. I know ggplot needs a numeric variable on the y axis to draw a violin, but I cannot find a way to nest the specific values (which will change from experiment to experiment) as the y value at each x condition. So far (after receiving prior help here) I have formatted the data into a data frame and melted it into long format for ggplot.

Example data below:

Condition  Observation  Value
1          A            11
1          B            7
1          C            2
2          A            21
2          B            2
2          C            5
3          A            16
3          B            45
3          C            34

EDIT:

library(ggplot2)

SampleA <- c(3, 7, 9)
SampleB <- c(15, 23, 33)
SampleC <- c(21, 19, 12)
Observations <- c("Observation 1", "Observation 2", "Observation 3")
df0 <- data.frame(Observations = as.factor(Observations), SampleA, SampleB, SampleC)

# Melt to long format: one row per Observation/Sample pair
df0 <- reshape2::melt(df0, id.vars = "Observations")

  • The answer to the question in your title is “yes, using `as.character`” but I can’t figure out how that relates to your question body, and unfortunately from your description I don’t understand how you actually want to plot the data. Unrelatedly, it would be nice if you could paste the example data here in a *usable format*, ideally via `dput`. – Konrad Rudolph Apr 16 '20 at 17:33
  • It's not that violin plots need a `numeric` so much as violin plots need something relatively *continuous*. Why not use a chart type that works well with discrete data on both axes? Maybe a heatmap: `ggplot(your_data, aes(x = Observation, y = Condition, fill = Value)) + geom_tile()` – Gregor Thomas Apr 16 '20 at 17:34
  • Agree with [u/Gregor Thomas](https://stackoverflow.com/users/903061/gregor-thomas). You have data which is ordinal (meaning it has an order to it), but violin plots need data which is continuous. Does it make sense that you can have something "about 30% between Condition A and B" in the above example? If not, then you cannot force this variable to "be" continuous and I think `geom_tile()` is an excellent option here to explore. – chemdork123 Apr 16 '20 at 18:00
  • Thank you all. In an attempt to clarify further, I've constructed a very small sample data frame above which gets you to what I'm working with. I think the heatmap option would be a great visualization tool, and one that I am going to explore. What I'm still wondering is whether there is any way to show the relative density of each observation for comparison between groups. As a thought exercise, think of each observation as a count of the number of times an event has occurred for each variable, where I want to display the density of each read. – user13305413 Apr 16 '20 at 18:37
  • You can normalize by observation (divide by the total number within that observation) and do a heatmap of that. – Gregor Thomas Apr 16 '20 at 19:34

1 Answer

I'd suggest something like this:

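library(dplyr)
library(ggplot2)

# Normalize within each observation so the shares sum to 1
df0 = df0 %>%
  group_by(Observations) %>%
  mutate(norm_value = value / sum(value)) %>%
  ungroup()

# Heatmap of the shares, with each tile labeled as a percentage
ggplot(df0, aes(x = Observations, y = variable, fill = norm_value)) +
  geom_tile() +
  geom_label(aes(label = scales::percent(norm_value)), fill = "gray80") +
  guides(fill = "none") +  # `fill = F` is deprecated in recent ggplot2
  coord_equal() +
  labs(x = "", y = "") +
  theme_minimal()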

If you have a lot of data, I'd remove the individual labels and rely on the color scale, but with this few points direct labels seem clearest.
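
For the larger-data case, a minimal sketch of the label-free variant might look like the following. It rebuilds the question's example data in long format using base R only, so it runs on its own. The viridis palette and the legend title are my own assumptions for illustration, not part of the original code.

```r
library(ggplot2)

# Example data from the question, already in long format
# (same rows that reshape2::melt() produces above).
df0 <- data.frame(
  Observations = factor(rep(paste("Observation", 1:3), times = 3)),
  variable     = factor(rep(c("SampleA", "SampleB", "SampleC"), each = 3)),
  value        = c(3, 7, 9, 15, 23, 33, 21, 19, 12)
)

# Share of each observation's total contributed by each sample (base R).
df0$norm_value <- df0$value / ave(df0$value, df0$Observations, FUN = sum)

# No per-tile labels; the fill legend carries the scale instead.
# Palette and legend title are illustrative choices.
p <- ggplot(df0, aes(x = Observations, y = variable, fill = norm_value)) +
  geom_tile() +
  scale_fill_viridis_c(labels = scales::percent, name = "Share of observation") +
  coord_equal() +
  labs(x = NULL, y = NULL) +
  theme_minimal()
p
```

Here the legend is kept rather than hidden, since without labels it is the only way to read values off the tiles.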

Gregor Thomas