
I have some data here [in a .txt file] which I read into a data frame,

mydf <- read.table("data.txt", header = TRUE, sep = "\t")

I melt this data frame next using the following piece of code,

df_mlt <- melt(mydf, id = names(mydf)[1], variable = "cols")

Now I would like to plot this data as a boxplot displaying only values of x > 0. For this I use the following code,

plt_bx <-   ggplot(df_mlt, aes(x=ID1,y=value>0, color=cols))+geom_boxplot()

But the resulting plot looks like the following,

Output of the boxplot

However, what I need is to display only the positive values of x, as individual boxplots in the same plot layer. Could someone please suggest what I need to change in the above code to get the proper output? Thanks.

Frank
Amm

3 Answers

plt_bx <- ggplot(subset(df_mlt, value > 0), aes(x=ID1,y=value, color=cols)) + geom_boxplot()

You need to subset your data frame to remove the undesirable values. Right now you're plotting value > 0, which is either TRUE or FALSE, instead of the boxplot of only the values that are greater than 0.
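As a minimal sketch of the difference, using a made-up stand-in for `df_mlt` (the values below are invented for illustration and are not the asker's data), compare what the comparison returns with what `subset` returns:

```r
# Toy stand-in for the melted data frame; all values are hypothetical.
df_mlt <- data.frame(
  ID1   = c(1, 1, 2, 2, 3, 3),
  cols  = rep(c("a", "b"), times = 3),
  value = c(-2.5, 4.0, 0.0, 7.5, 1.2, -0.3)
)

# `value > 0` is a comparison, so mapping it to y plots TRUE/FALSE.
is_pos <- df_mlt$value > 0
class(is_pos)    # "logical"

# subset() drops the rows that fail the condition; `value` stays numeric.
df_pos <- subset(df_mlt, value > 0)
df_pos$value     # 4.0 7.5 1.2
```

Passing `df_pos` (or the `subset(...)` call itself) as the data argument to `ggplot` leaves `y=value` as a plain numeric mapping.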

BrodieG
  • Thanks for your suggestion. Using the condition in the subset removes the negative values. I would like to display a separate boxplot for each ID of the x variable. – Amm Jan 31 '14 at 17:02
  • @Amm, try adding `group=ID1` to the `aes`, but without your data it's hard to be more specific. If you post the `dput` of a representative section of your data I can give you the exact command. – BrodieG Jan 31 '14 at 17:04
  • I have the data as a .csv file in the link in my question – Amm Jan 31 '14 at 17:05
  • @Amm, I'd prefer a `dput`. My Excel thinks your file is suspect. – BrodieG Jan 31 '14 at 17:07
  • I don't know why your Excel thinks the data file is suspect. Thanks for your suggestion, this seems to work: `plt_bx <- ggplot(subset(df_mlt, value > 0), aes(x=ID1,group=ID1,y=value, color=cols)) + geom_boxplot() +scale_y_log10()`. However, the color aesthetic does not appear to be applied to the boxplots. – Amm Jan 31 '14 at 17:11
  • @Amm, try `color=ID1`, unless you want a specific color, in which case it gets more complicated. If this works you also don't need `group`. – BrodieG Jan 31 '14 at 17:17
  • I tried `color=ID1`; it displays the boxplots correctly, but the outlines are colored along a single gradient. What I wanted was distinct colors, one per variable. Since the variable column is already named `cols` in the melted data frame, I use `color = cols`, which in principle should work, I presume? – Amm Jan 31 '14 at 17:26
  • @Amm I'm not sure I understand, do you want each boxplot to have multiple colors, or are all the values of `cols` for any single `ID1` value the same color you want the boxplot to be? – BrodieG Jan 31 '14 at 17:50
  • What I mean is a different outline color for each individual boxplot, so setting `color=cols` in the `aes` should work as it does here: http://stackoverflow.com/questions/21335625/getting-the-y-axis-intercept-and-slope-from-a-linear-regression-of-multiple-data . But I think using `group` overrides the color aesthetic. – Amm Jan 31 '14 at 17:56
  • @Amm, as I noted earlier, if you are using `color=cols`, and `cols` doesn't specify an actual color, but just a group, and the groups in `cols` are the same as the groups in `ID1`, then you don't need `group`, you can just use `color=cols`. If the conditions described here are not met, then you can't use `cols` to color the boxplots, at least not through `aes`. – BrodieG Jan 31 '14 at 18:00
  • I understand what you mean. The plot now seems to be much better, thank you! – Amm Jan 31 '14 at 18:17

Based on @BrodieG's suggestions, the following code produces the boxplot:

library(ggplot2)
library(scales)  # provides trans_format() and math_format()

# Keep only positive values; a log10 axis cannot show zero or negative values
plt_bx <- ggplot(subset(df_mlt, value > 0), aes(x = ID1, y = value, group = ID1)) +
  geom_boxplot(aes(color = ID1), outlier.colour = "orangered", outlier.size = 3) +
  scale_y_log10(labels = trans_format("log10", math_format(10^.x))) +
  theme_bw() +
  theme(legend.text = element_text(size = 14), legend.title = element_text(size = 14)) +
  theme(axis.text = element_text(size = 26)) +
  theme(axis.title = element_text(size = 22, face = "bold")) +
  labs(x = "x", y = "y", colour = "Values") +
  annotation_logticks(sides = "rl")
plt_bx


Amm

I improved my answer: the boxplot outlines are drawn in distinct colors if the color aesthetic is mapped to the ID column of the melted data frame converted to a factor, i.e., geom_boxplot(aes(color=factor(ID1)))

The following code results in a plot as below,

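library(ggplot2)
library(scales)  # provides trans_breaks(), trans_format() and math_format()
library(grid)    # provides unit(); only needed with older ggplot2 versions

plt <- ggplot(subset(df_mlt, value > 0), aes(x = ID1, y = value)) +
  # factor(ID1) makes the color scale discrete: one color per ID
  geom_boxplot(aes(color = factor(ID1))) +
  scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x))) +
  theme_bw() +
  # hjust = 0.5 centers the legend title; guides(title.hjust = 0.5) had no effect
  theme(legend.text = element_text(size = 14),
        legend.title = element_text(size = 14, hjust = 0.5)) +
  theme(axis.text = element_text(size = 20)) +
  theme(axis.title = element_text(size = 20, face = "bold")) +
  labs(x = "x", y = "y", colour = "legend") +
  annotation_logticks(sides = "rl") +
  theme(panel.grid.minor = element_blank()) +
  theme(plot.margin = unit(c(0, 1, 0, 0), "mm"))
plt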

Output plot
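To illustrate why `factor(ID1)` changes the outline colors, here is a small sketch on invented data (`df_toy` and its values are hypothetical, not the data from the question). It assumes `ID1` is numeric, which is consistent with the single color gradient mentioned in the comments:

```r
library(ggplot2)

# Hypothetical data: a numeric ID column with three groups.
df_toy <- data.frame(
  ID1   = rep(c(1, 2, 3), each = 4),
  value = c(1, 2, 3, 4, 10, 20, 30, 40, 100, 200, 300, 400)
)

# Numeric ID1 mapped to color: ggplot2 uses a continuous scale (a gradient),
# and needs `group` to draw one box per ID.
p_num <- ggplot(df_toy, aes(x = ID1, y = value, group = ID1)) +
  geom_boxplot(aes(color = ID1))

# factor(ID1) mapped to color: a discrete scale with one color per level,
# and the factor also defines the groups, so `group` is not needed.
p_fac <- ggplot(df_toy, aes(x = factor(ID1), y = value)) +
  geom_boxplot(aes(color = factor(ID1)))
```

Printing `p_num` and `p_fac` side by side shows the gradient legend versus the discrete legend.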

Amm