
I'm having trouble creating a dot plot/bar graph from a data set. My data set looks like this. I want an output that looks like this. However, geom_bar() in ggplot2 only gives me counts and won't use the individual decimal values from the table. I've also tried Plotly, but it doesn't seem to scale well to plots with multiple players.

I've already set up a larger data frame with 200+ variables. I'm trying to make something that can search for specific players in that data frame, and then create a plot from it. Consequently, I'm ideally looking for something that can easily handle 5-10 different series.

Any help would be greatly appreciated.

Thanks!

sdr1975

1 Answer


This is pretty straightforward. The key is to get your data from its current wide format into the long format, which is more useful for plotting in R, and to use geom_point rather than geom_bar.

First, some reproducible example data (you should include something like this in any future question you post here, as it makes it much easier for others to help you):

library(ggplot2)
library(reshape2)

dataset <- data.frame(
  PlayerName = letters[1:6], 
  IsolationPossG = runif(6), 
  HandoffPossG = runif(6), 
  OffScreenPossG = runif(6)
)

This is your current data, in the wide format:

 dataset
  PlayerName IsolationPossG HandoffPossG OffScreenPossG
1          a     0.78184751  0.939183520     0.74461784
2          b     0.06557433  0.745699149     0.96540299
3          c     0.21105745  0.753534811     0.02977973
4          d     0.41271918  0.555475622     0.18317886
5          e     0.38153149  0.246292074     0.74862310
6          f     0.89946318  0.008412111     0.53195933

Now we convert to the long format:

molten <- melt(
  dataset, 
  id.vars = "PlayerName", 
  measure.vars = c("IsolationPossG", "HandoffPossG", "OffScreenPossG")
)
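As an aside, the same wide-to-long reshaping can be done with tidyr's pivot_longer(), which may be preferable if you are not already using reshape2. This is only a sketch on the same toy data; the set.seed() call and the object name long are my additions for illustration.

```r
library(tidyr)

# Same toy data as above; set.seed() is an addition so the values repeat between runs
set.seed(1)
dataset <- data.frame(
  PlayerName = letters[1:6],
  IsolationPossG = runif(6),
  HandoffPossG = runif(6),
  OffScreenPossG = runif(6)
)

# Gather every column except PlayerName into variable/value pairs
long <- pivot_longer(
  dataset,
  cols = -PlayerName,
  names_to = "variable",
  values_to = "value"
)

head(long)
```

Note that pivot_longer() returns the variable column as character rather than as a factor, so ggplot2 would order the x axis alphabetically unless you convert it to a factor yourself. The rest of this answer continues with the melt() result, molten.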

Here is the long format, much more useful for plotting in R:

head(molten)
  PlayerName       variable      value
1          a IsolationPossG 0.78184751
2          b IsolationPossG 0.06557433
3          c IsolationPossG 0.21105745
4          d IsolationPossG 0.41271918
5          e IsolationPossG 0.38153149
6          f IsolationPossG 0.89946318

Here's how to plot it:

ggplot(molten, aes(x = variable, y = value, colour = PlayerName)) +
  geom_point(size = 4) +
  theme_bw() +
  theme(legend.position="bottom",legend.direction="horizontal")

Which gives:

[Image: dot plot of value against variable, one coloured point per player, with a horizontal legend at the bottom]

h/t how to have multple labels in ggplot2 for bubble plot
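If you would still prefer bars, the reason geom_bar() only gave you counts is that it counts rows by default. geom_col() (or geom_bar(stat = "identity")) uses the y values as bar heights instead. A minimal sketch on the same kind of toy data, with set.seed() and the name bar_plot added for illustration:

```r
library(ggplot2)
library(reshape2)

# Toy data in the same shape as above; set.seed() is an addition for repeatability
set.seed(1)
dataset <- data.frame(
  PlayerName = letters[1:6],
  IsolationPossG = runif(6),
  HandoffPossG = runif(6),
  OffScreenPossG = runif(6)
)

# With only id.vars given, melt() treats every other column as a measure
molten <- melt(dataset, id.vars = "PlayerName")

# geom_col() draws bars whose heights are the values themselves, not row counts;
# position = "dodge" places the players side by side within each variable
bar_plot <- ggplot(molten, aes(x = variable, y = value, fill = PlayerName)) +
  geom_col(position = "dodge") +
  theme_bw() +
  theme(legend.position = "bottom", legend.direction = "horizontal")

bar_plot
```

With 5-10 players the dodged bars get narrow, which is one reason points may read better here.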

If you want the point shape to vary by player as well, as in your example image (though mapping the player name to two of the plot's aesthetics seems rather excessive):

ggplot(molten, aes(x = variable, y = value, shape = PlayerName, colour = PlayerName)) +
  geom_point(size = 4) +
  theme_bw() +
  theme(legend.position="bottom",legend.direction="horizontal")

[Image: the same dot plot, with both point shape and colour varying by player]
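Since you mentioned wanting to pick specific players out of a larger data frame, one way to sketch that is to filter the rows before melting. The helper names long_for_players and plot_players, the stats data frame, and the eight-player toy data are all hypothetical; adapt the column names to your real data. One thing to be aware of: ggplot2's default shape palette only covers six groups, so for 5-10 players the shapes are supplied explicitly with scale_shape_manual().

```r
library(ggplot2)
library(reshape2)

# Hypothetical helper: keep only the requested players and measures, in long format
long_for_players <- function(wide_df, players, measures) {
  chosen <- wide_df[wide_df$PlayerName %in% players, , drop = FALSE]
  melt(chosen, id.vars = "PlayerName", measure.vars = measures)
}

# Hypothetical helper: dot plot for the selected players
plot_players <- function(wide_df, players, measures) {
  long <- long_for_players(wide_df, players, measures)
  ggplot(long, aes(x = variable, y = value,
                   colour = PlayerName, shape = PlayerName)) +
    geom_point(size = 4) +
    # Supply one shape code per requested player, since the default palette stops at six
    scale_shape_manual(values = seq_along(players)) +
    theme_bw() +
    theme(legend.position = "bottom", legend.direction = "horizontal")
}

# Toy stand-in for the larger data frame (eight players, three measures)
set.seed(1)
stats <- data.frame(
  PlayerName = letters[1:8],
  IsolationPossG = runif(8),
  HandoffPossG = runif(8),
  OffScreenPossG = runif(8)
)

wanted <- c("a", "b", "c", "d", "e", "f", "h")
selected <- long_for_players(stats, wanted, c("IsolationPossG", "HandoffPossG"))
player_plot <- plot_players(stats, wanted, c("IsolationPossG", "HandoffPossG"))
player_plot
```

Passing the measure columns explicitly keeps any non-numeric columns in a wide table with 200+ variables out of the melted result.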

Ben
  • Thanks so much, this is perfect! The idea of turning each row into a unique value crossed my mind, but I didn't realize how easy it would be with something like melt(). Also, how did you format your data frames when posting on Stack Overflow? I tried using Markdown but that didn't work. – sdr1975 Feb 02 '16 at 00:44
  • You're welcome. The wide-to-long conversion is a fundamental part of preparing many datasets for analysis and plotting, so it's a useful skill to know. To format the data frames in my answer, I selected the text and used the code formatting button; see here for more details: http://meta.stackexchange.com/a/22189/181565 – Ben Feb 03 '16 at 05:48