I have results from a survey and am trying to create a graphic showing the relationship between two variables, "Q1" and "Q9.1". "Q1" is the independent variable and "Q9.1" is the dependent variable. Both hold responses to Likert-scale questions: -2, -1, 0, 1, 2. A typical plot places the answers on top of each other, which is not very interesting or informative, so I was thinking that hexbin would be the way to go. The data is in lpp. So far I have not been able to use "Q1" and "Q9.1" for x and y:

> is.numeric("Q1")
[1] FALSE
> q1.num <- as.numeric("Q1")
Warning message:
NAs introduced by coercion 

The values for Q1 are (hundreds of instances of): -2,-1,0,1,2

How can I make a hexbin graph with this data? Is there another graph I should consider?

Error messages so far:

Warning messages:
1: In xy.coords(x, y, xl, yl) : NAs introduced by coercion
2: In xy.coords(x, y, xl, yl) : NAs introduced by coercion
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
5: In min(x) : no non-missing arguments to min; returning Inf
6: In max(x) : no non-missing arguments to max; returning -Inf
d-cubed
  • It might help to provide some example data so we can see what structure it currently has. A helpful function for this is dput(...) which outputs a description that can be used to recreate an object. – PaulHurleyuk Oct 24 '10 at 20:41
  • @Donnied: You need to sort out your data first. Something isn't right here: you are introducing NAs when you coerce to numeric, and during plotting `xy.coords()` is creating NAs such that you have no non-NA data. Take a look at the output of `str(Q1)` etc. for all your data: are they stored as numerics? Finally, your first two lines of R are wrong; you don't refer to an object by its **quoted** name. If you want to see if `Q1` is numeric you do `is.numeric(Q1)`. What you have done is ask if the string `"Q1"` is numeric, which inevitably is FALSE. You didn't do this in the `plot()` call, did you? – Gavin Simpson Oct 25 '10 at 07:53
  • I apologize. I've just started using R. I have a csv file which I read in as data. "Q1" is one of the column headers / variables. – d-cubed Oct 25 '10 at 12:24
  • Something such as this works: `d <- ggplot(lpp, aes(Q1, Q3a.8)); d + stat_binhex(bins = 11)`. Otherwise I'm having difficulty reading Q1 in as a numeric or converting it to numeric. It seems that I've managed to simply store the characters Q1 in a numeric variable, not the data. The data set 'lpp' – d-cubed Oct 25 '10 at 12:28
  • is recognized as an object. However, I am not sure how to refer to the columns explicitly or use them as objects. I'm reading up on R but there are gaping deficits in what I know and need to know. – d-cubed Oct 25 '10 at 12:31
  • I'm thinking I need to reimport specifying with colClasses. – d-cubed Oct 25 '10 at 12:33
  • I'm having better luck if I attach the data and then specify the 2 columns of interest in a new data frame. – d-cubed Oct 30 '10 at 19:34
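
The point raised in the comments above, quoted name versus column, can be sketched as follows. This is a minimal illustration, not the asker's actual data: `lpp` is simulated here as a stand-in, and the columns are deliberately stored as character to mimic a CSV import that did not produce numerics.

```r
# Stand-in for the survey data; this `lpp` is simulated, not the real file
set.seed(1)
lpp <- data.frame(
  Q1   = as.character(sample(-2:2, 100, replace = TRUE)),
  Q9.1 = as.character(sample(-2:2, 100, replace = TRUE)),
  stringsAsFactors = FALSE
)

is.numeric("Q1")     # FALSE: this tests the literal string "Q1"
is.numeric(lpp$Q1)   # FALSE here, because the column is stored as character
str(lpp)             # shows how each column is actually stored

# Going through as.character() means the same line also works for factors
lpp$Q1   <- as.numeric(as.character(lpp$Q1))
lpp$Q9.1 <- as.numeric(as.character(lpp$Q9.1))

is.numeric(lpp$Q1)   # TRUE
table(lpp$Q1, lpp$Q9.1)  # cross-tabulation of the two questions
```

Referring to columns as `lpp$Q1` (or `lpp[["Q1"]]`) avoids the need to `attach()` the data frame.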

2 Answers

How about taking a slightly different approach? How about thinking of your responses as factors rather than numbers? You could use something like this, then, to get a potentially useful representation of your data:

# Simulate data for testing purposes
q1 = sample(c(-2,-1,0,1,2),100,replace=TRUE)
q9 = sample(c(-2,-1,0,1,2),100,replace=TRUE)
dat = data.frame(q1=factor(q1),q9=factor(q9))
library(ggplot2)
# generate stacked barchart
ggplot(dat,aes(q1,fill=q9)) + geom_bar()

You may want to switch q1 and q9 above, depending on the view of the data that you want.

seandavi

Perhaps ggplot2's stat_binhex could sort that one for you?

Also, I find scale_alpha useful for dealing with overplotting.
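
One way to read the transparency suggestion, sketched on simulated data (the data frame `dat` and its columns are made up for illustration). Note that this uses a fixed `alpha` together with jitter in the geom rather than `scale_alpha` itself, which is a simpler swap-in and may not be exactly what was meant:

```r
library(ggplot2)

# Simulated Likert responses, for illustration only
set.seed(7)
dat <- data.frame(
  q1 = sample(-2:2, 300, replace = TRUE),
  q9 = sample(-2:2, 300, replace = TRUE)
)

# Jitter spreads identical answers apart; alpha makes dense spots darker
p_jit <- ggplot(dat, aes(q1, q9)) +
  geom_jitter(width = 0.2, height = 0.2, alpha = 0.3)

print(p_jit)
```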

radek
  • I really like the stat_binhex. I can't find how to add a title, though: labs, xl, and yl don't work. – d-cubed Oct 24 '10 at 21:03
  • To label the x axis you could try `qplot(x, y, data = data, xlab = "my label")` or `ggplot(data, aes(x, y)) + geom_point() + scale_x_continuous("my label")` – radek Oct 24 '10 at 21:37
  • The `ylab()` and `xlab()` functions also add axis labels, e.g. `ggplot(data, aes(x, y)) + geom_point() + ylab("my label")` – Gavin Simpson Oct 25 '10 at 12:09
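
Putting the labelling suggestions from these comments together with `stat_binhex`, a hedged sketch on simulated data might look like this. The data frame below only imitates `lpp`, the label text is invented, and `bins = 5` is an arbitrary choice; it assumes a reasonably current ggplot2, where `labs()` accepts `title`, `x` and `y`.

```r
library(ggplot2)

# Simulated Likert responses; the real `lpp` data is not available here
set.seed(42)
lpp <- data.frame(
  Q1   = sample(-2:2, 300, replace = TRUE),
  Q9.1 = sample(-2:2, 300, replace = TRUE)
)

p <- ggplot(lpp, aes(Q1, Q9.1)) +
  stat_binhex(bins = 5) +   # number of bins is a tuning choice
  labs(title = "Q9.1 by Q1", x = "Q1 response", y = "Q9.1 response")

# Drawing hexagonal bins needs the hexbin package to be installed
if (requireNamespace("hexbin", quietly = TRUE)) print(p)
```

`ggtitle("Q9.1 by Q1")` is an equivalent way to set only the title.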