
[Image: the graph that I currently get]

Hello, I am very new to R and want to show a possible interaction between two variables in a line graph. However, the graph that I get shows all the individual reaction time values rather than the means. I guess my data might be in the wrong format? Currently, my position and tense conditions are each in a separate column, and the reaction time (the outcome variable) is in yet another column.

The code that I used was:

line <- ggplot(data_tense_final, aes(f2.f.position, RT3, colour = f2.f.tense))

line +
  stat_summary(fun.y = mean, geom = "point") + 
  stat_summary(fun.y = mean, geom = "line", aes(group = f2.f.tense)) + 
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2) + 
  labs(x = "Position", y = "Mean RT", colour = "f2.f.tense")

The dataframe looks more or less like this:

  f2.f.participant f2.f.condition f2.f.tense f2.f.position              RT3
1                 1              1       past          back 445.944444444444
2                 1              2     future         front 448.882352941176
3                 1              3       past         front 454.222222222222
4                 1              4     future          back         526.4375
5                 2              1       past          back 338.631578947368
6                 2              2     future         front 342.058823529412
7                 2              3       past         front 350.222222222222
8                 2              4     future          back 341.266666666667
9                 3              1       past          back              331
10                3              2     future         front 325.647058823529


The output from dput(x) is:

structure(list(f2.f.position = c("back", "front", "front", "back", 
"back", "front", "front", "back", "back", "front", "front", "back", 
"back", "front", "front", "back", "back", "front", "front", "back"
), RT3 = c("445.944444444444", "448.882352941176", "454.222222222222", 
"526.4375", "338.631578947368", "342.058823529412", "350.222222222222", 
"341.266666666667", "331", "325.647058823529", "303.9375", "361.111111111111", 
"304.722222222222", "288.647058823529", "281.823529411765", "309.944444444444", 
"304.722222222222", "288.647058823529", "281.823529411765", "309.944444444444"
), f2.f.tense = c("past", "future", "past", "future", "past", 
"future", "past", "future", "past", "future", "past", "future", 
"past", "future", "past", "future", "past", "future", "past", 
"future")), row.names = c(1L, 20L, 39L, 58L, 77L, 96L, 115L, 
134L, 153L, 172L, 191L, 210L, 229L, 248L, 267L, 286L, 305L, 324L, 
343L, 362L), class = "data.frame")


I've probably made a very obvious mistake, apologies in advance! Many thanks!

Julia
  • Welcome to SO, JuliaH! Is there a way you can provide sample data? It's hard to "play" with the code without knowing what we're looking at. If you can share real data, then please post the output from `dput(x)` where `x` is the top so-many-rows of your frame, including just the columns you're using, so perhaps `x` is `head(data_tense_final[,c("f2.f.position", "RT3", "f2.f.tense")], 20)`; similar for `mean_cl_boot`. If you cannot share the data (understandable), that does not excuse no data ... it just means you need to create random/representative data or use a public dataset. Thank you! – r2evans Feb 23 '21 at 13:50
  • Thank you very much and absolutely, I should have added some information about the structure of the dataframe (have added a few rows above now), especially as I suspect that the problem might be the data format! Many thanks and apologies! – Julia Feb 23 '21 at 14:44
  • Please read my previous comment. A `data.frame` on R's console can be a misrepresentation of the data: numbers or strings can actually be `factor`, which breaks many things. Again, please post the output from `dput`, it can clarify a lot. – r2evans Feb 23 '21 at 14:50
  • Of course, I am sorry (as I said, I am new to R and wasn't sure which information would be most helpful but yes, you said it in your previous comment, apologies!), I will add the output to the question above. I have written and applied a few functions to the original dataframe (to remove outliers, calculate mean scores, etc) to create this dataframe, maybe this is a problem? Many thanks! – Julia Feb 23 '21 at 15:25
  • Your numbers are not numbers, they're strings, which is certainly a problem (and *that's* why I suggested `dput` :-). Fix your import process (`read.csv` or similar), it's currently broken. As an interim, you can use `as.numeric(.)`, but it's best to fix it at the source of the problem. – r2evans Feb 23 '21 at 16:12
  • Thank you ever so much, it must have been because of my outlier removal procedure etc, I exported the data into a new csv file, re-imported and it all worked fine with the further analysis! Thank you so much! :-) – Julia Feb 23 '21 at 16:25
  • While I'm glad that worked for you, I cannot help but think that that is not the best way to solve the problem. If this is a one-off and you will never read in that data again, then so be it, but I really think a better thing is to learn what broke the `numeric` class in your outlier detection and fix that problem after importing your data *once*. Export-and-reimport (from a project-management perspective) introduces more chances for things to go wrong. Just my two cents, glad you have it working now! – r2evans Feb 23 '21 at 17:22
  • You are right, I will look more closely at which point of my outlier detection this happened, as I will continue to work with this data set, but I am very happy that it is working now, thanks so much! :-) – Julia Feb 24 '21 at 15:56
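
For reference, a minimal sketch of the check and the interim fix discussed in the comments above. The data frame `x` is a shortened stand-in for the `dput()` output in the question, and the names `col_classes`, `cell_means` and `coerced` are introduced only for this illustration. The last step shows one common way a numeric column can end up as strings; whether that is what happened in the outlier-removal code here is an assumption, not something the thread confirms.

```r
# Shortened stand-in for the question's data: RT3 is stored as character
x <- data.frame(
  f2.f.position = c("back", "front", "front", "back",
                    "back", "front", "front", "back"),
  RT3 = c("445.944444444444", "448.882352941176", "454.222222222222",
          "526.4375", "338.631578947368", "342.058823529412",
          "350.222222222222", "341.266666666667"),
  f2.f.tense = c("past", "future", "past", "future",
                 "past", "future", "past", "future"),
  stringsAsFactors = FALSE
)

# Inspect the column classes; RT3 is reported as "character", not "numeric"
col_classes <- sapply(x, class)
print(col_classes)

# Interim fix: convert the strings to numbers
x$RT3 <- as.numeric(x$RT3)

# Means per position-by-tense cell, i.e. the values the line graph should show
cell_means <- aggregate(RT3 ~ f2.f.position + f2.f.tense, data = x, FUN = mean)
print(cell_means)

# Possible cause (an assumption for this data set): apply() converts a data
# frame with mixed column types to a character matrix before calling the
# function, so values pulled out of a row come back as strings
coerced <- apply(x, 1, function(row) row[["RT3"]])
print(class(coerced))
```

If a helper in the cleaning pipeline uses `apply()` over rows like this, checking `sapply(x, class)` after each step should show where the column changes type.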

2 Answers

Try the function ggline(..., add = "mean") from the ggpubr package. You can find more info here
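
A rough sketch of what such a call could look like, assuming the ggpubr package is installed. The data frame `x` is a hypothetical stand-in with reaction times rounded from the first rows of the question and already stored as numbers.

```r
library(ggpubr)  # ggline() comes from the ggpubr package

# Hypothetical stand-in for the question's data, with RT3 already numeric
x <- data.frame(
  f2.f.position = c("back", "front", "front", "back",
                    "back", "front", "front", "back"),
  RT3 = c(445.9, 448.9, 454.2, 526.4, 338.6, 342.1, 350.2, 341.3),
  f2.f.tense = c("past", "future", "past", "future",
                 "past", "future", "past", "future")
)

# add = "mean" draws the mean at each x level; color gives one line per tense
p <- ggline(x, x = "f2.f.position", y = "RT3",
            color = "f2.f.tense", add = "mean")
print(p)
```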

Gian Lima
line <- ggplot(data_tense_final, aes(f2.f.position, RT3, colour = f2.f.tense))

line +
  stat_summary(fun = "mean", geom = "point") + # fun replaces fun.y, which is deprecated since ggplot2 3.3.0
  stat_summary(fun = "mean", geom = "line", aes(group = f2.f.tense)) + # x is discrete, so the group has to be set for the line
  stat_summary(fun.data = "mean_cl_boot", geom = "errorbar", width = 0.2) + # mean_cl_boot needs the Hmisc package
  labs(x = "Position", y = "Mean RT", colour = "f2.f.tense")
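
A self-contained sketch of this plot using the first rows from the question, in case it helps to try it out. Two things here are assumptions of the sketch rather than part of the answer: RT3 is converted with `as.numeric()` first because it is stored as character in the `dput()` output, and `mean_se` (built into ggplot2) is swapped in for `mean_cl_boot` so that Hmisc is not needed. The name `point_data` is only for illustration.

```r
library(ggplot2)

# First rows of the question's data; RT3 is character, as in the dput() output
x <- data.frame(
  f2.f.position = c("back", "front", "front", "back",
                    "back", "front", "front", "back"),
  RT3 = c("445.944444444444", "448.882352941176", "454.222222222222",
          "526.4375", "338.631578947368", "342.058823529412",
          "350.222222222222", "341.266666666667"),
  f2.f.tense = c("past", "future", "past", "future",
                 "past", "future", "past", "future"),
  stringsAsFactors = FALSE
)

# Without this conversion RT3 is treated as discrete and nothing is averaged
x$RT3 <- as.numeric(x$RT3)

p <- ggplot(x, aes(f2.f.position, RT3, colour = f2.f.tense)) +
  stat_summary(fun = mean, geom = "point") +
  # the x axis is discrete, so the line layer needs an explicit group
  stat_summary(fun = mean, geom = "line", aes(group = f2.f.tense)) +
  # mean_se (mean plus/minus one standard error) stands in for mean_cl_boot
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2) +
  labs(x = "Position", y = "Mean RT", colour = "f2.f.tense")
print(p)

# The point layer should now hold one summarised row per position-by-tense cell
point_data <- layer_data(p, 1)
print(nrow(point_data))
```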
Clem Snide