I have been conducting some exercises from OpenIntro statistics to start getting familiar with R and RStudio.

I have completed all the exercises. When I run my code in RStudio, all of my tables and graphs are generated without a problem.

However, when I knit the document, I get an error. I do not think I should be getting one, since the code runs in RStudio without errors and my tables and graphs are accurate.

Knitting fails at exercise 3, where I am asked to plot the proportion of boys born over time. Here is a sample of my code (lines 53 to 58):

```{r plot-prop-boys-arbuthnot}
mutate (arbuthnot, boy_ratio = boys / total)

ggplot(data = arbuthnot, aes(x = year, y = boy_ratio)) + 
  geom_line()
```

I then get a long error message that I do not understand. It says that total was not found. I tried defining total by inserting:

    total <- boys + girls

or by inserting:

    total <- arbuthnot$boys + arbuthnot$girls

Nothing seems to work. Even when I successfully define total, knitting fails again with a different error. I also tried writing the mutate call differently, for instance:

    arbuthnot <- arbuthnot %>%
      mutate(boy_ratio = boys / total)

However, even combined with my attempts at defining total, it still does not work.

I am not sure what to do at this point because the graph is displayed in RStudio. The ratio is accurate, it also shows up in a table that I have generated.

The variable total is in that table. I tried restarting R and re-running all the code chunks. All of my tables and graphs come out perfectly, but when I try to knit my lab report again, it fails at line 54.

I have been trying to solve this for 2 days now and I am not sure what I should do.

I hope the community here will be able to give me a couple of pointers on how to solve this problem :) ! If you need more information or a bit more code let me know :) !

Wishing everyone a wonderful day !

Phil
Grifindor

2 Answers

To help others help you, consider making a minimal working example (MWE), for example using the reprex package. Without more details, it is nearly impossible to know exactly what went wrong.

The error message states that there is no object total in the environment and that arbuthnot does not contain a column named total, so possibly the column was created but the result was never assigned back to arbuthnot. The variable may exist in your environment when you run the code interactively, because you created the column or the variable at some point (using the code you provided). However, knitting the .Rmd file runs the whole document from scratch in a new environment, where the variable cannot be found, so knitting aborts.
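To illustrate the "created but not assigned" case, here is a minimal sketch. It uses a small made-up data frame called births (a hypothetical stand-in, not the real arbuthnot data) and assumes only that dplyr is installed:

```r
library(dplyr)

# Toy stand-in for arbuthnot; the numbers are made up for illustration only
births <- data.frame(
  year  = c(1, 2),
  boys  = c(10, 20),
  girls = c(10, 30)
)

# mutate() returns a NEW data frame; without assignment the result is
# discarded and births itself is left unchanged
mutate(births, total = boys + girls)
has_total_before <- "total" %in% names(births)

# Assigning the result back stores the new columns; total is defined
# before boy_ratio within the same mutate() call, so it can be reused
births <- births %>%
  mutate(total = boys + girls,
         boy_ratio = boys / total)
has_total_after <- "total" %in% names(births)
```

Here has_total_before records that the column is missing after the unassigned call, and has_total_after records that it is present once the result is assigned back. The same pattern would apply to arbuthnot if that is indeed what happened in your document.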

To debug your code, consider replacing the code chunk lines 53-58 by a print statement, like head(arbuthnot), to see what comes out in the output file and confirm that the tibble indeed contains total.

Alternatively, debug by running the code chunk by chunk until you get the error message in a new environment. In RStudio, try Ctrl + Shift + F10 (equivalent to Session > Restart R) to clear everything and start afresh.

The following code chunk should work:

    library(openintro)
    library(tidyverse)
    data(arbuthnot)

    arbuthnot <- arbuthnot %>%      # note the assignment (overwrites the data frame)
      mutate(total = boys + girls,  # define total first
             boy_ratio = boys / total)
    ggplot(data = arbuthnot,
           mapping = aes(x = year, y = boy_ratio)) +
      geom_line()
lbelzile

Thank you @lbelzile for the great tips.

In the future, I will provide a minimal working example to better inform other contributors on Stack Overflow. I thought that the evidence I had provided was sufficient.

That being said, thanks to the bits of code you sent me, I was able to solve the problem.

Following parts of your instructions, here is the code that worked:

    head(arbuthnot)

    library(tidyverse)
    library(openintro)
    data(arbuthnot)

    arbuthnot <- arbuthnot %>%
      mutate(total = boys + girls, boy_ratio = boys / total)

    ggplot(data = arbuthnot, aes(x = year, y = boy_ratio)) +
      geom_line()

After inserting this code, the file knitted successfully and my lab report was generated.

I would like to thank you for taking the time to help me :) !

Have a great week.

Grifindor