I have a data frame (df) where each row is a different school. Each school has its own ID number, the number of students who failed maths GCSE, the number who passed, and the number who sat the GCSE.

E.g. for one school:

School ID   fail   pass   total   urban   %FSM
1           12     43     55      N       23

I want to do two things with this data:

  1. Test whether urban and rural schools have significantly different pass rates
  2. Fit a beta regression: pass rate ~ urban + %FSM

I believe that to do this I need to effectively turn this school-level data into pupil-level data. So for school ID 1 there would be 55 rows, with a new pass column in which 12 of the rows say "N" and 43 say "Y".

How could I use R to create this new dataset? I currently have about 3200 rows (i.e. unique school IDs), so I need code that will do this automatically for all schools.

Mark
Jess

1 Answer

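You could use uncount() from tidyr after reshaping the counts to long format. The resulting name column records each pupil's outcome ("pass" or "fail"):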

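library(tidyverse)

df %>%
  # stack the pass and fail counts: one row per school and outcome
  pivot_longer(cols = c(pass, fail)) %>%
  # repeat each row `value` times, giving one row per pupil
  uncount(value)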
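
If the tidyverse is not an option, the same expansion can probably be done in base R. The sketch below is only an illustration under some assumptions: the column names school_id and fsm_pct are made-up syntactic stand-ins for "School ID" and "%FSM", and the second school is a hypothetical row added so that the example covers more than one school.

```r
# Toy data: the first row mirrors the example in the question,
# the second row is hypothetical.
df <- data.frame(
  school_id = c(1, 2),
  fail      = c(12, 5),
  pass      = c(43, 20),
  total     = c(55, 25),
  urban     = c("N", "Y"),
  fsm_pct   = c(23, 10)
)

# Repeat each school's row index once per pupil who sat the exam.
idx <- rep(seq_len(nrow(df)), times = df$fail + df$pass)

# One row per pupil, keeping the school-level covariates.
pupils <- df[idx, c("school_id", "urban", "fsm_pct")]
rownames(pupils) <- NULL

# Within each school: `fail` rows of "N" followed by `pass` rows of "Y".
pupils$pass <- unlist(Map(
  function(f, p) rep(c("N", "Y"), times = c(f, p)),
  df$fail, df$pass
))
```

Here pupils has one row per pupil, and table(pupils$school_id, pupils$pass) should give back the original fail and pass counts for each school, which is a quick way to check the expansion.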
deschen