I'm sorry for asking this here, but the course website has no discussion page and it directs students to Stack Overflow for questions. This is from this edX course.
Q1: Using the following dataset:
```
# download() comes from the downloader package
library(downloader)
url <- "https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/babies.txt"
filename <- basename(url)
download(url, destfile=filename)
# read the file that was just downloaded
babies <- read.table(filename, header=TRUE)
```
splitting into two groups (non-smoking and smoking):
# filter(), select() and %>% come from dplyr
library(dplyr)
bwt.nonsmoke <- filter(babies, smoke==0) %>% select(bwt) %>% unlist
bwt.smoke <- filter(babies, smoke==1) %>% select(bwt) %>% unlist
Set the seed at 1 and obtain a sample from the non-smoking mothers (dat.ns) of size N=25. Then, without resetting the seed, take a sample of the same size from the smoking mothers (dat.s). Compute the t-statistic (call it tval).
What is the absolute value of the t-statistic?
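For reference, here is a minimal sketch of what I understand the t-statistic to be, assuming the Welch (unequal-variance) form that `t.test` uses by default. The vectors `x` and `y` are made-up stand-in data, not the babies dataset, and the names `se`, `tval_manual` and `tval_ttest` are my own:

```r
# Stand-in data (hypothetical values, NOT the babies dataset)
set.seed(1)
x <- rnorm(25, mean = 123, sd = 17)
y <- rnorm(25, mean = 114, sd = 18)

# Welch t-statistic: difference in sample means divided by its
# estimated standard error
se <- sqrt(var(x) / length(x) + var(y) / length(y))
tval_manual <- (mean(x) - mean(y)) / se

# t.test() defaults to var.equal = FALSE, so its statistic should agree
tval_ttest <- unname(t.test(x, y)$statistic)

abs(tval_manual)
all.equal(tval_manual, tval_ttest)
```

If the two values agree, the call to `t.test` is probably not the source of the discrepancy.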
Here's how I did it:
set.seed(1)
dat.ns <- sample(bwt.nonsmoke,25)
dat.s <- sample(bwt.smoke,25)
tval <- t.test(dat.ns,dat.s)$statistic
tval
This gives the value 2.120904, which is apparently wrong. I also tried setting the seed to 1 before each sample, as follows:
set.seed(1)
dat.ns <- sample(bwt.nonsmoke,25)
set.seed(1)
dat.s <- sample(bwt.smoke,25)
tval <- t.test(dat.ns,dat.s)$statistic
tval
which gives a t value of 1.573627, which is also wrong. I'm not sure what I'm doing wrong and I'd like some help.
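One thing I have not ruled out is the R version. Since R 3.6.0, `sample()` uses a different default sampling algorithm ("Rejection" instead of "Rounding"), so the same seed can draw different elements. Whether the course's expected answer was generated with the older behavior is only an assumption on my part. The sketch below assumes R >= 3.6.0 and uses a made-up stand-in vector `pop` rather than the babies data, and shows how the two settings could be compared:

```r
# Hypothetical stand-in population (NOT the babies dataset)
pop <- seq_len(700)

# Sampler that is the default since R 3.6.0
set.seed(1, sample.kind = "Rejection")
s_new <- sample(pop, 25)

# Sampler that was the default before R 3.6.0
# (R warns that this sampler is non-uniform, hence suppressWarnings)
suppressWarnings(set.seed(1, sample.kind = "Rounding"))
s_old <- sample(pop, 25)

# Restore the current default so later code is unaffected
RNGkind(sample.kind = "Rejection")

# Same seed, but are the draws the same?
identical(s_new, s_old)
```

If the two draws differ, rerunning my code above with `sample.kind = "Rounding"` would be a way to check whether the expected answer depends on the older sampler.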