
I am working with the golub dataset in R (split into the AML and ALL groups) and I am attempting a hypothesis test involving two genes. For the AML patient group, I want to find the proportion of patients who have a higher expression of gene 900 than of gene 1000, and then test whether that proportion is less than half. I have a general idea of how to do the second part, and I had something like this for the first part. The comparison returns TRUE/FALSE, so I tried converting it to numeric, which gave 0 and 1, but I want the actual numbers rather than the logical form.

`gol.fac <- factor(golub.cl, levels = 0:1, labels = c("ALL", "AML"))
x <- golub[900, gol.fac == "AML"]
y <- golub[1000, gol.fac == "AML"]
z <- x > y
k <- as.numeric(z)`
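A minimal sketch of the count, proportion, and one-sided test described above, run on made-up values rather than on golub itself. The vectors `gene_a` and `gene_b` are hypothetical stand-ins for the AML slices of genes 900 and 1000, and `binom.test` is only one reasonable choice of test for "less than half".

```r
# Made-up expression values for 6 patients (NOT taken from golub).
gene_a <- c(1.2, -0.3, 0.8, 0.1, -1.0, 0.5)  # stands in for golub[900,  gol.fac == "AML"]
gene_b <- c(0.9,  0.4, 1.1, 0.6, -0.2, 0.7)  # stands in for golub[1000, gol.fac == "AML"]

higher <- gene_a > gene_b        # one TRUE/FALSE per patient

n_higher    <- sum(higher)       # TRUE counts as 1, so this is the number of patients
prop_higher <- mean(higher)      # proportion of patients
which_ones  <- which(higher)     # positions of those patients
values      <- gene_a[higher]    # their actual gene_a expression values

# Exact one-sided binomial test of H0: p = 0.5 against H1: p < 0.5.
res <- binom.test(n_higher, length(higher), p = 0.5, alternative = "less")
res$p.value
```

Summing a logical vector gives the count directly, so the `as.numeric` step is not needed, and indexing with the logical vector recovers the underlying values.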
Ronak Shah
hkak

2 Answers


Use max to get the single largest value across both genes:

max(golub[900,gol.fac=="AML"], golub[1000,gol.fac=="AML"])

Or, if you want the larger of the two values for each patient, use pmax:

pmax(golub[900,gol.fac=="AML"], golub[1000,gol.fac=="AML"])
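To illustrate the difference between the two calls, here is a small sketch on made-up vectors (`a` and `b` are hypothetical and not taken from golub): `max` collapses everything to one number, while `pmax` works element by element.

```r
# Made-up values for three patients (NOT taken from golub).
a <- c(1.2, -0.3, 0.8)
b <- c(0.9,  0.4, 1.1)

overall     <- max(a, b)    # one value: the largest across both vectors
per_patient <- pmax(a, b)   # one value per position: the larger of a[i] and b[i]
```

Here `overall` is 1.2 and `per_patient` is 1.2, 0.4, 1.1.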
Ronak Shah

Instead of slicing the rows separately, subset once and take the max:

max(golub[c(900, 1000), gol.fac == "AML"])
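As a rough sketch of subsetting once, assuming a toy matrix `expr` and grouping factor `grp` (both hypothetical, standing in for `golub` and `gol.fac`): two specific rows are picked with `c(i, j)`, whereas `i:j` selects every row in between, and the columns are picked with a logical mask.

```r
# Toy 5-gene x 4-patient matrix, filled column by column (NOT golub).
expr <- matrix(1:20, nrow = 5)
grp  <- factor(c("ALL", "AML", "ALL", "AML"))

two_rows  <- expr[c(2, 4), grp == "AML"]  # only rows 2 and 4, AML columns
row_range <- expr[2:4, grp == "AML"]      # rows 2, 3 and 4, AML columns

largest <- max(two_rows)                  # largest value among the two chosen rows
```

With this toy data `two_rows` is a 2 x 2 matrix and `largest` is 19.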
akrun