I am new to R and I am trying to compare gene expression between tumour and normal samples using a TCGA.GTEX dataset. I want to compute the log fold change for every column.

This is the code I applied:

GELogFoldChanges <- apply(TCGA_GTEX_lung[-7], 2, function(x) log(sum(x[1:1011]/sum(x[1012:1299])))

But this error occurs:

Error in sum(x[1:1011]) : invalid 'type' (character) of argument

What does this error mean and how should I correct this code?

ad absurdum
Tah

1 Answer

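It seems to me that some columns in `TCGA_GTEX_lung[-7]` are character, and `sum` needs numeric input. You can check the column types with `sapply(df, class)`. Note that `apply(df, 2, function(x) class(x))` converts the data frame to a matrix first, so it reports every column as character as soon as one column is character; the same coercion is why your `apply` call fails for all columns. If this is the case, you can convert character to numeric with `as.numeric()`.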
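
A small sketch of that check, using a made-up data frame (`toy` and its columns are hypothetical and only stand in for `TCGA_GTEX_lung`):

```r
# Hypothetical toy data; stands in for the real TCGA_GTEX_lung table
toy <- data.frame(
  geneA = c(1.5, 2.0, 3.5),
  geneB = c("4.1", "n/a", "6.3"),  # stored as text, like the offending column(s)
  stringsAsFactors = FALSE
)

# Per-column classes, read from the data frame itself
col_classes <- sapply(toy, class)

# apply() goes through as.matrix(), which makes every column character here
apply_classes <- apply(toy, 2, class)

# as.numeric() converts what it can; entries it cannot parse become NA (with a warning)
geneB_num <- suppressWarnings(as.numeric(toy$geneB))
```

Printing `col_classes` and `apply_classes` side by side should make the difference between the two checks visible.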

Edit

apply(df,2,function(x) log(sum(suppressWarnings(as.numeric(x))[1:1011]/sum(suppressWarnings(as.numeric(x))[1012:1299], na.rm = TRUE), na.rm = TRUE)))
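
For readability, the same idea can be sketched in separate steps on a made-up data frame. Everything named here (`toy`, `tumour_rows`, `normal_rows`, `toy_num`, `log_ratio`) is hypothetical; in the real data the row ranges would be `1:1011` and `1012:1299`. Dividing each tumour value by the normal sum and then summing gives the same number as dividing the two sums, so the sketch uses the simpler form.

```r
# Hypothetical toy data: rows 1-3 play the role of tumour samples, rows 4-5 of normal
toy <- data.frame(
  geneA = c("2", "4", "6", "1", "3"),
  geneB = c("10", "n/a", "20", "5", "5"),
  stringsAsFactors = FALSE
)
tumour_rows <- 1:3   # 1:1011 in the question
normal_rows <- 4:5   # 1012:1299 in the question

# Convert every column once; entries that are not numbers become NA
toy_num <- as.data.frame(
  lapply(toy, function(col) suppressWarnings(as.numeric(col)))
)

# Natural log of (sum over tumour rows / sum over normal rows), ignoring NA values
log_ratio <- sapply(toy_num, function(x) {
  log(sum(x[tumour_rows], na.rm = TRUE) / sum(x[normal_rows], na.rm = TRUE))
})
```

Converting once up front also makes it easy to count how many values turned into `NA` per column (for example with `colSums(is.na(toy_num))`) before deciding whether ignoring them is acceptable.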
 
OceanSky_U
  • Hi! Thank you for responding. I tried to convert character to numeric using this code: `lapply(TCGA_GTEX_lung, as.numeric)`. I'm not sure if I did this correctly, because this came up: `Warning messages: 1: In lapply(LTG, as.numeric) : NAs introduced by coercion`. Also, when I ran this code again: `GELogFoldChanges <- apply(TCGA_GTEX_lung[-7],2,function(x) log(sum(x[1:1011]/sum(x[1012:1299])))`, the same error showed up. – Tah Jan 30 '21 at 19:33
  • Take a look at this post (https://stackoverflow.com/questions/14984989/how-to-avoid-warning-when-introducing-nas-by-coercion); it explains the meaning of the warning "NAs introduced by coercion". In brief, the warning tells you that some values could not be converted and became NA. I believe this is also the case for you. You can ignore the NAs when you compute `sum` (`na.rm = TRUE`). See the updated code above. @Tah – OceanSky_U Jan 30 '21 at 19:53
  • How do I write `sum(na.rm = TRUE)` into my code? `lapply(LTG, as.numeric, sum(na.rm = TRUE))`? – Tah Jan 30 '21 at 20:01
  • I edited the code above based on your original `apply` function. Did you see it? `apply(TCGA_GTEX_lung[-7],2,function(x) log(sum(as.numeric(x)[1:1011]/sum(as.numeric(x)[1012:1299], na.rm = TRUE), na.rm = TRUE)) ` @Tah – OceanSky_U Jan 30 '21 at 20:04
  • Oh no, sorry, I missed it. I ran the code you gave me, but now a + sign appears in the console. What am I doing wrong? – Tah Jan 30 '21 at 20:17
  • Your original code was missing a `)` at the end. I added it. Try this: `apply(TCGA_GTEX_lung[-7],2,function(x) log(sum(as.numeric(x)[1:1011]/sum(as.numeric(x)[1012:1299], na.rm = TRUE), na.rm = TRUE)))` @Tah – OceanSky_U Jan 30 '21 at 20:24
  • Oh thanks! I tried it and got this warning: `Warning messages: 1: In FUN(newX[, i], ...) : NAs introduced by coercion`. – Tah Jan 30 '21 at 20:46
  • This is just a warning raised when NAs are created. You can suppress it using `suppressWarnings(as.numeric(x))`, as suggested in the post I pointed to: `apply(df,2,function(x) log(sum(suppressWarnings(as.numeric(x))[1:1011]/sum(suppressWarnings(as.numeric(x))[1012:1299], na.rm = TRUE), na.rm = TRUE)))`. Did this solve your question? If so, please accept the answer. @Tah – OceanSky_U Jan 30 '21 at 20:53
  • Thank you so much for helping! I think it worked. – Tah Jan 30 '21 at 21:09
  • Sounds great! Can you accept my answer by clicking on the check mark beside my answer? @Tah – OceanSky_U Jan 30 '21 at 21:13