
I have a data frame that consists of 1000 rows and 156 columns. I'm trying to subtract the first column from the first 38 columns, then subtract column 39 from the next 38, and so on, but I can't find a way to do it. I'm only using ncdf4 and nothing else. Something like this:

C1  C2  C3  C4  C5  C6  C7  C8
1   2   3   4   5   6   4   5
3   4   6   5   4   3   2   7

And I'd like it to be

C1  C2  C3  C4  C5  C6  C7  C8
0   1   2   3   4   5   3   4
0   1   3   2   1   0  -1   4

The logic would be:

Columns 1:38 - Column 1

Columns 39:77 - Column 39

and so on.

4 Answers


I solved it by simply doing:

{
  z[, 1:38]    <- z[, 1:38]    - z[, 1]
  z[, 39:77]   <- z[, 39:77]   - z[, 39]
  z[, 78:118]  <- z[, 78:118]  - z[, 78]
  z[, 119:156] <- z[, 119:156] - z[, 119]
}

Here z is the data frame. It might not be the nicest way, but it did the trick.
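The same assignments can be written as a loop over the block boundaries, which may be easier to adjust if the layout changes. This is only a sketch: `subtract_block_start`, `starts`, and `ends` are names introduced here for illustration, and the small data frame stands in for the real 156-column one.

```r
# Hypothetical helper: subtract the first column of each block from
# every column in that block. `starts` and `ends` are the block boundaries.
subtract_block_start <- function(z, starts, ends) {
  stopifnot(length(starts) == length(ends))
  for (k in seq_along(starts)) {
    cols <- starts[k]:ends[k]
    ref <- z[[starts[k]]]  # copy the reference column before overwriting it
    z[cols] <- lapply(z[cols], function(col) col - ref)
  }
  z
}

# Small stand-in for the real data: 2 rows, 8 columns, blocks 1:4 and 5:8
z <- data.frame(C1 = c(1, 3), C2 = c(2, 4), C3 = c(3, 6), C4 = c(4, 5),
                C5 = c(5, 4), C6 = c(6, 3), C7 = c(4, 2), C8 = c(5, 7))
res <- subtract_block_start(z, starts = c(1, 5), ends = c(4, 8))
print(res)
```

For the blocks used above, the call would presumably be `subtract_block_start(z, starts = c(1, 39, 78, 119), ends = c(38, 77, 118, 156))`.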


Here is a user-defined function. You can add more else if branches as needed.

mydiff <- function(df) {
  out <- df
  for (i in seq_len(ncol(df))) {
    if (i <= 38) {
      # columns 1 to 38: subtract column 1
      out[, i] <- df[, i] - df[, 1]
    } else if (i %in% 39:77) {
      # columns 39 to 77: subtract column 39
      out[, i] <- df[, i] - df[, 39]
    }
  }
  out
}

mydiff(df1)

Output:

 C1 C2 C3 C4 C5 C6 C7 C8
 0  1  2  3  4  5  3  4
 0  1  3  2  1  0 -1  4

Benchmark:

system.time(result<-as.tibble(iris2) %>% 
              select_if(is.numeric) %>% 
              mydiff())

Result:

 user  system elapsed 
   0.02    0.00    0.01 
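If the number of blocks grows, the chain of `else if` branches could be replaced by a lookup of each column's reference column. The following is a sketch under assumptions: `subtract_by_lookup`, `starts`, and `demo` are hypothetical names, and `starts` is assumed to be increasing and to begin at 1.

```r
# Hypothetical helper: `starts` holds the index of the first column of each block.
subtract_by_lookup <- function(df, starts) {
  # For each column, look up the block start at or before its position
  ref <- starts[findInterval(seq_along(df), starts)]
  out <- df
  # Subtract the matching reference column from each column
  out[] <- Map(function(col, base) col - base, df, df[ref])
  out
}

# Example data from the question, split into two assumed blocks (1:4 and 5:8)
demo <- data.frame(C1 = c(1, 3), C2 = c(2, 4), C3 = c(3, 6), C4 = c(4, 5),
                   C5 = c(5, 4), C6 = c(6, 3), C7 = c(4, 2), C8 = c(5, 7))
res <- subtract_by_lookup(demo, starts = c(1, 5))
print(res)
```

`findInterval()` is base R, so this needs no extra packages; only the `starts` vector has to change when the block layout does.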
NelsonGon

You can also do the following without any loop:

# sample data frame
df <- data.frame(matrix(data = seq(1,316),ncol = 158))

# split the data frame into list of data frame having columns
# 1 to 38, 39 to 77 and so on
df <- split.default(df, gl(round(ncol(df)/38),k = 38))

# subtract the last column from each
df <- do.call(cbind, lapply(df, function(f) f - f[,ncol(f)]))
colnames(df) <- paste0('C', seq(1,158))

print(head(df))

   C1  C2  C3  C4  C5
1 -74 -72 -70 -68 -66
2 -74 -72 -70 -68 -66
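The code above subtracts the last column of each group. A variant that subtracts the first column instead, as the question asks, and that takes explicit block starts rather than a fixed width, might look like the sketch below. `starts` and `demo` are assumptions made for illustration.

```r
# Assumed layout: blocks start at columns 1 and 5 of an 8-column frame
starts <- c(1, 5)
demo <- data.frame(C1 = c(1, 3), C2 = c(2, 4), C3 = c(3, 6), C4 = c(4, 5),
                   C5 = c(5, 4), C6 = c(6, 3), C7 = c(4, 2), C8 = c(5, 7))

# Label every column with its block number, derived from the block widths
widths <- diff(c(starts, ncol(demo) + 1))
grp <- rep(seq_along(starts), times = widths)

# Split the columns into one data frame per block
parts <- split.default(demo, grp)

# Subtract each block's first column from all columns of that block
parts <- lapply(parts, function(p) p - p[[1]])

# unname() keeps cbind from prefixing column names with the block label
res <- do.call(cbind, unname(parts))
print(res)
```

Because the blocks are contiguous and labelled in increasing order, the columns come back in their original order with their original names.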
YOLO

You should consider using the tidyverse to solve this. Loading a package into R adds little overhead to your environment and can make your life much easier.

library(tidyverse)

df %>%
  mutate_at(.vars = vars(num_range(prefix = 'C', 1:38)), .funs = function(x) x - .$C1) %>%
  mutate_at(.vars = vars(num_range(prefix = 'C', 39:77)), .funs = function(x) x - .$C39)

  C1 C2 C3 C4 C38 C39 C40 C41 C42 C77
1  0  1  2  3   4   0   1   2   3   4
2  0  0  3  2   4   0   0   3   2   4

Data

df <-
data.frame(
  C1 = c(1, 3),
  C2 = c(2, 3),
  C3 = c(3, 6),
  C4 = c(4, 5),
  C38 = c(5, 7),
  C39 = c(1, 3),
  C40 = c(2, 3),
  C41 = c(3, 6),
  C42 = c(4, 5),
  C77 = c(5, 7)
)
parkerchad81