Hello, I have a matrix such as:

  COL1 COL2 COL3
A "A"  "B"  NA
B "B"  "B"  "C"
C NA   NA   NA
D "B"  "B"  "B"
E NA   NA   "C"
F "A"  "A"  "C"

and I would like to get, for each row (A, B, C, D, etc.), the number of entries that are A or B.

Example:

  Nb
A 2
B 2
C 0
D 3
E 0
F 2

Does anyone have an idea?

Sotos
chippycentra

3 Answers

Another way is to use sapply:

df$n <- sapply(seq_len(nrow(df)), function(i) sum(unlist(df[i, ]) %in% c('A', 'B')))

#   COL1 COL2 COL3 n
# A    A    B <NA> 2
# B    B    B    C 2
# C <NA> <NA> <NA> 0
# D    B    B    B 3
# E <NA> <NA>    C 0
# F    A    A    C 2

You can achieve the same output by using purrr::map_dbl as well. Just replace sapply with map_dbl.

AlexB

You can try a base R solution with apply():

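# Base R: count the entries in each row that are 'A' or 'B'
# (%in% already returns FALSE for NA, so no separate NA check is needed)
df$Var <- apply(df, 1, function(x) sum(x %in% c('A', 'B')))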

Output:

  COL1 COL2 COL3 Var
A    A    B <NA>   2
B    B    B    C   2
C <NA> <NA> <NA>   0
D    B    B    B   3
E <NA> <NA>    C   0
F    A    A    C   2
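
Since the question mentions a matrix rather than a data frame, a vectorised variant may also be worth considering. The following is only a sketch: the object names `m` and `nb` are assumptions made for illustration, and the matrix is rebuilt by hand from the sample shown in the question.

```r
# Hypothetical matrix version of the sample data (the name `m` is an assumption)
m <- matrix(
  c("A", "B", NA,
    "B", "B", "C",
    NA,  NA,  NA,
    "B", "B", "B",
    NA,  NA,  "C",
    "A", "A", "C"),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D", "E", "F"), c("COL1", "COL2", "COL3"))
)

# Comparing NA with a string yields NA, so na.rm = TRUE drops those cells
nb <- rowSums(m == "A" | m == "B", na.rm = TRUE)
```

This avoids the per-row function call of apply(), which may matter on large inputs, and it returns a numeric vector named after the rows.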

Some data used:

#Data
df <- structure(list(COL1 = c("A", "B", NA, "B", NA, "A"), COL2 = c("B", 
"B", NA, "B", NA, "A"), COL3 = c(NA, "C", NA, "B", "C", "C")), row.names = c("A", 
"B", "C", "D", "E", "F"), class = "data.frame")

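Or, if you are curious about the tidyverse:

library(tidyverse)
# Count the 'A'/'B' cells per row in long format, then join the counts back
df %>% mutate(id = 1:n()) %>%
  left_join(df %>% mutate(id = 1:n()) %>%
              pivot_longer(cols = -id) %>%
              filter(value %in% c('A', 'B')) %>%
              group_by(id) %>%
              summarise(Var = n()),
            by = "id") %>%
  # Rows without any match have no count after the join, so set only Var to 0
  mutate(Var = replace_na(Var, 0L)) %>%
  select(-id)

Output:

  COL1 COL2 COL3 Var
1    A    B <NA>   2
2    B    B    C   2
3 <NA> <NA> <NA>   0
4    B    B    B   3
5 <NA> <NA>    C   0
6    A    A    C   2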
Duck
  • Thank you. How can I adapt this to a threshold, for instance if I want to count the number of values between (0-4) and (6-10)? – chippycentra Sep 14 '20 at 14:47
  • @chippycentra You could add a `filter()`, and in that case the `tidyverse` option should be useful. If I understand correctly, you want to filter the final variable so that it is between `(0-4)` and `(6-10)`, so I would add a new pipe like `filter(Var>0 & Var<4 | Var>6 & Var<10)`. Let me know if that is what you are looking for! – Duck Sep 14 '20 at 14:52
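
If the follow-up question is instead read as counting, per row, how many numeric cells fall inside the two ranges, something along these lines might work. This is a sketch under stated assumptions: the data, the names `num`, `in_range` and `nb_range`, and the choice of inclusive bounds are all hypothetical and not taken from the thread.

```r
# Hypothetical numeric data; values and the name `num` are illustrative only
num <- matrix(
  c(1,  5, 7,
    NA, 3, 10,
    5,  5, NA),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), c("COL1", "COL2", "COL3"))
)

# TRUE where a value lies in [0, 4] or [6, 10]; bounds are assumed inclusive
in_range <- (num >= 0 & num <= 4) | (num >= 6 & num <= 10)

# Count the matching cells per row, ignoring NA
nb_range <- rowSums(in_range, na.rm = TRUE)
```

For strict bounds, swapping `>=`/`<=` for `>`/`<` should be enough.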
library(dplyr)

df <- structure(list(COL1 = c("A", "B", NA, "B", NA, "A"),
                     COL2 = c("B", "B", NA, "B", NA, "A"),
                     COL3 = c(NA, "C", NA, "B", "C", "C")),
                row.names = c("A", "B", "C", "D", "E", "F"),
                class = "data.frame")

# Flag each cell as 1 if it is "A" or "B", then sum the flags within each row
df %>%
  rowwise() %>%
  mutate(sumVar = across(c(COL1:COL3), ~ifelse(. %in% c("A", "B"), 1, 0)) %>% sum)
# A tibble: 6 x 4
# Rowwise: 
  COL1  COL2  COL3  sumVar
  <chr> <chr> <chr>  <dbl>
1 A     B     NA         2
2 B     B     C          2
3 NA    NA    NA         0
4 B     B     B          3
5 NA    NA    C          0
6 A     A     C          2
jyjek