3

I am a behavioral ecologist using R. I am trying to count the non-zero elements in each of several columns. Normally I do this with colSums and the !=0 operator, as suggested in many posts here and elsewhere (e.g. "Count the number of non-zero elements of each column").

However, in this particular case I would really prefer to use piping, as this is just one step in building a much larger data frame, and I cannot get colSums with !=0 to play nicely with the pipe. Is there any way to get this to work, or is there a more elegant alternative somewhere in the tidyverse for counting non-zero values across columns?

I have put some example code below to demonstrate. Thanks very much!

'''

library(dplyr)  # provides %>%, select(), summarise(), across()

#make example dataframe
example<- as.data.frame(matrix(sample(c(0,0,0,100),size=70,replace=T),ncol=7))
example<-cbind(Indentifier1="character a",Indentifier2="character b", example)
example


#works great but isn't piping friendly:
colSums(example[-c(1:2)] !=0)  

#Runs but because it lacks the !=0 operator it gives sums not counts of non-zero elements, can't figure out how to employ !=0:
example %>% select(-c(1:2)) %>% colSums() 


#gives me counts of all values, not sure how to call just non-zero values:
example %>% summarise(across(where(is.numeric), length))
 

'''

markus
N.F.F.

5 Answers

2

example %>% summarise(across(where(is.numeric), ~sum(. != 0)))

Bloxx
  • 1
    This works perfectly! Thank you. Can you explain the function of the ~ and the . here, so I can employ this sort of method elsewhere? Thanks! – N.F.F. Mar 17 '22 at 22:58
  • 2
    Check out `?across`. The `~` is syntax to define a formula where the values (`.`) should be compared to 0, and the results summed -- in other words count the nonzeros. – Jon Spring Mar 17 '22 at 23:04
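
To make the `~` and `.` shorthand concrete, here is a minimal sketch comparing it with an explicit anonymous function. The data frame `toy` and the names `res_formula` and `res_function` are made up for illustration and are not from the question; the sketch assumes dplyr >= 1.0.0 is installed.

```r
library(dplyr)

# Hypothetical toy data (not the asker's), small enough to check by eye
toy <- data.frame(id = c("a", "b", "c"),
                  x  = c(0, 100, 100),
                  y  = c(0, 0, 100))

# Formula shorthand, as in the answer: `.` stands for the current column
res_formula <- toy %>%
  summarise(across(where(is.numeric), ~ sum(. != 0)))

# The same computation written as an ordinary anonymous function
res_function <- toy %>%
  summarise(across(where(is.numeric), function(col) sum(col != 0)))

res_formula
#   x y
# 1 2 1
```

Both calls should give the same one-row result, so the shorthand can be swapped for a named or anonymous function whenever that reads more clearly.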
0

If you want to go the full magrittr way, read about ?alias.

library(magrittr)
example %>% 
  extract(-c(1, 2)) %>%
  equals(0) %>%
  not() %>%
  colSums()
# V1 V2 V3 V4 V5 V6 V7 
# 2  3  1  4  3  4  1 

data

set.seed(42)
example<- as.data.frame(matrix(sample(c(0,0,0,100),size=70,replace=T),ncol=7))
example<-cbind(Indentifier1="character a",Indentifier2="character b", example)
example


#works great but isn't piping friendly:
colSums(example[-c(1:2)] !=0) 
# V1 V2 V3 V4 V5 V6 V7 
# 2  3  1  4  3  4  1 
markus
  • 1
    Thanks! I think the magrittr package might mask some other things I need from other packages elsewhere in my script but filing this away for future use! – N.F.F. Mar 17 '22 at 23:01
0

Get the logical matrix you'd like to compute on, and pipe it to colSums():

(example |> select(-(1:2)) != 0) |> 
    colSums()

Of course this is tricky, because it relies on two things: operator precedence (the pipe binds more tightly than !=), which means example |> select(-(1:2)) is evaluated before the comparison to 0, and the explicit grouping with (), which is required so that the inequality is evaluated before the result is piped to colSums().
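
As a rough illustration of why the parentheses matter, the sketch below uses a made-up data frame `toy` (not the asker's data) and base R >= 4.1 only. Without the grouping, `0 |> colSums()` is evaluated first, and `colSums(0)` fails because 0 is not a two-dimensional object.

```r
# Hypothetical toy data, for illustration only
toy <- data.frame(x = c(0, 100, 100), y = c(0, 0, 100))

# With parentheses: the comparison runs first, then its result is piped
counts <- (toy != 0) |> colSums()
counts
# x y 
# 2 1 

# Without parentheses this parses as toy != (0 |> colSums()),
# so colSums(0) is called and throws an error
bad <- try(toy != 0 |> colSums(), silent = TRUE)
inherits(bad, "try-error")
# [1] TRUE
```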

Martin Morgan
0

You can still use piping with base R code. For example, you could do something like this:

example %>%
  .[-c(1:2)] %>%
  replace(., . != 0, 1) %>%
  colSums()

#V1 V2 V3 V4 V5 V6 V7 
# 2  3  1  2  4  1  1 
AndrewGB
0

Just using base R (>= 4.1) you may do

example |>
  subset(select=-(1:2)) |>
  {\(.) colSums(. != 0)}()
# V1 V2 V3 V4 V5 V6 V7 
#  4  4  3  3  3  2  2 
jay.sf