I have a fairly large data.table and have been trying to return a list or vector of per-column counts, each measured against a default value specific to that column (the defaults vary per column). It is organized as follows:
library(data.table)
set.seed(1)
DT = as.data.table(matrix(sample(1:100, 100*100, TRUE), 100, 100))
#DT output below
param1 param2 param3 ... param100 #column names
1 1 1 ... 1 #first row = default values
. #elems in remaining rows are random
. # a param can be set to non default
1 666 1 ... 143 # or default values within a column
.
.
10000 1 1 ... 420
What is the data.table way of doing this? I have been sifting through the documentation and past questions, and I am trying to avoid for loops and other approaches that are heavy on memory and computation. I have tried filter, lapply, and grouping, without luck.
An analogous example of what I am ideally looking for is with counting the number of non-NA values that exist per column:
count <- colSums(!is.na(DT))
#which outputs the following:
print(count)
param1 param2 param3 ... param177
1 292 0 ... 7
Is there a way to do this similar to the colSums(!is.na(DT)) method, but using a default value specific to each column? Instead of counting the non-NA values in each column, I would be counting the non-default values in each column of my DT, where each column's default value is the one in its first row.
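To make the goal concrete, here is a minimal sketch of one possible approach, not necessarily the most efficient one. It assumes the first row holds each column's default, uses a small made-up table (the values are hypothetical), and treats NA comparisons as not counting, which may or may not match the real requirement. It relies only on `.SD` with `lapply`, so each column is compared against its own first element without copying the whole table.

```r
library(data.table)

# Toy table (hypothetical data): row 1 holds each column's default value.
DT <- data.table(
  param1 = c(1, 1, 2, 1),
  param2 = c(5, 5, 5, 5),
  param3 = c(0, 3, 4, 0)
)

# For each column, compare every value with the first-row default and
# count the mismatches. The first row always matches itself, so it adds 0.
# na.rm = TRUE drops NA comparisons (an assumption about how NAs should count).
non_default <- DT[, lapply(.SD, function(x) sum(x != x[1L], na.rm = TRUE))]

# Flatten the one-row result into a named vector, like colSums() returns.
count <- unlist(non_default)
print(count)
```

The result has the same shape as the colSums(!is.na(DT)) output: one named count per column.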