
I have this matrix.

mat <- c("A","NODATA","NODATA","NODATA","A","B","C","NODATA","A","B","C","NODATA","D","E","A","NODATA","D","B","A","NODATA")
mat2 <- matrix(mat, nrow = 4, ncol = 5)
mat3 <- t(mat2)
colnames(mat3) <- c("col1","col2","col3","col4")
mat3

     col1 col2     col3     col4    
[1,] "A"  "NODATA" "NODATA" "NODATA"
[2,] "A"  "B"      "C"      "NODATA"
[3,] "A"  "B"      "C"      "NODATA"
[4,] "D"  "E"      "A"      "NODATA"
[5,] "D"  "B"      "A"      "NODATA"

I want to convert it into a data frame of 0/1 indicators like the one below in R, with one column per unique value.

A B C D E NODATA
1 0 0 0 0 1
1 1 1 0 0 1
1 1 1 0 0 1
1 0 0 1 1 1
1 1 0 1 0 1

Do you have any idea how to do this?

Thank you.

h-y-jp

2 Answers

library(dplyr)
library(tidyr)  # pivot_wider() comes from tidyr, not dplyr
data.frame(rows=seq_len(nrow(mat3))[row(mat3)], values=c(mat3)) %>%
  mutate(a=1) %>%
  pivot_wider(id_cols="rows", names_from="values", values_from="a", values_fn=list(a=length)) %>%
  mutate_all(~ +!is.na(.)) %>%
  select(-rows) %>%
  select(sort(colnames(.)))
# # A tibble: 5 x 6
#       A     B     C     D     E NODATA
#   <int> <int> <int> <int> <int>  <int>
# 1     1     0     0     0     0      1
# 2     1     1     1     0     0      1
# 3     1     1     1     0     0      1
# 4     1     0     0     1     1      1
# 5     1     1     0     1     0      1

The first line (data.frame(...)) was suggested by https://stackoverflow.com/a/26838774/3358272.
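To see what that first line produces before the pivot, here is a minimal sketch that rebuilds the question's matrix and inspects the intermediate long-format data frame. The name `long` is only illustrative, and `c(row(mat3))` is used as an equivalent shorthand for the row-index expression in the answer.

```r
# Rebuild the question's matrix so this sketch runs on its own.
mat3 <- matrix(c("A", "NODATA", "NODATA", "NODATA",
                 "A", "B", "C", "NODATA",
                 "A", "B", "C", "NODATA",
                 "D", "E", "A", "NODATA",
                 "D", "B", "A", "NODATA"),
               nrow = 5, byrow = TRUE,
               dimnames = list(NULL, paste0("col", 1:4)))

# row(mat3) holds the row index of every cell; c() flattens both
# matrices column by column, so the two vectors stay aligned.
long <- data.frame(rows = c(row(mat3)), values = c(mat3))

# One record per cell of the original matrix (5 rows x 4 columns = 20).
nrow(long)
head(long)
```

Each record pairs a row number with the value found in one cell, which is the shape pivot_wider() needs in order to spread the values into columns.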

r2evans

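Here is a base R approach. We first create a matrix of zeroes with one row per row of the original matrix and one column per unique value in it. Then we convert the original matrix to (row, value) pairs that indicate where a 1 should be placed. Because cbind() makes these pairs a character matrix, indexing with them matches each pair against the row and column names of the new matrix, which is why the row names are set as character strings.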

mat3_pairs <- cbind(c(row(mat3)), c(mat3))
new_mat <- matrix(rep(0, length(unique(mat3_pairs[,2])) * nrow(mat3)), nrow = nrow(mat3))
colnames(new_mat) <- sort(unique(mat3_pairs[,2]))
rownames(new_mat) <- as.character(1:nrow(mat3))
new_mat[mat3_pairs] <- 1
new_mat

Output

  A B C D E NODATA
1 1 0 0 0 0      1
2 1 1 1 0 0      1
3 1 1 1 0 0      1
4 1 0 0 1 1      1
5 1 1 0 1 0      1
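
The assignment new_mat[mat3_pairs] <- 1 relies on indexing a matrix with a two-column character matrix. As a minimal sketch on toy data (the names `m` and `idx` are made up for illustration), each row of the index is looked up as a (row name, column name) pair:

```r
# A 2 x 3 matrix of zeroes with character row and column names.
m <- matrix(0, nrow = 2, ncol = 3,
            dimnames = list(c("1", "2"), c("A", "B", "C")))

# Each row of idx is one (row name, column name) pair.
idx <- cbind(c("1", "2", "2"), c("A", "A", "C"))

# Set exactly those three cells to 1; every other cell stays 0.
m[idx] <- 1
m
```

Repeated pairs simply assign 1 to the same cell again, which is why duplicate values within a row of mat3 (such as the three "NODATA" entries in row 1) still give a single 1.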
Ben