
I am looking for a way to make my code more efficient. This is my data, Data:

Type District
A 1
B 1
A 2
C 1
B 1
C 2
A 2
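
For reproducibility, the data above can be rebuilt as a data frame. This is only a sketch: the column types (Type as character, District as integer) are an assumption, and the real data may store them differently.

```r
# Rebuild the example data shown above.
# Assumption: Type is character and District is integer.
Data <- data.frame(
  Type     = c("A", "B", "A", "C", "B", "C", "A"),
  District = c(1L, 1L, 2L, 1L, 1L, 2L, 2L),
  stringsAsFactors = FALSE
)
```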

I want to obtain a table like this:

  1    2
A Freq Freq
B Freq Freq
C Freq Freq

Here Freq is the number of rows of each Type (e.g. A) in each District (1 and 2), so cell (1,1), i.e. Type A in District 1, should equal 1. My code is very "manual" right now:

test1<-as.data.frame(table(Data[which(Data$Type=="A"),2]))
test2<-as.data.frame(table(Data[which(Data$Type=="B"),2]))
test3<-as.data.frame(table(Data[which(Data$Type=="C"),2]))
library(plyr)
test<-join_all(list(test1,test2,test3),by="Var1",type="left") #Var1 is created by R and corresponds to the districts
test <- data.frame(test[,-1], row.names = test[,1])

What I want is a function that can do this without having to create all these test1/test2/test3 data frames by hand. In this example I have 3 types, but in my real problem I have 9 Types for 31 districts, so the manual approach is very inefficient. I imagine sapply or a similar function would work, but I don't know how to write the code. Can someone help me?

Thank you!

Léo

3 Answers

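library(tidyverse)

# df is the data frame from the question (named Data there)
df %>%
  count(Type, District) %>%          # one row per Type/District pair, count in n
  pivot_wider(names_from = District, # spread the districts into columns
              values_from = n)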

# A tibble: 3 x 3
  Type    `1`   `2`
  <chr> <int> <int>
1 A         1     2
2 B         2    NA
3 C         1     1
Chamkrai
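
The NA for Type B in District 2 in the output above marks a combination that never occurs in the data. If a zero count is preferred, one possible variant is to pass `values_fill` to `pivot_wider`. The sketch below rebuilds the example data under the assumption that `df` has a character `Type` column and an integer `District` column.

```r
library(dplyr)
library(tidyr)

# Example data from the question (column types are an assumption)
df <- data.frame(
  Type     = c("A", "B", "A", "C", "B", "C", "A"),
  District = c(1L, 1L, 2L, 1L, 1L, 2L, 2L)
)

wide <- df %>%
  count(Type, District) %>%
  pivot_wider(names_from  = District,
              values_from = n,
              values_fill = 0L)  # missing Type/District pairs become 0, not NA

wide
```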

Using the data.table package:

library(data.table)

dcast(as.data.table(df), Type ~ District, fun.aggregate = length)

     Type     1     2
1:      A     1     2
2:      B     2     0
3:      C     1     1
  • Thank you for your answer. It works, but I get a first line where the Type is NA for no apparent reason (I checked my data and there is no NA in my Type column). – Léo Nov 05 '22 at 16:23
  • By _first line_ do you mean the first row or the column names? Can you run `anyNA(df[, c("Type", "District")])` and let me know the result? – B. Christian Kamgang Nov 05 '22 at 17:10
test <- as.data.frame(unclass(table(Data)))
Ric
  • That works perfectly, thank you! However, I don't really understand why you use unclass() here? – Léo Nov 05 '22 at 16:32
  • It seems that `as.data.frame`, when given a matrix of class "table" (see `class(table(Data))`), expands the dimensions with `expand.grid` (run `print(as.data.frame.table)` to see the source code), because the long format is preferred in most use cases, such as filtering and plotting. So we need to `unclass` it first to remove the class, so that the underlying matrix is converted "as is" and not expanded. – Ric Nov 05 '22 at 17:01
  • Also, tables in R can be multi-dimensional arrays with more than two dimensions (when your data have more than two categorical columns), which is why the default output is a long format that handles all cases. – Ric Nov 05 '22 at 17:03
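
The difference described in these comments can be checked directly in base R. The sketch below rebuilds the example data (the column types are an assumption) and compares the long result of `as.data.frame` on a "table" object with the wide result obtained after `unclass`. Calling `as.data.frame.matrix` on the table is shown as one alternative that should give the same wide layout.

```r
# Example data from the question (column types are an assumption)
Data <- data.frame(
  Type     = c("A", "B", "A", "C", "B", "C", "A"),
  District = c(1, 1, 2, 1, 1, 2, 2)
)

tab <- table(Data)                    # 3 x 2 contingency table of class "table"

long <- as.data.frame(tab)            # as.data.frame.table: one row per cell
wide <- as.data.frame(unclass(tab))   # plain matrix: the 3 x 2 layout is kept

dim(long)   # 6 rows (3 Types x 2 Districts), columns Type, District, Freq
dim(wide)   # 3 rows (A, B, C), 2 columns (1, 2)

# Alternative: call the matrix method explicitly instead of unclass()
wide2 <- as.data.frame.matrix(tab)
```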