I am new to programming in R, and this is my first question here on Stack Overflow.

Let's say that I have a data frame with 4 columns:
(1) Individual ID (numeric);
(2) Morality of the individual (factor);
(3) The city (factor);
(4) Number of books possessed (numeric).

Person_ID <- c(1,2,3,4,5,6,7,8,9,10) 
Morality <- c("Bad guy","Bad guy","Bad guy","Bad guy","Bad guy",
          "Good guy","Good guy","Good guy","Good guy","Good guy") 
City <- c("NiceCity", "UglyCity", "NiceCity", "UglyCity", "NiceCity", 
         "UglyCity", "NiceCity", "UglyCity", "NiceCity", "UglyCity") 
Books <- c(0,3,6,9,12,15,18,21,24,27)
mydf <- data.frame(Person_ID, City, Morality, Books)

I am using this code to count the individuals in each category of the variable Morality in each city:

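library(reshape2)  # melt(), dcast()
library(magrittr)  # %>%

# id.vars (not idvars) is the argument name that melt() recognizes
mycounts<-melt(mydf,
               id.vars = c("City"),
               measure.vars = c("Morality"))%>%
  dcast(City~variable+value,
        value.var="value",fill=0,fun.aggregate=length)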

After stripping the "Morality_" prefix from the column names, the code gives this table of counts:

names(mycounts)<-gsub("Morality_","",names(mycounts))
mycounts
      City Bad guy Good guy
1 NiceCity       3        2
2 UglyCity       2        3

I wonder if there is a similar way to use dcast() for numerical variables (inside the same script), e.g. to get the sum of the Books possessed by all individuals living in each city:

#>       City Bad guy Good guy Books
#> 1 NiceCity       3        2 [Total number of books in NiceCity]
#> 2 UglyCity       2        3 [Total number of books in UglyCity]

1 Answer

Do you mean something like this?

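mydf %>%
  melt(
    # melt() takes id.vars (idvars is silently ignored);
    # Books is kept as an id variable so dcast() can sum it
    id.vars = c("City", "Books"),
    measure.vars = c("Morality")
  ) %>%
  dcast(
    City ~ variable + value,
    value.var = "Books",
    fill = 0,
    fun.aggregate = sum
  )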
#>       City Morality_Bad guy Morality_Good guy
#> 1 NiceCity               18                42
#> 2 UglyCity               12                63
Iaroslav Domin
  • Hi, thank you for the reply. That was not my question, though; I apologize if I wasn't clear enough. For each city, I would like a table with two columns holding the counts for each category of the categorical variable `Morality` and a third column holding the sum of the numerical variable `Books`. – newbie_but_wtl Nov 08 '19 at 18:22
  • `missing_column <- aggregate(Books ~ City, mydf, sum)` calculates the total `Books` by city. Yet if I append this call with `%>%`, e.g. to your code, it won't work. – newbie_but_wtl Nov 08 '19 at 22:03
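One possible way to get the table described in the comments is to build the two summaries separately and join them on `City`, rather than chaining `aggregate()` onto the pipe. The pipe passes the `dcast()` result as the first argument of `aggregate()`, which is probably why appending it fails. The sketch below is only an illustration under that assumption: it rebuilds the example data, and the names `counts`, `books` and `result` are made up for this example.

```r
library(reshape2)

# Same example data as in the question
mydf <- data.frame(
  Person_ID = 1:10,
  City      = rep(c("NiceCity", "UglyCity"), times = 5),
  Morality  = rep(c("Bad guy", "Good guy"), each = 5),
  Books     = seq(0, 27, by = 3)
)

# Number of individuals per Morality category in each city (wide format)
counts <- dcast(mydf, City ~ Morality, value.var = "Person_ID",
                fun.aggregate = length)

# Total number of books per city
books <- aggregate(Books ~ City, data = mydf, FUN = sum)

# Join the two summaries on the shared City column
result <- merge(counts, books, by = "City")
result
#>       City Bad guy Good guy Books
#> 1 NiceCity       3        2    60
#> 2 UglyCity       2        3    75
```

Because the data is already in long format, `dcast()` can be applied to `mydf` directly here, so the `melt()` step is not needed for the counts.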