I have trade data with year, month, commodity, and quantity. I want to create the total quantity, X_total, per commodity, per month, per year and have it appear as a new variable with the same value for each observation within that group.

For example:

What I have:

Year  Month  Commodity  X_Quantity
2010  1      apples     10
2010  1      bananas    5
2010  2      apples     9
2010  2      bananas    4

What I want to see is:

Year  Month  Commodity  X_Quantity  X_total
2010  1      apples     10          15
2010  1      bananas    5           15
2010  2      apples     9           13
2010  2      bananas    4           13

My code so far looks like:

totals <- original.data[c("Year", "Month", "Commodity", "X_Quantity")] %>%
  group_by(Year, Month, Commodity) %>%
  summarise(X_total = sum(X_Quantity)) %>%
  arrange(Year, Month, desc(X_total)) %>%
  ungroup()

I have been using mutate to create previous variables.

I'd love to keep the X_Quantity variable so I can eventually create an X_share variable by dividing the quantity by the X_total for each commodity.

I hope that makes sense. Please forgive any posting errors I've committed (this is my first post).

Thanks so much in advance.

hraw45

1 Answer


Try this. To get the expected output, group by Year and Month only (not Commodity), and use mutate() rather than summarise() so that every original row is kept. Here is the code:

library(dplyr)
#Code
newdf <- Totals %>% group_by(Year,Month) %>% mutate(X_total=sum(X_Quantity),
                                           X_share=X_Quantity/X_total)

Output:

# A tibble: 4 x 6
# Groups:   Year, Month [2]
   Year Month Commodity X_Quantity X_total X_share
  <int> <int> <chr>          <int>   <int>   <dbl>
1  2010     1 apples            10      15   0.667
2  2010     1 bananas            5      15   0.333
3  2010     2 apples             9      13   0.692
4  2010     2 bananas            4      13   0.308

Some data used:

#Data
Totals <- structure(list(Year = c(2010L, 2010L, 2010L, 2010L), Month = c(1L, 
1L, 2L, 2L), Commodity = c("apples", "bananas", "apples", "bananas"
), X_Quantity = c(10L, 5L, 9L, 4L)), class = "data.frame", row.names = c(NA, 
-4L))
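
If dplyr is not available, the same per-group total can be sketched in base R with ave(), which returns a vector the same length as its input with the group statistic repeated on every row. This is only a minimal sketch and not part of the answer above; the data frame name trade is hypothetical and simply mirrors the sample data.

```r
# Hypothetical data frame mirroring the sample data in the question
trade <- data.frame(
  Year = c(2010L, 2010L, 2010L, 2010L),
  Month = c(1L, 1L, 2L, 2L),
  Commodity = c("apples", "bananas", "apples", "bananas"),
  X_Quantity = c(10L, 5L, 9L, 4L)
)

# ave() applies FUN within each Year/Month group and
# repeats the group result on every row of that group
trade$X_total <- ave(trade$X_Quantity, trade$Year, trade$Month, FUN = sum)

# Share of each commodity in its month's total
trade$X_share <- trade$X_Quantity / trade$X_total

print(trade)
```

As with mutate(), this keeps one row per observation, so X_Quantity stays available for the share calculation.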
Duck