0

I have a database like this:

Year   Type   Return
1900     A      4
1900     B      7
1901     A      87
1901     B      3
1902     A      9.7
1902     B      2

I would like to transform the return absolute values into yearly percentages.

Eg. For 1900 I would like to have:

4/(4+7) = 0,36
7/(4+7) = 0,64

And:

Year   Type   Return
1900     A      36%
1900     B      64%

The real database has >10k lines of rows.

zx8754
  • 52,746
  • 12
  • 114
  • 209
PCRL
  • 111
  • 5
  • library(dplyr); Df %>% group_by(year) %>% mutate(res = return/sum(return)). You can add * 100 if you want percentages for report; if you want to analyze further easier to use ratios. Df is the data frame with the data. – missuse Nov 02 '21 at 09:39

1 Answers1

0

If the Type is always the seq A,B:

Data:

df <- data.frame(data.table::fread("Year Type Return
1900 A 4
1900 B 7
1901 A 87
1901 B 3
1902 A 9.7
1902 B 2"))

Code:

library(tidyverse)
df %>% 
  group_by(Year) %>% 
  summarise(Return = Return / sum(Return)) %>% 
  ungroup() %>% 
  mutate(Type = df$Type) %>% 
  relocate(Year, Type, Return) %>% 
  mutate(Return = round(Return, 2))

Output

   Year Type  Return
  <int> <chr>  <dbl>
1  1900 A       0.36
2  1900 B       0.64
3  1901 A       0.97
4  1901 B       0.03
5  1902 A       0.83
6  1902 B       0.17

A prettier output:

df %>% 
  group_by(Year) %>% 
  summarise(Return = Return / sum(Return)) %>% 
  ungroup() %>% 
  mutate(Type = df$Type) %>% 
  relocate(Year, Type, Return) %>% 
  mutate(Return = scales::percent(Return))

   Year Type  Return
  <int> <chr> <chr> 
1  1900 A     36%   
2  1900 B     64%   
3  1901 A     97%   
4  1901 B     3%    
5  1902 A     83%   
6  1902 B     17% 
MonJeanJean
  • 2,876
  • 1
  • 4
  • 20