I have a Lorenz curve plot whose areas are filled by a two-level factor (male and female). This was simple enough, and overlap was not an issue because there were only two levels.

library(dplyr)     # for %>%
library(ggplot2)
library(gglorenz)  # for stat_lorenz()

Wage %>%
  ggplot(aes(x = salary, fill = gender)) +
  stat_lorenz(geom = "polygon", alpha = 0.65) +
  geom_abline(linetype = "dashed") +
  coord_fixed() +
  scale_fill_hue() +
  theme(legend.title = element_blank()) +
  labs(x = "Cumulative Percentage of Observations",
       y = "Cumulative Percentage of Wages",
       title = "Lorenz curve by sex")

This produces the following graph:

[Figure: Fill by two factors]

However, when the factor has more than two levels (four in this case), the overlap becomes a serious problem even with contrasting colors. Changing alpha does not help much at this stage. Have a look:

Wage %>%
  ggplot(aes(x = salary, fill = Diploma)) +
  stat_lorenz(geom = "polygon", alpha = 0.8) +
  geom_abline(linetype = "dashed") +
  coord_fixed() +
  scale_fill_manual(values = c("green", "blue", "black", "white")) +
  theme(legend.title = element_blank()) +
  labs(x = "Cumulative Percentage of Observations",
       y = "Cumulative Percentage of Wages",
       title = "Lorenz curve by diploma")

[Figure: Fill by four factors]

At this point I've tried all sorts of color palettes: hue, brewer, manual, etc. I've also tried reordering the factor levels but, as you can imagine, that did not work either.

What I need is probably a single argument or function that stacks all these areas on top of each other so that each keeps its own distinct color. Funnily enough, I could not find what I was looking for, so I decided to ask for help.

Thanks a lot.

Emir Dakin
  • Do you have to use `fill` and create the area, or can you use the `color` or `colour` argument instead, which gives you lines? – Érico Patto Nov 14 '20 at 13:33
  • Thanks @ÉricoPatto. When I do that, it automatically fills the area between the lines with black, even if I delete the `scale_fill_manual` call. What do you think might be causing it? I used `ggplot(aes(x = salary, colour = Diploma))`. **Edit**: using `stat_lorenz(desc = FALSE)` instead of `stat_lorenz(geom = "polygon", alpha = 0.8)` fixes the issue. – Emir Dakin Nov 14 '20 at 13:50
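
For readers following the comments, here is a minimal sketch of the line-based variant they describe. It maps `colour` instead of `fill` and leaves `stat_lorenz()` on its default geom, which as far as I know is a path, so nothing gets filled. `wage_demo` is simulated stand-in data made up for illustration; it is not the `Wage` data from the question.

```r
library(ggplot2)
library(gglorenz)

# Hypothetical stand-in for the Wage data: a salary column and a
# four-level Diploma factor (assumption: the real data has this shape).
set.seed(1)
wage_demo <- data.frame(
  salary  = rlnorm(400, meanlog = 10,
                   sdlog = rep(c(0.3, 0.5, 0.7, 0.9), each = 100)),
  Diploma = factor(rep(c("A", "B", "C", "D"), each = 100))
)

# colour (not fill) plus the default geom draws one line per level,
# so the curves cannot hide each other the way filled areas do.
p_lines <- ggplot(wage_demo, aes(x = salary, colour = Diploma)) +
  stat_lorenz() +
  geom_abline(linetype = "dashed") +
  coord_fixed() +
  theme(legend.title = element_blank()) +
  labs(x = "Cumulative Percentage of Observations",
       y = "Cumulative Percentage of Wages",
       title = "Lorenz curve by diploma (lines)")

p_lines
```

Lines lose the filled look, but every level stays readable no matter how many there are.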

1 Answer

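The problem was solved by a dear friend. Instead of mapping `fill` to the factor in a single `stat_lorenz()` call, we add one `stat_lorenz()` layer per level of the factor. ggplot2 draws layers in the order they are added, so the order of the calls (levels 3, 4, 2, 1 here) determines which area ends up on top.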

library(ggplot2)
library(gglorenz)
library(wesanderson)  # for wes_palette()

lv <- levels(Wage$Diploma)

# One polygon layer per level; later layers are drawn on top of earlier ones.
ggplot(Wage, aes(x = salary, fill = Diploma)) +
  stat_lorenz(data = Wage[Wage$Diploma == lv[3], ], geom = "polygon") +
  stat_lorenz(data = Wage[Wage$Diploma == lv[4], ], geom = "polygon") +
  stat_lorenz(data = Wage[Wage$Diploma == lv[2], ], geom = "polygon") +
  stat_lorenz(data = Wage[Wage$Diploma == lv[1], ], geom = "polygon") +
  geom_abline(linetype = "dashed") +
  coord_fixed() +
  scale_fill_manual(values = wes_palette("GrandBudapest2", n = 4)) +
  theme(legend.title = element_blank()) +
  labs(x = "Cumulative Percentage of Observations",
       y = "Cumulative Percentage of Wages",
       title = "Lorenz curve by diploma")

Which yields:

[Figure: Stacked factors]
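
The drawing order above was picked by hand. As a rough sketch, it could also be derived from the data: my understanding is that the polygon drawn by `stat_lorenz(geom = "polygon")` covers the area between the curve and the diagonal, which grows with the Gini coefficient, so drawing the most unequal level first should put the largest area at the back. `wage_demo` is simulated stand-in data and `gini()` is a hypothetical helper written for this example; neither comes from the original data or from gglorenz. Lorenz curves can cross, so this ordering is a heuristic, not a guarantee.

```r
library(ggplot2)
library(gglorenz)

# Hypothetical stand-in for the Wage data (simulated, for illustration only).
set.seed(1)
wage_demo <- data.frame(
  salary  = rlnorm(400, meanlog = 10,
                   sdlog = rep(c(0.3, 0.5, 0.7, 0.9), each = 100)),
  Diploma = factor(rep(c("A", "B", "C", "D"), each = 100))
)

# Hypothetical helper: Gini coefficient computed from the sorted values.
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# Levels ordered from most to least unequal, so the largest polygon
# is added first and ends up at the back.
gini_by_level <- tapply(wage_demo$salary, wage_demo$Diploma, gini)
draw_order <- names(sort(gini_by_level, decreasing = TRUE))

# One polygon layer per level; a list of layers can be added with `+`.
lorenz_layers <- lapply(draw_order, function(level) {
  stat_lorenz(data = wage_demo[wage_demo$Diploma == level, ],
              geom = "polygon")
})

p_stacked <- ggplot(wage_demo, aes(x = salary, fill = Diploma)) +
  lorenz_layers +
  geom_abline(linetype = "dashed") +
  coord_fixed() +
  theme(legend.title = element_blank()) +
  labs(x = "Cumulative Percentage of Observations",
       y = "Cumulative Percentage of Wages",
       title = "Lorenz curve by diploma")

p_stacked
```

This also avoids hard-coding the number of levels, so the same code should work if the factor gains or loses a level.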

Emir Dakin