
I am quite new to R. I have been working on a long data set for a while and have ended up with two columns (Country and Profit):

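library(dplyr)

# Sum Profit per Country; the country names end up as row names
y <- data.frame(tapply(sub$Profit, sub$Country, sum))
y <- rename(y, Profit = tapply.sub.Profit..sub.Country..sum.)

# Move the country names from the row names into a proper column
y <- cbind(Country = rownames(y), y)
rownames(y) <- 1:nrow(y)

# Sort by ascending profit
y <- y %>% arrange(Profit)
y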

          Country    Profit
1        Slovakia  56264.49
2      Luxembourg  59903.52
3         Ireland 104150.35
4          Sweden 109208.67
5         Finland 137918.93
6          Norway 159719.46
7        Portugal 199447.42
8     Netherlands 214398.10
9     Switzerland 248677.00
10 Czech Republic 286430.06
11        Denmark 305669.83
12        Belgium 316599.95
13         Poland 349640.12
14        Austria 397716.80
15          Italy 433439.35
16          Spain 520474.14
17         France 525408.81
18 United Kingdom 565622.63
19        Germany 643194.62

Now I am trying to plot a bar chart with it, but I am struggling.

graph_country_profit <- ggplot(y, aes(x=Profit,y=Country)) + geom_col(width = 0.5, aes(fill="Profit"))
    
graph_country_profit

But the graph comes out, first, pink and, second, with weird numbers on the axis. How can I fix it, and why does this happen? Would it also be possible to order the bars by increasing or decreasing profit?

Thank you for your time and help!

2 Answers


Your plot is coloured pink because you have quoted Profit inside your aes function. Here is your dataset, recreated:

library(tibble)   # for tibble()
library(ggplot2)  # for ggplot(), aes(), geom_col()

data <- tibble(
  Country = c(
    "Slovakia", "Luxembourg", "Ireland", "Sweden", "Finland", "Norway", "Portugal",
    "Netherlands", "Switzerland", "Czech Republic", "Denmark", "Belgium", "Poland",
    "Austria", "Italy", "Spain", "France", "United Kingdom", "Germany"
  ),
  Profit = c(56264.49, 59903.52, 104150.35, 109208.67, 137918.93, 159719.46, 199447.42,
             214398.10, 248677.00, 286430.06, 305669.83, 316599.95, 349640.12, 397716.80,
             433439.35, 520474.14, 525408.81, 565622.63, 643194.62)
)

Here is me running your code:

ggplot(
  data, 
  aes(
    x=Profit,
    y=Country)
  ) + 
  geom_col(
    width = 0.5, 
    aes(
      fill = "Profit"
    )
  )

[Image: Example 1, the plot produced by the code above]

Let's run it again, but this time with the fill argument unquoted in the aes function:

ggplot(
  data, 
  aes(
    x=Profit,
    y=Country)
  ) + 
  geom_col(
    width = 0.5, 
    aes(
      fill = Profit
    )
  )

[Image: Example 2, the plot produced by the code above]

Now each bar is coloured by profit instead.

This happens because aes() expects a variable from your data to map to the fill colour, and by convention variables are written without quote marks. If you surround the name with quote marks, ggplot does not recognise it as a variable and treats it as a constant string. A constant string carries no information about which bar should be coloured in which way, so ggplot gives every bar the same colour.
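To make the difference concrete, here is a small sketch. It assumes ggplot2 is installed and uses a hypothetical three-row data frame `d` instead of your full data. It builds three versions of the plot and uses layer_data() to look at the fill colours ggplot computed for the bars.

```r
library(ggplot2)

# Hypothetical three-row subset of the data, for illustration only
d <- data.frame(
  Country = c("Slovakia", "Poland", "Germany"),
  Profit  = c(56264.49, 349640.12, 643194.62)
)

# Quoted inside aes(): "Profit" is a constant string, so every bar gets the same fill
p_quoted <- ggplot(d, aes(x = Profit, y = Country)) +
  geom_col(aes(fill = "Profit"))

# Unquoted inside aes(): fill is mapped to the Profit column
p_mapped <- ggplot(d, aes(x = Profit, y = Country)) +
  geom_col(aes(fill = Profit))

# Outside aes(): fill is set to one fixed colour of your choice
p_fixed <- ggplot(d, aes(x = Profit, y = Country)) +
  geom_col(fill = "steelblue")

# Count the distinct fill colours ggplot computed for the bars in each plot
n_quoted    <- length(unique(layer_data(p_quoted)$fill))
n_mapped    <- length(unique(layer_data(p_mapped)$fill))
fixed_fills <- unique(layer_data(p_fixed)$fill)
```

If you only want all bars in a single colour, the third form (fill outside aes) is probably the one to use, since it lets you pick the colour yourself.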

By "weird numbers", I assume you mean the "2e+05". This is scientific notation and just means 2x10^5, or 200000, if you haven't seen it before. The other answer from TarJae shows how to fix this using scale_y_continuous, as well as how to reorder the factors.
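For the plot as you wrote it, with Profit on the x axis, a minimal sketch of both fixes might look like the following. It assumes ggplot2 and the scales package are installed, and the small data frame `d` is hypothetical, so substitute your own Country/Profit data. reorder() is base R and sorts the countries by profit.

```r
library(ggplot2)

# Hypothetical small data frame, deliberately unsorted
d <- data.frame(
  Country = c("Germany", "Slovakia", "Poland"),
  Profit  = c(643194.62, 56264.49, 349640.12)
)

# format() with scientific = FALSE writes the number out in full
plain_label <- format(2e+05, scientific = FALSE)

# reorder(Country, Profit) puts the smallest profit at the bottom of the plot
p_ordered <- ggplot(d, aes(x = Profit, y = reorder(Country, Profit))) +
  geom_col(width = 0.5) +
  scale_x_continuous(labels = scales::comma) +  # plain numbers with thousands separators
  labs(y = "Country")

# The factor levels give the bar order, from bottom to top
bar_order <- levels(reorder(d$Country, d$Profit))
```

Using reorder(Country, -Profit) instead reverses the order, so the largest profit ends up at the bottom.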

Hugh Warden
library(tidyverse)
library(scales)

# df is the Country/Profit data frame (called y in the question)
ggplot(df, aes(y = Profit, x = fct_reorder(Country, Profit), fill = Profit)) +
  geom_col(width = 0.5) +
  coord_flip() +  # horizontal bars
  scale_fill_gradient(low = "blue", high = "red", name = "Trade Value", labels = comma) +
  scale_y_continuous(labels = function(x) format(x, scientific = FALSE)) +  # no scientific notation
  theme_classic()


TarJae