-1

I need help creating a pie chart from mtcars$mpg and mtcars$carb. I want the chart to show the total mpg value for each carburetor count. To explain in more detail: each slice should be labelled with a carburetor value (1, 2, 3, 4, 6, 8) and sized according to the total mpg for that value. I wrote some commands, but how can I build a table from these results, and how should I continue? I am looking for the simplest way to do this. Please help me. Thanks...

> carb1 <- filter(mtcars, carb==1)
> carb2 <- filter(mtcars, carb==2)
> carb3 <- filter(mtcars, carb==3)
> carb4 <- filter(mtcars, carb==4)
> carb6 <- filter(mtcars, carb==6)
> carb8 <- filter(mtcars, carb==8)
> summpg_carb1 <- sum(carb1$mpg)
> summpg_carb2 <- sum(carb2$mpg)
> summpg_carb3 <- sum(carb3$mpg)
> summpg_carb4 <- sum(carb4$mpg)
> summpg_carb6 <- sum(carb6$mpg)
> summpg_carb8 <- sum(carb8$mpg)
src32

3 Answers

2

Summing each of the six values of mtcars$carb in its own line of code does not scale, is error-prone, and is overall bad style. There are a number of ways to aggregate data in R, among them the function aggregate:

aggr <- aggregate(mtcars$mpg, list(mtcars$carb), sum)
print(aggr)
pie(aggr$x, aggr$Group.1)

or the by function (in this particular case even a bit more comprehensive):

b <- by(mtcars$mpg, mtcars$carb, sum)
pie(b, names(b))
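To put percentages on the slices with base R only, a minimal sketch could look like the following. It uses tapply rather than by (a swap made here because tapply returns a plain named array that is easy to do arithmetic on), and the label format is an arbitrary choice for illustration, not part of the answer above.

```r
# Total mpg per carburetor count; the result is named "1", "2", "3", "4", "6", "8"
totals <- tapply(mtcars$mpg, mtcars$carb, sum)

# Share of each group in percent, rounded to one decimal place
perc <- round(100 * totals / sum(totals), 1)

# Build one label per slice, e.g. "<carb> carb: <share>%" (format is an assumption)
slice_labels <- paste0(names(totals), " carb: ", perc, "%")

# Draw the pie with the percentage labels
pie(totals, labels = slice_labels, main = "Total mpg by carburetor count")
```

Because the labels are built from names(totals), the slices for 6 and 8 carburetors keep their real values.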
Bernhard
  • Your answer is really helpful, but what is the role of aggr$Group.1 there? I get the same result without it. And how can I continue the way I did it? because my method was obsessed :) Thank you very much. – src32 Dec 27 '20 at 13:36
  • No, you do not get the same result. The pie looks the same, but the values `6` and `8` are replaced by the numbers `5` and `6`. There is no car with 5 carburetors in the dataset. – Bernhard Dec 27 '20 at 13:42
  • For your more manual approach: first, notice that there are two different functions named `filter`, depending on whether you use `dplyr` or not. The function name changes meaning once you load `dplyr`. Regardless of that, you need to bring all the values you calculated into one vector to call `pie` with. Forming a vector from single values can be done via the `c` function. – Bernhard Dec 27 '20 at 13:45
  • Okay, I understand. The "by" function is simpler. Thank you, you are amazing :). But when I use the "by" function, how can I show percentages on the slices as names? And how can I calculate with a vector? That is actually what I didn't know how to do :/ If you answer those too, I'll pray for you :))) – src32 Dec 27 '20 at 14:04
  • Assuming you used `dplyr` in your calculation above, this should put your results in a vector that works with `pie`: `pie(c(summpg_carb1, summpg_carb2, summpg_carb3, summpg_carb4, summpg_carb6, summpg_carb8), labels = c("one", "two", "three", "four", "six", "eight"))` – Bernhard Dec 27 '20 at 14:10
  • Added the percentage aspect to my other answer, the one using `data.table`. – Bernhard Dec 27 '20 at 14:19
  • You are welcome. However, thanking people in comments is discouraged on stackoverflow. Please read and follow the guidelines here: https://stackoverflow.com/help/someone-answers ;-) – Bernhard Dec 27 '20 at 16:34
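The manual approach discussed in these comments can also be sketched without `dplyr`, which sidesteps the ambiguity around `filter`. This is only an illustration under that assumption: the variable name `sums` is made up here, and logical subsetting replaces the `filter` calls from the question.

```r
# One sum per carburetor value, using base R subsetting instead of filter()
sums <- c(
  sum(mtcars$mpg[mtcars$carb == 1]),
  sum(mtcars$mpg[mtcars$carb == 2]),
  sum(mtcars$mpg[mtcars$carb == 3]),
  sum(mtcars$mpg[mtcars$carb == 4]),
  sum(mtcars$mpg[mtcars$carb == 6]),
  sum(mtcars$mpg[mtcars$carb == 8])
)

# Name the elements so that pie() labels the slices 1, 2, 3, 4, 6, 8
# rather than falling back to the positions 1 to 6
names(sums) <- c(1, 2, 3, 4, 6, 8)

# pie() uses names(x) as labels when no labels argument is given
pie(sums)
```

Naming the vector is what keeps the slices for 6 and 8 carburetors from being labelled 5 and 6.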
2

Using ggplot2 and plotly:

# Install packages if they are not already installed: necessary_packages => vector
necessary_packages <- c("ggplot2", "plotly")

# Create a vector containing the names of any packages needing installation:
# new_packages => vector
new_packages <- necessary_packages[!(necessary_packages %in%
                                       installed.packages()[, "Package"])]

# If the vector has more than 0 values, install the new packages
# (and their associated dependencies):
if(length(new_packages) > 0){install.packages(new_packages, dependencies = TRUE)}

# Initialise the packages in the session: list of boolean => stdout (console)
lapply(necessary_packages, require, character.only = TRUE)

# Aggregate the data.frame: 
agg_df <- transform(aggregate(mpg ~ carb, mtcars, sum),
                    carb = as.factor(paste(
                      carb, paste0(round(prop.table(mpg), 4) * 100, "%"),
                      sep = " - "
                    )))

# Chart aggregated data.frame: 
ggplot(agg_df, aes(x = "", y = mpg, fill = carb)) +
  geom_bar(width = 1, stat = "identity") +
  scale_fill_viridis_d(option = "viridis") +
  coord_polar("y", start = 0) +
  ylab("") +
  xlab("Total MPG") +
  ggtitle("Total MPG by Carburetor") +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.border = element_blank(),
    panel.background = element_blank()
  ) 

# Plotly chart: 
plot_ly(aggregate(mpg ~ carb, mtcars, sum),
        labels = ~sort(carb), values = ~mpg, type = "pie",
        marker = list(colors=c("#440154FF", "#404788FF", "#2E6E8EFF", "#20A486FF", "#44BF70FF",
                               "#FDE725FF", "#20A387FF")),
        textinfo = "label+percent",
        textposition = "outside") %>% 
  layout(title = "Total MPG by Carburetor")
hello_friend
  • +1: I like how careful your code is about installing libraries that might already be there and how you import only the libraries needed. – Bernhard Dec 27 '20 at 16:41
  • @Bernhard thanks for the compliment, I like how straightforward your answer is (and also that it is Base R (+1)). – hello_friend Dec 28 '20 at 04:34
0

These kinds of questions tend to get three different answers on stackoverflow: one for standard R, one for R enhanced via dplyr, and one for R enhanced via the data.table package. The first dplyr reaction was the comment by stefan. For completeness, this is a data.table answer to round things off. data.table tends to deliver the shortest code, for people who are into that, and often the fastest-running code, which is of no value for small datasets like mtcars.

library(data.table)

mtcars.dt <- data.table(mtcars)
aggr <- mtcars.dt[,sum(mpg), carb][order(carb),]
pie(x = aggr$V1, labels = aggr$carb)
aggr

Your "manual" approach above, translated to data.table and written as a complete example, could look like this:

library(data.table)
mtcars.dt <- data.table(mtcars)
aggr <- c( mtcars.dt[carb == 1, sum(mpg)],
           mtcars.dt[carb == 2, sum(mpg)],
           mtcars.dt[carb == 3, sum(mpg)],
           mtcars.dt[carb == 4, sum(mpg)],
           mtcars.dt[carb == 6, sum(mpg)],
           mtcars.dt[carb == 8, sum(mpg)]) 
perc <- 100 * round(aggr / sum(aggr), 3)
           
pie(aggr, labels = paste(perc, "%"), col = rainbow(6))
legend("topright", fill = rainbow(6), legend = c(1, 2, 3, 4, 6, 8), title = "carb")
Bernhard