(in R) how to arrange column A based on column B

Question

I have a data frame like this:

Factory	Bread
A	a
A	a
B	c
B	b
B	d
C	a
D	e

I want to find the name of the factory with the largest number of breads.

I wrote two pieces of code and got different answers.

1.

df %>%
  group_by(factory, bread) %>%
  summarise(n = n()) %>%
  arrange(desc(n))

2.

df %>%
  group_by(factory) %>%
  mutate(number = length(unique(bread))) %>%
  arrange(desc(number))

May I ask which one is correct, and why?

Thank you!!!!

Do you want `df %>% group_by(factory) %>% summarise(n = n_distinct(bread))`? – akrun, Aug 08 '21 at 20:30
It works!!!! But now I have three different results for this question. May I ask whether there are problems with the code I wrote before? – island1996, Aug 08 '21 at 20:33
Your second code is similar to mine, i.e. `length(unique(bread))` is equivalent to `n_distinct(bread)`, but you created it as a column with `mutate`, whereas I summarised to a single row per group. The first code in your post gives the count of each factory/bread combination. – akrun, Aug 08 '21 at 20:39
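
For readers comparing the three snippets, the sketch below runs them side by side on the question's data. The data frame is rebuilt by hand with lowercase column names (an assumption, chosen to match the code in the question), and the names `per_combination`, `per_row`, and `per_factory` are illustrative only.

```r
library(dplyr)

# Assumed reconstruction of the question's data (lowercase names to match the code).
df <- data.frame(
  factory = c("A", "A", "B", "B", "B", "C", "D"),
  bread   = c("a", "a", "c", "b", "d", "a", "e")
)

# Attempt 1: one row per factory/bread pair; n counts rows in each pair,
# not the number of different breads per factory.
per_combination <- df %>%
  group_by(factory, bread) %>%
  summarise(n = n(), .groups = "drop") %>%
  arrange(desc(n))

# Attempt 2: the distinct count is computed per factory, but mutate()
# keeps every original row, so each factory is repeated.
per_row <- df %>%
  group_by(factory) %>%
  mutate(number = length(unique(bread))) %>%
  ungroup() %>%
  arrange(desc(number))

# Suggested in the comments: summarise() collapses to one row per factory.
per_factory <- df %>%
  group_by(factory) %>%
  summarise(n = n_distinct(bread)) %>%
  arrange(desc(n))

print(per_combination)
print(per_row)
print(per_factory)
```

On this data, attempt 1 puts factory A first because the A/a pair occurs twice, while the other two put B first with three kinds of bread. That is likely why the two attempts appeared to disagree.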

score 3 · Answer 1 · answered Aug 08 '21 at 20:30

We could use `n_distinct` from the dplyr package:

library(dplyr)
df %>%
    group_by(factory) %>%
    summarise(bread = n_distinct(bread))

Output:

  factory bread
  <chr>   <int>
1 A           1
2 B           3
3 C           1
4 D           1
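
To go from the per-factory counts to the single factory name the question asks for, one option is `slice_max()` followed by `pull()`. This is a minimal sketch, assuming dplyr 1.0.0 or later and a hand-built copy of the question's data with lowercase column names; `top_factory` is an illustrative name.

```r
library(dplyr)

# Assumed reconstruction of the question's data (lowercase names to match the code).
df <- data.frame(
  factory = c("A", "A", "B", "B", "B", "C", "D"),
  bread   = c("a", "a", "c", "b", "d", "a", "e")
)

top_factory <- df %>%
  group_by(factory) %>%
  summarise(bread = n_distinct(bread)) %>%
  slice_max(bread, n = 1) %>%  # keeps all tied factories by default
  pull(factory)

print(top_factory)
```

Because `slice_max()` keeps ties by default, more than one name can come back if several factories share the highest count.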

TarJae

Thanks, TarJae! But may I ask why both pieces of code I wrote are wrong? – island1996 Aug 08 '21 at 20:34
`n_distinct` is best for these purposes. Your code is not wrong! – TarJae Aug 08 '21 at 20:41

score 1 · Answer 2 · answered Aug 08 '21 at 20:52

A data.table option

> library(data.table)
> setorder(setDT(df)[, .(Bread = uniqueN(Bread)), Factory], -Bread)[]
   Factory Bread
1:       B     3
2:       A     1
3:       C     1
4:       D     1
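
Note that `setDT()` converts `df` in place. A variant that leaves the original data frame untouched and returns only the top factory name could look like the sketch below; it assumes the capitalised column names used in this answer and a hand-built copy of the question's data, and `top_factory` is an illustrative name.

```r
library(data.table)

# Assumed reconstruction of the question's data (capitalised names as in this answer).
df <- data.frame(
  Factory = c("A", "A", "B", "B", "B", "C", "D"),
  Bread   = c("a", "a", "c", "b", "d", "a", "e")
)

# as.data.table() makes a copy, so df itself stays a plain data.frame.
counts <- as.data.table(df)[, .(Bread = uniqueN(Bread)), by = Factory][order(-Bread)]

# First row after sorting holds the factory with the most distinct breads.
top_factory <- counts[1, Factory]

print(top_factory)
```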

ThomasIsCoding
