I've searched several threads but have yet to find a solution.
I have a geom_bar plot with 40+ variables. I have created a separate df to tag each variable according to a specific category, and assigned a colour to the category. Across 40+ variables, there are 4 colours/categories included in the plot.
I would like the legend of the plot to show the colours of the categories, not the individual variables. I know I can accomplish this by adding the colours/categories to the original df. However, I would like to reuse the colour/category reference df across many different projects, without having to add tag and colour columns to every plotting df.
Here is an example where df is the data plotted, and df_cols is akin to my category/colour df. Ideally the legend would show "A = red, B = blue, C = orange" and not the variable names.
library(ggplot2)

variable = c("abc", "def", "ghi", "jkl", "mno", "pqr", "stu")
tag = c("A", "B", "C", "A", "B", "A", "B")
colours = c("red", "blue", "orange", "red", "blue", "red", "blue")
# Create colour reference df
df_cols = data.frame(variable, tag, colors = colours)
# Named colour vector: one colour per variable, named by variable
cols = as.character(df_cols$colors)
names(cols) = as.character(df_cols$variable)
# Plotting df
df = data.frame(variable, value = 1:7)
ggplot(df) +
  geom_bar(aes(x = variable, y = value, fill = variable), stat = "identity") +
  scale_fill_manual(values = cols)
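
To make the desired result concrete, here is a minimal sketch of one way the legend described above might be produced without adding columns to the plotting df. It is an assumption, not a confirmed solution: the helper names tag_lookup, tag_pairs and tag_cols are hypothetical, and it assumes every variable in df appears exactly once in df_cols and that each tag maps to a single colour.

```r
library(ggplot2)

# Colour/category reference df (same shape as df_cols in the example above)
df_cols <- data.frame(
  variable = c("abc", "def", "ghi", "jkl", "mno", "pqr", "stu"),
  tag      = c("A", "B", "C", "A", "B", "A", "B"),
  colors   = c("red", "blue", "orange", "red", "blue", "red", "blue"),
  stringsAsFactors = FALSE
)

# Plotting df, left without any tag or colour columns
df <- data.frame(variable = df_cols$variable, value = 1:7)

# Hypothetical helpers (assumed names): variable -> tag, and tag -> colour
tag_lookup <- setNames(df_cols$tag, df_cols$variable)
tag_pairs  <- unique(df_cols[, c("tag", "colors")])
tag_cols   <- setNames(tag_pairs$colors, tag_pairs$tag)

# Fill by the looked-up tag instead of the variable, so the legend
# lists the categories rather than the individual variables
p <- ggplot(df) +
  geom_col(aes(x = variable, y = value,
               fill = unname(tag_lookup[as.character(variable)]))) +
  scale_fill_manual(values = tag_cols, name = "tag")
print(p)
```

The lookup happens inside aes(), so df itself is never modified and df_cols could be loaded from a shared file in each project.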