Gadfly does not seem to use the (level) order of categorical variables:

using CSV
using DataFrames
using Gadfly
using HTTP

url = "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv"

tips = CSV.File(HTTP.get(url).body) |> DataFrame
categorical!(tips, :day)
ordered!(tips.day, true)
levels!(tips.day, ["Thur", "Fri", "Sat", "Sun"])

Gadfly.plot(tips, x=:day, y=:total_bill, color=:smoker, Geom.boxplot)

Should the plot not inherit the order specified in the categorical variable?

I found a way to order the categorical values, but it feels a little 'buggy' because the order has to be specified a second time:

Gadfly.plot(tips, x=:day, y=:total_bill, color=:smoker, Geom.boxplot,
    Scale.x_discrete(levels=levels(tips.day)))

Any suggestions on how to solve this?

René

1 Answer

In Gadfly, the order of the values on a discrete x scale is determined by the order in which they appear in the DataFrame, so the level order of the CategoricalArray is currently not used. This might not be supported in the future either, because DataFrames plans to drop its dependency on CategoricalArrays (https://github.com/JuliaData/DataFrames.jl/issues/2321).
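If the axis order really follows the row order, one possible workaround is to sort the rows by the categorical column before plotting, so that each day first appears in level order. The following is only a sketch under that assumption: the small `df` and its values are made up for illustration and stand in for the tips data, and whether sorting alone is enough may depend on the Gadfly version.

```julia
using CategoricalArrays
using DataFrames
using Gadfly

# Hypothetical stand-in for the tips data (values chosen for illustration only).
df = DataFrame(day = ["Sun", "Sat", "Thur", "Fri", "Sun", "Thur"],
               total_bill = [16.99, 20.65, 27.20, 12.46, 10.34, 22.76])

# Turn :day into an ordered categorical column with the desired level order.
df.day = categorical(df.day)
levels!(df.day, ["Thur", "Fri", "Sat", "Sun"])
ordered!(df.day, true)

# Sorting an ordered categorical column follows the level order, so the
# first appearance of each day in the rows now matches the levels.
sort!(df, :day)

# Assumption: Gadfly places discrete x values in order of appearance.
p = Gadfly.plot(df, x=:day, y=:total_bill, Geom.boxplot)
```

This avoids repeating the level list in the plot call, at the cost of reordering the rows of the DataFrame. Passing `Scale.x_discrete(levels=levels(tips.day))`, as in the question, leaves the data untouched.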

Mattriks
  • I don't see how whether DataFrames depends on CategoricalArrays matters for Gadfly. See the similar issue for Plots.jl: https://github.com/JuliaData/CategoricalArrays.jl/issues/256 – Milan Bouchet-Valat Nov 08 '20 at 21:38