I am new to Julia and was wondering how to pass labels to the histogram function in the Plots.jl package.

using Plots
gr()
histogram(
    data[:sentiment_labels],
    title = "Histogram of sentiment labels",
    xlabel = "Sentiment",
    ylabel = "count",
    label = ["Negative" "Positive" "Neutral"],
    fillcolor = [:coral,:dodgerblue,:slategray]
)

Only the first label, "Negative", appears in the plot.

[image: resulting histogram, with only "Negative" shown in the legend]

imantha

1 Answer

The short answer is that there is only one label in your plot because there is only one data series in it: your histogram call plots a single data series, which has a single label attached to it. It might seem odd that you get multiple colours but only one legend entry, so I'll break down why that happens, as it's instructive and, I believe, a frequent source of confusion for Plots.jl users.

It is a bit of a coincidence that you are getting three different colours for the bars you are plotting. What happens here is that you are providing a single colour argument (a vector of three colours) that is cycled through for the bars of the one series in the histogram. You can see this if you provide more colours to your histogram call:

using Plots

sentiment_labels = [fill(-1, 200); fill(0, 700); fill(1, 100)]

histogram(
    sentiment_labels,
    fillcolor = [:coral, :red, :green, :dodgerblue, :slategray]
)

gives:

[image: resulting plot]

What's happening here? We have provided five colours, and it turns out that your histogram only has a bar every five increments (there are bins between -1, 0, and 1, it's just that there are zero observations in those bins). Therefore every fifth bar has the same colour, and with the zero bars disappearing, we only end up with one colour visible in the plot.
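The cycling described above can be checked with a little arithmetic. This is only a sketch under stated assumptions: it assumes bins of width 0.2 starting at -1 (consistent with "a bar every five increments", but not read from Plots.jl itself), and bin_index and bar_colour are hypothetical helper names, not part of Plots.jl.

```julia
colours = [:coral, :red, :green, :dodgerblue, :slategray]

# Assumption: bins of width 0.2 starting at -1. The index is computed in
# integer tenths to avoid floating-point rounding.
bin_index(x) = div(10 * x + 10, 2) + 1

# Colours are cycled over the bins, so bin i gets colour mod1(i, 5).
bar_colour(x) = colours[mod1(bin_index(x), length(colours))]

for x in -1:1
    println(x, " falls in bin ", bin_index(x), " and gets colour ", bar_colour(x))
end
```

Under these assumptions the three non-empty bins are numbers 1, 6 and 11, which all map to the first colour in the cycle.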

Another way of seeing this is having data that's more continuous than your sentiment labels:

cont_data = rand(1_000)

histogram(
    cont_data,
    fillcolor = [:coral, :red, :green, :dodgerblue, :slategray]
)

gives:

[image: resulting plot]

So there is actually only one colour argument being passed in here. The crucial difference between the labels and the colours in your histogram call is that the labels are a row vector (a 1×3 matrix), while the colours are a column vector:

julia> ["Negative" "Neutral" "Positive"]
1×3 Array{String,2}:
 "Negative"  "Neutral"  "Positive"

julia> [:coral, :slategrey, :dodgerblue]
3-element Array{Symbol,1}:
 :coral
 :slategrey
 :dodgerblue

Plots will interpret the first of these as applying to three different series ("Negative" is the label for the first series, "Neutral" for the second, "Positive" for the third), while it interprets the second as applying to one series only (so :coral, :slategrey, :dodgerblue are all colours for the first series passed in). This is quite a subtle distinction in Plots.jl, which often trips people up (me included!).
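The shape difference is easy to check in plain Julia, without Plots loaded. The sketch below uses illustrative variable names; permutedims is the Base function that turns a column vector into a 1×N row, which is one way to convert an existing vector of colours into the "one entry per series" form.

```julia
labels_row  = ["Negative" "Neutral" "Positive"]   # spaces: 1×3 Matrix
colours_col = [:coral, :slategrey, :dodgerblue]   # commas: 3-element Vector

println(size(labels_row))    # (1, 3)
println(size(colours_col))   # (3,)

# Convert the column vector into a row, i.e. one colour per series.
colours_row = permutedims(colours_col)
println(size(colours_row))   # (1, 3)
```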

To get three labels, you should therefore have three series for which you plot histograms. One way of doing this is to split your vector of sentiment labels into three vectors:

histogram(
    [filter(x -> x == y, sentiment_labels) for y ∈ -1:1],
    fillcolor = [:coral :slategray :dodgerblue],
    label = ["Negative" "Neutral" "Positive"]  # same order as the series, -1:1
)

gives:

[image: resulting plot]
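To see what the comprehension in the call above hands to histogram, it can be evaluated on its own. A small sketch using the same example data as earlier in this answer; series is just an illustrative name:

```julia
sentiment_labels = [fill(-1, 200); fill(0, 700); fill(1, 100)]

# One vector per distinct value, in the order -1, 0, 1.
series = [filter(x -> x == y, sentiment_labels) for y in -1:1]

println(length(series))    # 3 series, so three legend entries
println(length.(series))   # [200, 700, 100]
```

Because the outer container holds three vectors, Plots treats them as three series, and row-shaped attributes such as label are matched to them in that order.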

Although I would probably argue that in your case a histogram isn't the right tool - if your labels are only ever going to be negative, neutral and positive, a simple bar chart will do, as you don't need the automatic binning functionality that a histogram provides. So I would probably do:

bar(
    # one x position and one single-element y vector per series
    [-1 0 1], [[sum(sentiment_labels .== x)] for x ∈ -1:1],
    title = "Count of sentiment labels",
    xlabel = "Sentiment",
    ylabel = "count",
    label = ["Negative" "Neutral" "Positive"],  # same order as the series, -1:1
    fillcolor = [:coral :slategray :dodgerblue],
    linecolor = [:coral :slategray :dodgerblue],
    xticks = -1:1
)

to get:

[image: resulting plot]
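The bar heights are just per-value counts, so they can be computed and inspected before plotting. A minimal sketch, assuming the coding -1 = negative, 0 = neutral, 1 = positive (an assumption about the data, matching the example vector used in this answer); names and counts are illustrative variable names:

```julia
sentiment_labels = [fill(-1, 200); fill(0, 700); fill(1, 100)]

# Assumed coding of the values; adjust to match your own data.
names = ["Negative", "Neutral", "Positive"]

# count(==(x), v) counts the elements equal to x without building a temporary array.
counts = [count(==(x), sentiment_labels) for x in -1:1]

for (name, n) in zip(names, counts)
    println(name, ": ", n)
end
```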

Nils Gudat
  • If I had only one data series but I want to change that 'y1' to something else, what do I need to change? legend does not seem to work on histogram. – newtothis Sep 27 '21 at 16:56