ALL software version info

Python 3.7.4 on an iMac (21.5-inch, 2017), using IDLE.

Description of expected behavior and the observed behavior

The problem: with the same data and bins=20, Matplotlib and HoloViews (via hvPlot) produce different bin distributions.

Screen Capture

Complete, minimal, self-contained example code that reproduces the issue

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine

wine = load_wine()
print("Feature Names : ", wine.feature_names)
print("\nTarget Names : ", wine.target_names)

wine_df = pd.DataFrame(wine.data, columns=wine.feature_names)
wine_df["Target"] = wine.target
wine_df["Target"] = ["Class_1" if typ == 0 else "Class_2" if typ == 1 else "Class_3" for typ in wine_df["Target"]]
print("\nDataset Size : ", wine_df.shape)
print(wine_df.head())

Target1 = wine_df.query('Target == "Class_1"')
Target2 = wine_df.query('Target == "Class_2"')
Target3 = wine_df.query('Target == "Class_3"')

x = Target1['proline']
y = Target2['proline']
z = Target3['proline']

plt.hist(x, bins=20, histtype='bar', color='blue', alpha=0.7, label='Class_1')
plt.hist(y, bins=20, histtype='bar', color='red', alpha=0.7, label='Class_2')
plt.hist(z, bins=20, histtype='bar', color='orange', alpha=0.7, label='Class_3')
plt.xlabel('proline')
plt.ylabel('Frequency')
plt.title('Malic Acid Distribution')
plt.legend(frameon=False)
plt.tight_layout()
plt.savefig("Test", dpi=300)
plt.show()


import holoviews as hv
hv.extension('bokeh')
from bokeh.plotting import show
from holoviews import dim, opts
import hvplot.pandas

hist = wine_df.hvplot.hist(y="proline", by="Target", width=600, height=400, ylim=(0, 16), alpha=0.7, bins=20, ylabel="Frequency", title="Malic Acid Distribution")
show(hv.render(hist))
Anton Menshov
mmolet
  • Holoviews is doing all this in one call and hence the 20 bins are spread among the full data set. For matplotlib you are calling hist 3 times so the bins are being set independently. Note that matplotlib hist also allows all three data sets to be passed in at once. – Jody Klymak Jul 12 '21 at 15:07
  • If that doesn’t work you can manually specify the bin edges – Jody Klymak Jul 12 '21 at 15:08
  • A potential solution: plt.hist(np.array([x, y, z],dtype=object), bins=20, histtype='step', stacked=False, fill=True,color=['blue', 'red', 'orange'],alpha=0.7,label=['Class_1', 'Class_2', 'Class_3']) – mmolet Jul 12 '21 at 15:54
  • Please check out how to write questions on Stack Overflow. It's not the same as writing bug reports on GitHub. Explain more about what you're trying to achieve and where you are stuck. – Cornelius Roemer Jul 13 '21 at 00:57
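
The suggestion in the comments to specify the bin edges manually could be sketched roughly as follows. This is only an illustration, not the asker's or the commenters' exact method: `shared_bin_edges` is a hypothetical helper, and the short lists are made-up stand-ins for the three proline series. It assumes the goal is for every class to be counted against one set of edges spanning the pooled data, which is how the first comment describes the HoloViews result.

```python
import numpy as np


def shared_bin_edges(groups, bins=20):
    """Return bin edges computed over the pooled range of all groups.

    Hypothetical helper for illustration; not part of matplotlib or hvPlot.
    """
    pooled = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    return np.histogram_bin_edges(pooled, bins=bins)


# Made-up stand-ins for the three proline series (x, y, z in the question).
x = [1000.0, 1100.0, 1250.0, 1500.0]
y = [300.0, 450.0, 500.0, 700.0]
z = [400.0, 600.0, 650.0, 850.0]

edges = shared_bin_edges([x, y, z], bins=20)

# Each group is counted against the same edges, so bins line up across groups.
counts = {
    name: np.histogram(values, bins=edges)[0]
    for name, values in (("Class_1", x), ("Class_2", y), ("Class_3", z))
}
```

With the real data, the same `edges` array can be passed as `bins=edges` to each of the three `plt.hist` calls in place of `bins=20`, since `plt.hist` accepts a sequence of bin edges.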

0 Answers