I'm currently analyzing earthworm count data collected in an on-field experiment. My variables are the style of site (experimental site vs. reference site), the year the data were collected, the crop that was cultivated and the number of earthworms. I use groupby() to group the earthworm counts by style, year and crop, and display the result in a bar chart.

The two groups of bars in the plot are too close together though, so I would like to increase spacing between them. How do I do that without changing the width?

I've tried to set up some code for it (at the end) but I always get the error:

"TypeError: bar() missing 1 required positional argument: 'height'"

However, when I define height, I just get big blue bars over my other groups of bars.

Here's my code:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np



data = {'Style': ["Experiment","Reference", "Experiment", "Reference", "Experiment","Reference",
              "Experiment", "Reference"],
        'Year': ["2021", "2021","2022","2022", "2021","2021", "2022", "2022"],
        'Crop': ["Rapeseed", "Rapeseed", "Rapeseed", "Rapeseed",
             "Maize", "Maize", "Maize", "Maize"],
        'Earthworms': [55, 2, 2,6,0,1,7,22]
       }

df = pd.DataFrame(data)

#Set graph properties
fig, ax = plt.subplots(figsize=(15,7))
colors = {"Maize": "#de8f05", "Rapeseed":"#d7bb19"}         
labels = list(colors.keys())


#Create yerr variable
yerr = [10.6926766215636, 1.4142135623731, 0.577350269189626,1.414213562, 0,
       0.707106781186548, 2.857738033, 4.43471156521669]

yerr = np.array(yerr).reshape(2,4)

#Group by Year, Style, Crop (ind. variables) and sum Earthworms (dep. variable)
df = df.groupby(["Year", "Style", "Crop"])["Earthworms"].sum().unstack().plot.bar(ax=ax, color=colors, yerr=yerr, width=0.9)


#Assign labels, axis ticks + limit, hide spines  

plt.ylabel("N", size=13, labelpad=10)
plt.yticks(fontsize=12)
plt.xticks(fontsize=12)
ax.set(xlabel=None)

plt.ylim(0,60)    
ax.spines.right.set_visible(False)
ax.spines.top.set_visible(False)

#Create space between two groups of bars
#n_groups = 2
#index = np.arange(n_groups)

#ax.bar(index, height)
Laura
  • What do you mean by "two groups of bars in the plot are too close together"? Are you looking to add space between Maize and Rapeseed bars? – Redox Aug 22 '23 at 11:05
  • No, I want to add space between the different clusters of bars - between the "Experiment 2021" and "Reference 2021" bar groups for instance. If you run the code I think it'll make sense. – Laura Aug 22 '23 at 14:41
  • @Laura FYI I have added a solution if you are still interested in one for this question – John Collins Sep 02 '23 at 06:53

1 Answer

To stay within pandas plotting, I found that simply removing the width parameter from the plot call (so that pandas falls back to its default of 0.5) leaves noticeably more space between the groups, although the bars themselves become narrower:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


data = {
    "Style": [
        "Experiment",
        "Reference",
        "Experiment",
        "Reference",
        "Experiment",
        "Reference",
        "Experiment",
        "Reference",
    ],
    "Year": ["2021", "2021", "2022", "2022", "2021", "2021", "2022", "2022"],
    "Crop": [
        "Rapeseed",
        "Rapeseed",
        "Rapeseed",
        "Rapeseed",
        "Maize",
        "Maize",
        "Maize",
        "Maize",
    ],
    "Earthworms": [55, 2, 2, 6, 0, 1, 7, 22],
}

df = pd.DataFrame(data)

# Set graph properties
fig, ax = plt.subplots(figsize=(15, 7))
colors = {"Maize": "#de8f05", "Rapeseed": "#d7bb19"}
labels = list(colors.keys())


# Create yerr variable
yerr = [
    10.6926766215636,
    1.4142135623731,
    0.577350269189626,
    1.414213562,
    0,
    0.707106781186548,
    2.857738033,
    4.43471156521669,
]

yerr = np.array(yerr).reshape(2, 4)

# Group by Year, Style and Crop (ind. variables) and sum Earthworms (dep. variable).
# plot.bar returns the Axes, so assign it to ax and keep df as the original DataFrame.
ax = (
    df.groupby(["Year", "Style", "Crop"])["Earthworms"]
    .sum()
    .unstack()
    .plot.bar(ax=ax, color=colors, yerr=yerr)
)


# Assign labels, axis ticks + limit, hide spines
plt.ylabel("N", size=13, labelpad=10)
plt.yticks(fontsize=12)
plt.xticks(fontsize=12)
ax.set(xlabel=None)

plt.ylim(0, 60)

ax.spines.right.set_visible(False)
ax.spines.top.set_visible(False)

[Image: Pandas groupby bar chart]
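
If you would rather keep the bar width and only widen the gap between clusters, one option is to draw the bars with matplotlib directly and choose the cluster positions yourself. Calling ax.bar(index, height) on top of the pandas plot just draws an extra set of bars in the default colour, which is presumably where the big blue bars came from. Below is a minimal sketch, assuming a hypothetical group_spacing of 1.5 (pandas places the clusters 1.0 apart) and leaving out the error bars; the names counts, bar_width, group_spacing and centres are mine, not from the question:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same data as in the question, written more compactly.
data = {
    "Style": ["Experiment", "Reference"] * 4,
    "Year": ["2021", "2021", "2022", "2022"] * 2,
    "Crop": ["Rapeseed"] * 4 + ["Maize"] * 4,
    "Earthworms": [55, 2, 2, 6, 0, 1, 7, 22],
}
counts = (
    pd.DataFrame(data)
    .groupby(["Year", "Style", "Crop"])["Earthworms"]
    .sum()
    .unstack()
)

colors = {"Maize": "#de8f05", "Rapeseed": "#d7bb19"}
bar_width = 0.45     # width=0.9 shared by two crops, as in the question
group_spacing = 1.5  # hypothetical value: distance between cluster centres

# One centre per (Year, Style) cluster, spread further apart than the default 1.0.
centres = np.arange(len(counts)) * group_spacing
n_crops = len(counts.columns)

fig, ax = plt.subplots(figsize=(15, 7))
for i, crop in enumerate(counts.columns):
    # Shift each crop's bars so the cluster stays centred on its tick.
    offset = (i - (n_crops - 1) / 2) * bar_width
    ax.bar(centres + offset, counts[crop], width=bar_width,
           color=colors[crop], label=crop)

ax.set_xticks(centres)
ax.set_xticklabels([f"({year}, {style})" for year, style in counts.index],
                   rotation=90)
ax.legend(title="Crop")
```

The bar width stays at 0.45 in data units. Because the x-axis now spans a wider range, the bars will look slightly narrower on screen unless the figure is widened as well. Error bars can be passed per crop through the yerr argument of ax.bar, and plt.show() or fig.savefig(...) displays or saves the result.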

Alternatively, you might be interested in using plotly (no more difficult to install than pandas: pip install plotly).

For example:

import plotly.graph_objects as go

df = df.groupby(["Year", "Style", "Crop"])["Earthworms"].sum().unstack()
x = [[x[n] for x in df.index.values] for n in range(2)]

fig = go.Figure()
fig.add_bar(x=x, y=df.Maize, name="Maize", marker_color="#de8f05")
fig.add_bar(x=x, y=df.Rapeseed, name="Rapeseed", marker_color="#d7bb19")
fig.update_layout(barmode="group")
fig.show()

This results in a more professional-looking visualization:

[Image: Plotly version of same bar chart]

Error bars can be added too: pass error_y=dict(type='data', array=...) to each add_bar call, with the error values for that crop (see the plotly docs on error bars).

John Collins