
I am not sure why this happens. It may be a simple mistake that I cannot see, but when I use this code:

for filename in glob.glob('/Users/jacob/Desktop/MERS/new/NOT COAL/gensets/statistics_per_lgu/per_lgu_files/*.csv'):
    base = os.path.basename(filename)
    name = os.path.splitext(base)[0]
    df = pd.read_csv(filename)

    # Show 4 different binwidths
    for i, binwidth in enumerate([10, 20, 30, 40]):
        # Set up the plot
        ax = plt.subplot(2, 2, i + 1)

        plt.subplots_adjust( wspace=0.5, hspace=0.5)

        # Draw the plot
        ax.hist(df['New Capacity based on 0.8 PF'], bins=binwidth,
                color='red', edgecolor='black',alpha=0.5)

        # Title and labels
        ax.set_title('Histogram with Binwidth = %d' % binwidth, size=10)
        ax.set_xlabel('Capacity', size=11)
        ax.set_ylabel('Frequency count', size=11)

        ax.axvline(x=df['New Capacity based on 0.8 PF'].median(), linestyle='dashed', alpha=0.3, color='blue')
        min_ylim, max_ylim = plt.ylim()
        ax.text(x=df['New Capacity based on 0.8 PF'].median(),y= max_ylim*0.9, s='Median', alpha=0.7, color='blue',fontsize = 12)

        ax.axvline(x=df['New Capacity based on 0.8 PF'].mean(), linestyle='dashed', alpha=0.9, color='green')
        min_ylim, max_ylim = plt.ylim()
        ax.text(x=df['New Capacity based on 0.8 PF'].mean(),y= max_ylim*0.5, s='Mean', alpha=0.9, color='green',fontsize = 12)

        plt.tight_layout()
        plt.grid(True)
        plt.savefig('/Users/jacob/Documents/Gensets_gis/historgrams/per_lgu_files/{}.png'.format(name))

Every file it creates looks like the attached photo.

[Image: attached photo of one histogram output]

Any ideas as to what I've done wrong? Thanks in advance.

My desired result would be something like this.

[Image: desired output]

meteo_96
  • You have wrong indentation, and you save the file 4 times with the same name. You should save it once, after you leave the inner `for`-loop. – furas Sep 21 '19 at 04:32
  • After looking at the code and the image I don't understand what the problem is. Every plot shows a histogram with a different number of bins and the same mean and median, because they all use the same data from the DataFrame. You could even calculate the median and mean before the loop. What is wrong with the attached image? What result did you expect? – furas Sep 21 '19 at 04:41
  • @furas This was the problem: my for loop should save each file with a different histogram; instead, it places all histograms in the figure for each file it creates. – meteo_96 Sep 21 '19 at 04:54
  • If you want to save every histogram in a different file then you don't have to create subplots for the histograms, but you have to use different file names. Currently you use the same name for all histograms. – furas Sep 21 '19 at 04:58
  • You could create a single plot before the `for`-loop with `ax = plt.subplot(1, 1, 1)` or `ax = plt.subplot()`, and clear it with `ax.clear()` before you draw each histogram. As for the files, you could use `i` to create unique names: `'{}-{}.png'.format(name, i)` – furas Sep 21 '19 at 05:10
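
The single-plot approach from the last comment could be sketched roughly as below, assuming each bin count should end up in its own file. The helper name `save_one_file_per_histogram`, the generated data, and the temporary output directory are hypothetical, and the `Agg` backend is chosen only so the sketch runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import numpy as np


def save_one_file_per_histogram(data, name, out_dir, bin_counts=(10, 20, 30, 40)):
    """Reuse one axes, clear it before each histogram, save under unique names."""
    fig, ax = plt.subplots()  # a single plot, created once before the loop
    paths = []
    for i, bins in enumerate(bin_counts):
        ax.clear()  # remove the previous histogram
        ax.hist(data, bins=bins, color='red', edgecolor='black', alpha=0.5)
        ax.set_title('Histogram with %d bins' % bins, size=10)
        # Use i to build a unique file name, as suggested in the comment
        path = os.path.join(out_dir, '{}-{}.png'.format(name, i))
        fig.savefig(path)
        paths.append(path)
    plt.close(fig)
    return paths


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # hypothetical stand-in for the CSV column
    values = rng.integers(0, 1000, size=100)
    print(save_one_file_per_histogram(values, 'example_data_0', tempfile.mkdtemp()))
```

This writes four images per input, one for each bin count, rather than one 2 by 2 grid.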

1 Answer

`plt.subplot(2, 2, i + 1)` doesn't create new subplots on every call. It reuses the ones that already exist in the current figure, so each new histogram is drawn on top of the old ones. You have to clear the subplot before you draw a new histogram.

ax = plt.subplot(2, 2, i + 1)
ax.clear()

Example code. It gives the desired output, but if you remove `ax.clear()` then the first image will be OK, while the second and third images will show the new plots drawn over the old ones.

import os
import pandas as pd
import matplotlib.pyplot as plt
import random

for n in range(3):
    filename = f'example_data_{n}.csv'
    base = os.path.basename(filename)
    name = os.path.splitext(base)[0]

    df = pd.DataFrame({'New Capacity based on 0.8 PF': random.choices(list(range(1000)), k=100)})

    data = df['New Capacity based on 0.8 PF']
    median = data.median()
    mean = data.mean()


    # Show 4 different binwidths
    for i, binwidth in enumerate([10, 20, 30, 40]):
        # Set up the plot
        ax = plt.subplot(2,2,i+1)

        ax.clear()  # <--- it removes previous histogram 

        plt.subplots_adjust( wspace=0.5, hspace=0.5)

        # Draw the plot
        ax.hist(data , bins=binwidth, color='red', edgecolor='black',alpha=0.5)

        # Title and labels
        ax.set_title('Histogram with Binwidth = %d' % binwidth, size=10)
        ax.set_xlabel('Capacity', size=11)
        ax.set_ylabel('Frequency count', size=11)

        min_ylim, max_ylim = plt.ylim()

        ax.axvline(x=median, linestyle='dashed', alpha=0.3, color='blue')
        ax.text(x=median, y= max_ylim*0.9, s='Median', alpha=0.7, color='blue',fontsize = 12)

        ax.axvline(x=mean, linestyle='dashed', alpha=0.9, color='green')
        ax.text(x=mean, y= max_ylim*0.5, s='Mean', alpha=0.9, color='green',fontsize = 12)

        plt.tight_layout()
        plt.grid(True)

    plt.savefig('{}.png'.format(name))
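
An alternative to clearing each subplot, sketched here under the assumption that every file should get its own figure: build the 2 by 2 grid with `plt.subplots` for each file and close the figure after saving, so nothing carries over to the next file. The helper name `save_histogram_grid`, the generated data, and the temporary output path are hypothetical, and the `Agg` backend is chosen only so the sketch runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def save_histogram_grid(data, out_path, bin_counts=(10, 20, 30, 40)):
    """Draw a 2x2 grid of histograms on a fresh figure, save it, then close it."""
    fig, axes = plt.subplots(2, 2, figsize=(8, 6))  # new figure and axes every call
    median, mean = data.median(), data.mean()
    for ax, bins in zip(axes.flat, bin_counts):
        ax.hist(data, bins=bins, color='red', edgecolor='black', alpha=0.5)
        ax.set_title('Histogram with %d bins' % bins, size=10)
        ax.set_xlabel('Capacity', size=11)
        ax.set_ylabel('Frequency count', size=11)
        max_ylim = ax.get_ylim()[1]  # read the limit after the bars are drawn
        ax.axvline(x=median, linestyle='dashed', alpha=0.3, color='blue')
        ax.text(x=median, y=max_ylim * 0.9, s='Median', color='blue')
        ax.axvline(x=mean, linestyle='dashed', alpha=0.9, color='green')
        ax.text(x=mean, y=max_ylim * 0.5, s='Mean', color='green')
        ax.grid(True)
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)  # discard the figure so the next file starts empty
    return out_path


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # hypothetical stand-in for the CSV column
    out_dir = tempfile.mkdtemp()
    for n in range(3):
        series = pd.Series(rng.integers(0, 1000, size=100))
        print(save_histogram_grid(series, os.path.join(out_dir, 'example_data_%d.png' % n)))
```

Because each call starts from an empty figure, no `ax.clear()` is needed in this variant.
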
furas
  • Hi, thanks for this! It gives me the result for each bin per file. But what I wanted was a 2 by 2 subplot that shows all the bin sizes in one plot. Any idea how to edit the code you sent to do that? – meteo_96 Sep 21 '19 at 05:53
  • I don't understand what you want. Your original code creates a 2 by 2 subplot and shows all bins. If you want all bins on one plot then why do you need 2 by 2 subplots? You will have one plot and 3 empty places. If you want to draw everything on one plot then create one subplot and don't clear it, and save it after the `for`-loop. – furas Sep 21 '19 at 06:02
  • What exactly do I need to do in the code for this? I've tried many times but it still comes out like the photo above. – meteo_96 Sep 21 '19 at 06:57
  • I saw your desired output and now I see that the code draws each histogram on an old subplot, so you have two (or more) histograms on one subplot. You have to clear the subplot before you draw the histogram. See my new answer. – furas Sep 21 '19 at 07:29