
I am working with a pandas DataFrame in Python. My data has 10,000 rows and 20 columns, and I am visualizing it like this:

df.hist(figsize=(150,150))

However, when I make figsize bigger, the title of each subplot (the name of the corresponding column) becomes really small, or the graphs overlap each other, which makes them impossible to tell apart.

Is there any clever way to fix it?

Thank you!

jayko03
  • https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.hist.html There's a `xlabelsize` parameter, is that what you want? – cs95 Sep 13 '17 at 04:33
  • @cᴏʟᴅsᴘᴇᴇᴅ I am not sure. Where can I use that parameter? – jayko03 Sep 13 '17 at 04:47
  • As the second argument to `hist` – cs95 Sep 13 '17 at 05:00
  • @cᴏʟᴅsᴘᴇᴇᴅ It does not work. The titles of the subplots are the names of the columns. I am going to edit my post to clarify my question. – jayko03 Sep 13 '17 at 05:06
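
To illustrate the `xlabelsize` suggestion from the comments, here is a minimal sketch. It assumes stand-in random data with hypothetical column names `a` to `d`, since the real DataFrame is not shown in the question. `xlabelsize` and `ylabelsize` change only the tick labels, so the subplot titles keep their default size, which is consistent with what the asker reports.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove this line for interactive use
import numpy as np
import pandas as pd

# Stand-in data (assumption): the question's DataFrame is not available
df = pd.DataFrame(np.random.randn(1000, 4), columns=list("abcd"))

# xlabelsize / ylabelsize are keyword arguments of DataFrame.hist;
# they set the font size of the tick labels only
axes = df.hist(figsize=(10, 8), xlabelsize=14, ylabelsize=14)

first_ax = axes.ravel()[0]
tick_size = first_ax.get_xticklabels()[0].get_fontsize()
title_size = first_ax.title.get_fontsize()  # not affected by xlabelsize
print("tick label size:", tick_size, "| title size:", title_size)
```

The title size has to be changed separately, for example through each subplot's `title` object or through `rcParams`.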

2 Answers


There may be cleaner ways, but here are two options.

1) You could set the title size on each subplot directly:

axes = df.hist(figsize=(50, 30))  # returns an array of Axes, not a Figure
for ax in axes.ravel():
    ax.title.set_size(32)  # enlarge the title of each subplot


2) Another way is to change matplotlib's default rcParams:

import matplotlib

# Larger defaults for subplot titles and tick labels (affects all later plots)
params = {'axes.titlesize': 32,
          'xtick.labelsize': 24,
          'ytick.labelsize': 24}
matplotlib.rcParams.update(params)
df.hist(figsize=(50, 30))
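
If changing the defaults globally is not wanted, the same overrides can be scoped to a single plot. This is a sketch of an alternative, not part of the original answer: it uses `matplotlib.pyplot.rc_context`, and the DataFrame is stand-in random data with hypothetical column names.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data (assumption): the question's DataFrame is not available
df = pd.DataFrame(np.random.randn(1000, 4), columns=list("abcd"))

overrides = {"axes.titlesize": 32,
             "xtick.labelsize": 24,
             "ytick.labelsize": 24}

before = matplotlib.rcParams["axes.titlesize"]

# The overrides apply only inside the with-block
with plt.rc_context(overrides):
    axes = df.hist(figsize=(50, 30))
    title_size = axes.ravel()[0].title.get_fontsize()

# Outside the block the previous setting is back in place
after = matplotlib.rcParams["axes.titlesize"]
print("title size inside context:", title_size)
```

With this approach there is no need to reset `rcParams` afterwards, because the previous values are restored when the block exits.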



Default Issue

This is the default behavior, with very small labels and titles in the subplots.

matplotlib.rcParams.update(matplotlib.rcParamsDefault)  # to revert to default settings
df.hist(figsize=(50, 30))


Zero

I would not recommend making the figure much larger than 10 inches in each dimension. That should in any case be more than enough to host 20 subplots, and keeping the figure small keeps the font size reasonable relative to the plots.
To prevent plot titles from overlapping, you can simply call plt.tight_layout().

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(1000,20))
df.hist(figsize=(10,9), ec="k")

plt.tight_layout()
plt.show()
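
As a possible variation on the code above (a sketch, not something the answer itself proposes), the grid shape can be chosen explicitly with the `layout` argument of `DataFrame.hist`. The 4 x 5 grid below is an assumption picked to suit a landscape figure; the data is random, as in the answer.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(1000, 20))

# layout=(rows, columns): 4 rows of 5 histograms for the 20 columns
axes = df.hist(figsize=(10, 9), ec="k", layout=(4, 5))

# Adjust the spacing so titles and tick labels do not overlap
plt.tight_layout()

titled = [ax for ax in axes.ravel() if ax.get_title()]
print("grid shape:", axes.shape, "| titled subplots:", len(titled))
```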


ImportanceOfBeingErnest
  • @JohnGalt There is no specific style in use. The code above the picture, together with a standard matplotlib install, gives the image shown. Does it look different when you run that piece of code? – ImportanceOfBeingErnest Sep 13 '17 at 13:39
  • @JohnGalt I see. For that bit you may be interested in [this post](https://stackoverflow.com/questions/43080259/no-outlines-on-bins-of-matplotlib-histograms-or-seaborn-distplots/43080772#43080772) – ImportanceOfBeingErnest Sep 13 '17 at 14:05