
Say I have a dataframe with 100 rows and 40 columns, where column 40 holds the Y-axis values for the scatter plots. I would like 39 scatter plots: column 40 as a function of column 1, column 40 as a function of column 2, column 40 as a function of column 3, and so on up to column 40 as a function of column 39. What would be the best way to produce such a set of subplots without doing it all manually?

For example (with a smaller dataframe), I tried to scatter plot column 3 as a function of column 1 and column 3 as a function of column 2 in subplots:

df = pd.DataFrame({'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]})
df.plot(x=["AAA", "BBB"], y=["CCC"], kind="scatter", subplots=True, sharey=True)
Trenton McKinney
A.P.

1 Answer


One way would be to create the subplots externally and loop over the column names, creating a plot for each one of them.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]})

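# One subplot per x column; every subplot shares the y axis ("CCC").
fig, axes = plt.subplots(1, len(df.columns) - 1, sharey=True)

# Pair each axis with one of the x columns (every column except the last).
for ax, col in zip(axes, df.columns[:-1]):
    df.plot(x=col, y="CCC", kind="scatter", ax=ax)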

plt.show()
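
For the 40-column case in the question, a single row of 39 axes would get cramped, so the same loop can be spread over a grid. The following is only a rough sketch of that idea: the helper name `scatter_grid`, its `ncols` parameter, the figure size, and the random data are all made up for illustration and are not part of the original answer.

```python
import math

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def scatter_grid(df, y_col, ncols=8):
    """Scatter-plot y_col against every other column, one subplot each.

    Hypothetical helper for illustration; not a pandas or matplotlib API.
    """
    x_cols = [c for c in df.columns if c != y_col]
    nrows = math.ceil(len(x_cols) / ncols)
    # squeeze=False keeps `axes` two-dimensional even for a single row.
    fig, axes = plt.subplots(nrows, ncols, sharey=True, squeeze=False,
                             figsize=(2.5 * ncols, 2.5 * nrows))
    flat_axes = axes.ravel()
    for ax, col in zip(flat_axes, x_cols):
        df.plot(x=col, y=y_col, kind="scatter", ax=ax)
    # Hide the grid cells left over when the grid has more slots than columns.
    for ax in flat_axes[len(x_cols):]:
        ax.set_visible(False)
    return fig, axes


# Made-up data standing in for the 100 x 40 dataframe from the question.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 40)),
                  columns=[f"col{i}" for i in range(1, 41)])
fig, axes = scatter_grid(df, y_col="col40")
```

Calling `plt.show()` afterwards displays the figure. With 39 x columns and `ncols=8` this gives a 5 x 8 grid with the last cell hidden.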


Another method that might work in pandas 0.19 is to use the subplots argument. According to the documentation:

subplots : boolean, default False
Make separate subplots for each column

I interpret this to mean that the following should work; however, I haven't been able to test it.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]})

df.plot(x=df.columns.values[:-1], y=["CCC" for _ in df.columns.values[:-1]], 
                            kind="scatter", subplots=True, sharey=True)
plt.show()
ImportanceOfBeingErnest
  • I have tested your second method and it works. Note to others: if it isn't clear from the answer by @ImportanceOfBeingErnest above, the x and y arguments must be given lists of the same length. So, for example, if you want to plot col1 against col2 and against col3, then you must have x=['col1','col1'] and y=['col2','col3'] instead of x='col1' and y=['col2','col3']. – Mishal Ahmed Oct 20 '21 at 14:43
  • Note also that subplots=True is not necessary. It will produce the same plot with or without subplots=True, as long as you have x=['col1','col1'] and y=['col2','col3']. – Mishal Ahmed Oct 20 '21 at 14:49
  • The second method did not work for pandas 1.3.5. It does create two axes, but it shows just one of them where all data are used. – T_T Jan 13 '22 at 02:16