Using pandas to create a multi-tile multi-series scatter chart

Question

Consider the following sample data frame:

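import numpy as np
import pandas as pd

# 72 hourly timestamps ('h' is the current alias; 'H' is deprecated in recent pandas)
rng = pd.date_range('1/1/2011', periods=72, freq='h')
df = pd.DataFrame({
        'cat': list('ABCD' * (len(rng) // 4)),  # categories cycle A, B, C, D
        'D1': np.random.randn(72),
        'D2': np.random.randn(72),
        'D3': np.random.randn(72),
        'D4': np.random.randn(72)
    }, index=rng)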

I'm looking for an idiomatic way to scatter-plot this as follows:

4 subplots (tiles), one for each category (A, B, C, or D)
each D series plotted in its own color

I can do this with a bunch of filtering and for-loops, but I'm looking for a more compact pandas-like way.

Can you add the lengthy reference code, so we can understand what you want the result to look like? – Marat, Mar 26 '17 at 02:47

piRSquared · Answer 1 · 2017-03-26T05:44:32.140


This is my guess at what you want.

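import matplotlib.pyplot as plt

# 2x2 grid of tiles, one per category, with shared axes
fig, axes = plt.subplots(2, 2, figsize=(8, 6), sharex=True, sharey=True)

for i, (cat, g) in enumerate(df.groupby('cat')):
    ax = axes[i // 2, i % 2]
    # items() replaces iteritems(), which was removed in pandas 2.0
    for j, c in g.filter(like='D').items():
        c.plot(ax=ax, title=cat, label=j, style='o')
    ax.legend(loc='best', fontsize=8)

fig.tight_layout()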
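
The inner loop can probably be dropped, since DataFrame.plot already draws each column as its own series in its own color. Below is a minimal sketch of that variant, not a definitive answer. It rebuilds the question's sample frame so it runs on its own; the `matplotlib.use("Agg")` line and `freq=pd.offsets.Hour()` are my own choices for portability, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same shape as the question's sample frame (hourly index, 4 categories, 4 series).
rng = pd.date_range('1/1/2011', periods=72, freq=pd.offsets.Hour())
df = pd.DataFrame({
    'cat': list('ABCD' * (len(rng) // 4)),
    'D1': np.random.randn(72),
    'D2': np.random.randn(72),
    'D3': np.random.randn(72),
    'D4': np.random.randn(72),
}, index=rng)

fig, axes = plt.subplots(2, 2, figsize=(8, 6), sharex=True, sharey=True)

# One tile per category; DataFrame.plot draws every D column as a separate series.
for ax, (cat, g) in zip(axes.flat, df.groupby('cat')):
    g.filter(like='D').plot(ax=ax, style='o', title=cat)
    ax.legend(loc='best', fontsize=8)

fig.tight_layout()
```

Pairing `axes.flat` with the groups via `zip` also avoids the `i // 2, i % 2` index arithmetic, at the cost of assuming there are no more groups than tiles.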

edited Mar 26 '17 at 05:44

answered Mar 26 '17 at 05:34

piRSquared


Yes, this is the outcome I want, and it is very similar to the code I have with nested for-loops and filtering (as I mentioned in my question). Is this the most minimal one can get with pandas? Can `plot`'s `by` argument be leveraged for multiple tiles? Can `scatter` be made to recognize columns as series? – Dmitry B. Mar 26 '17 at 16:35
