
My goal is to plot something similar to the top graph of the following link.

I have several txt files, each corresponding to a different sample. Currently, I have my data loaded as pandas dataframes (although I'm not sure whether it would be easier if I had loaded them as NumPy arrays):

sample4.head()
Out[61]: 
           20       40       60       80      100
x                                                
1.10  1.09734  1.25772  1.41810  1.57847  1.73885
1.11  1.06237  1.21307  1.36378  1.51448  1.66518
1.12  1.02176  1.16346  1.30516  1.44686  1.58856
1.13  0.97769  1.11097  1.24426  1.37754  1.51083
1.14  0.93162  1.05702  1.18241  1.30781  1.43321

test5.head()
Out[62]: 
           20       40       60       80      100
x                                                
1.10  1.12427  1.31545  1.50663  1.69781  1.88899
1.11  1.06327  1.24045  1.41763  1.59482  1.77200
1.12  0.99875  1.16302  1.32730  1.49158  1.65585
1.13  0.93276  1.08509  1.23742  1.38975  1.54208
1.14  0.86668  1.00792  1.14916  1.29040  1.43164

test6.head()
Out[63]: 
           20       40       60       80      100
x                                                
1.10  1.08463  1.30038  1.51612  1.73187  1.94761
1.11  0.99905  1.19626  1.39346  1.59067  1.78788
1.12  0.91255  1.09283  1.27310  1.45337  1.63365
1.13  0.82706  0.99181  1.15656  1.32131  1.48605
1.14  0.74381  0.89429  1.04477  1.19525  1.34572

As can be seen, all samples share the x column. The following approach works for a single sample, giving a simple 2D plot:

sample4.plot()

But my idea is to plot all the dataframes I have along the y axis, so that each position on the y axis corresponds to one of my samples, in a 3D graph like the example above. I don't know how to "stack" dataframes and plot them using a third axis.

Any help would be appreciated.

Thanks in advance.

ImportanceOfBeingErnest
Maxwell's Daemon

1 Answer


Here's one approach, using melt and Axes3D.

First, generate the sample data provided by OP:

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

sample4_z = [1.09734,  1.25772,  1.4181 ,  1.57847,  1.73885,  1.06237,
             1.21307,  1.36378,  1.51448,  1.66518,  1.02176,  1.16346,
             1.30516,  1.44686,  1.58856,  0.97769,  1.11097,  1.24426,
             1.37754,  1.51083,  0.93162,  1.05702,  1.18241,  1.30781,  
             1.43321]

test5_z = [1.12427,  1.31545,  1.50663,  1.69781,  1.88899,  1.06327,
           1.24045,  1.41763,  1.59482,  1.772  ,  0.99875,  1.16302,
           1.3273 ,  1.49158,  1.65585,  0.93276,  1.08509,  1.23742,
           1.38975,  1.54208,  0.86668,  1.00792,  1.14916,  1.2904 , 
           1.43164]

test6_z = [1.08463,  1.30038,  1.51612,  1.73187,  1.94761,  0.99905,
           1.19626,  1.39346,  1.59067,  1.78788,  0.91255,  1.09283,
           1.2731 ,  1.45337,  1.63365,  0.82706,  0.99181,  1.15656,
           1.32131,  1.48605,  0.74381,  0.89429,  1.04477,  1.19525,  
           1.34572]

def make_df(data):
    x = [1.1, 1.11, 1.12, 1.13, 1.14]
    y = [20, 40, 60, 80, 100]
    z = np.array(data).reshape((len(x),len(y)))
    return pd.DataFrame(z, index=x, columns=y).reset_index().rename(columns={'index':'x'})

sample4 = make_df(sample4_z)
test5 = make_df(test5_z)
test6 = make_df(test6_z)
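Before plotting, it may help to see what melt does to one of these frames. The following is only a minimal sketch on a cut-down stand-in frame (`wide` and `long` are illustrative names, not part of the original code; the values are the first two rows and columns of sample4):

```python
import pandas as pd

# Cut-down stand-in with the same layout as the frames above:
# an 'x' column plus one column per y value.
wide = pd.DataFrame({
    'x': [1.10, 1.11],
    20: [1.09734, 1.06237],
    40: [1.25772, 1.21307],
})

# melt() keeps 'x' as the identifier and stacks the other columns:
# the old column labels land in 'y', the cell values in 'z'.
long = pd.melt(wide, id_vars='x', var_name='y', value_name='z')
print(long)
```

Each (x, y) cell of the wide frame becomes one row of the long frame, which is the shape the 3D plotting calls expect: three equal-length sequences.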

Now plot all three data frames on one 3D grid:

# signal to pyplot that we want 3d plots
fig, ax = plt.subplots(1, 1, figsize=(10, 10), subplot_kw={'projection': '3d'})

# convenience wrapper for plotting function
def plot_3d(df):
    ax.plot(df.x, df.y.astype(float), df.z) # dims must be floats

# reshape with melt(), then plot
plot_3d(pd.melt(sample4, id_vars='x', var_name='y', value_name='z'))
plot_3d(pd.melt(test5, id_vars='x', var_name='y', value_name='z'))
plot_3d(pd.melt(test6, id_vars='x', var_name='y', value_name='z'))

# label axes
ax.set_xlabel('x', fontsize=20)
ax.set_ylabel('y', fontsize=20)
ax.set_zlabel('z', fontsize=20)

# optional view configurations
ax.elev = 10
ax.azim = 20

3d plot

UPDATE re: y-axis as categorical
With only two continuous-valued axes, it's generally not necessary (nor recommended) to invoke a 3D plotting surface (see, for example, this similar discussion). It's clearer to encode the categorical variable as a labeled dimension.

This case is additionally complicated by the sample group levels, which represent a fourth dimension. I'd suggest considering a panel of plots, with y-axis categories encoded as legends. Like this:

datasets = ['sample4','test5','test6']
line_types = ['-.','--','-']
fig, axes = plt.subplots(1,3, figsize=(14,5))
for i, data in enumerate([sample4, test5, test6]):
    data.set_index('x').plot(style=line_types[i], ax=axes[i], sharey=True, 
                             xticks=data.x, title=datasets[i])

panel plots
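If it is useful to treat the sample as an explicit extra variable, the frames could also be combined into a single long table. This is a rough sketch rather than part of the approach above; the miniature frames and the names `frames` and `stacked` are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical miniature versions of two samples (first rows/columns only).
sample4 = pd.DataFrame({'x': [1.10, 1.11],
                        20: [1.09734, 1.06237],
                        40: [1.25772, 1.21307]})
test5 = pd.DataFrame({'x': [1.10, 1.11],
                      20: [1.12427, 1.06327],
                      40: [1.31545, 1.24045]})

frames = {'sample4': sample4, 'test5': test5}

# Melt each frame to long form, tag it with its sample name,
# then concatenate everything into one table.
stacked = pd.concat(
    [pd.melt(df, id_vars='x', var_name='y', value_name='z').assign(sample=name)
     for name, df in frames.items()],
    ignore_index=True,
)
print(stacked)
```

With the data in this form, `stacked.groupby('sample')` or `stacked.groupby(['sample', 'y'])` gives one group per panel or per line, whichever layout is preferred.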

Still, if you really want to keep things in 3D, a scatter plot with the correct view rotation will give you the effect you're looking for. This also circumvents the problem of the y-axis being read as a metric variable, rather than an ordinal one.

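# scatter plot with categorical y-axis, drawn on a fresh 3D figure
fig, ax = plt.subplots(1, 1, figsize=(10, 10), subplot_kw={'projection': '3d'})

def plot_3d(df, color):
    ax.scatter(df.x, df.y.astype(float), df.z, c=color) # dims must be floats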

# reshape with melt(), then plot
plot_3d(pd.melt(sample4, id_vars='x', var_name='y', value_name='z'), 'red')
plot_3d(pd.melt(test5, id_vars='x', var_name='y', value_name='z'), 'blue')
plot_3d(pd.melt(test6, id_vars='x', var_name='y', value_name='z'), 'green')

# label axes
ax.set_xlabel('x', fontsize=20)
ax.set_ylabel('y', fontsize=20)
ax.set_zlabel('z', fontsize=20)

# optional view configurations
ax.elev = 10
ax.azim = 280

3d scatter plot
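If lines are preferred over scatter points, one possible way to keep them separated in the y direction is to issue one plot call per y value, so that no segment connects points with different y. The sketch below assumes a cut-down stand-in frame (`sample`, first rows of sample4) and is not the code used for the figures above:

```python
import pandas as pd
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older matplotlib

# Stand-in data: first three rows and two columns of sample4.
sample = pd.DataFrame({
    'x': [1.10, 1.11, 1.12],
    20: [1.09734, 1.06237, 1.02176],
    40: [1.25772, 1.21307, 1.16346],
})

fig, ax = plt.subplots(subplot_kw={'projection': '3d'})

long = pd.melt(sample, id_vars='x', var_name='y', value_name='z')

# One line per y value: each group holds the points of a single column.
for y_value, group in long.groupby('y'):
    ax.plot(group['x'].to_numpy(),
            group['y'].astype(float).to_numpy(),
            group['z'].to_numpy(),
            color='red')

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
```

Repeating the loop with a different color for each sample would give one set of parallel lines per sample; call `plt.show()` afterwards when running outside an interactive session.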

Note: It's possible to use the bar3d method to treat one or more dimensions as categorical, but its cascading approach to multiple points with the same category value may not get you what you're looking for.

Community
andrew_reece
  • Wow man! Thanks a lot! Could it be possible to separate the lines in the y direction? For example, now red lines are continuous along the y axis and I would like to separate them depending on the y axis values. – Maxwell's Daemon Apr 26 '17 at 05:21
  • See my updated answer. TL;DR - it's sort of possible, but it might not be the best approach to data viz here. – andrew_reece Apr 26 '17 at 07:40