I am new to matplotlib, and I want to create a plot with the following information:

  1. A line joining the medians of around 200 variable length vectors (input)
  2. A line joining the corresponding quantiles of these vectors.
  3. A line joining the corresponding spread (largest and smallest points).

So basically, it's somewhat like a continuous box plot.

Thanks!

Sahil M
  • Just to be clear, your question got put on hold because it reads as 'please do my work for me' or 'please do some free dev work for me'. You will get much better responses here if you show you have tried _anything_. People are happy to help you _fix_ your code, but are less likely to write code for you. – tacaswell Aug 19 '13 at 15:44
  • The question is well formed in the terminology of the libraries that are tagged, which means that @user2696275 demonstrates a minimal understanding of the problem being solved. – Viktor Kerkez Aug 19 '13 at 15:45
  • But I completely agree with @tcaswell. – Viktor Kerkez Aug 19 '13 at 15:46
  • @ViktorKerkez "... Include attempted solutions, why they didn't work, and the expected results. ...". There is a place on SO for discovering cool but badly documented things in libraries, but this question (as it stands) shows no indication that the OP did more than reformat and paste their homework or their boss's requirements into the question, which is disrespectful to the people who answer questions. – tacaswell Aug 19 '13 at 15:54
  • @tcaswell OK, got it. But since he already got the answer, it doesn't make sense to close the question. I'll be more careful about that in the future. Thank you for your comments. :) – Viktor Kerkez Aug 19 '13 at 16:00
  • @ViktorKerkez It's worth closing bad questions as it puts them on the path to deletion. – tacaswell Aug 19 '13 at 16:23
  • I understand. I am totally new to Stack Overflow, and this is my first question, so I will keep that in mind next time. I could have posted a snippet, but I was just trying to make the question text clearer. Also, I should have tagged this with numpy/scipy. – Sahil M Aug 20 '13 at 08:10

1 Answer

Using just scipy and matplotlib (you tagged only those libraries in your question) is a little bit verbose, but here's how you would do it (I'm doing it only for the quantiles):

import numpy as np
from scipy.stats import mstats
import matplotlib.pyplot as plt

# Create 10 columns with 100 rows of random data
rd = np.random.randn(100, 10)
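# Calculate the quantiles column-wise; by default mquantiles returns
# the 25%, 50% and 75% quantiles, one row per quantile
quantiles = mstats.mquantiles(rd, axis=0)
# Plot one line per quantile
labels = ['25%', '50%', '75%']
for label, q in zip(labels, quantiles):
    plt.plot(q, label=label)
plt.legend()
# Needed to display the figure when running as a script
plt.show()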

Which gives you:

[Plot: the 25%, 50% and 75% quantile lines across the 10 columns]
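To also get the spread (smallest and largest points), one option is NumPy's `np.percentile`, which takes several percentiles at once. This is only a sketch and a swap-in for `mquantiles`, not the same computation: `np.percentile` interpolates linearly by default, while `mquantiles` defaults to `alphap=0.4, betap=0.4`, so the quartiles can differ slightly. The seed and the `lines` dictionary are just for illustration.

```python
import numpy as np

# Same shape of data as above: 100 rows, 10 columns (seeded for repeatability)
rng = np.random.default_rng(0)
rd = rng.standard_normal((100, 10))

# One row per percentile, one column per column of rd.
# The 0th and 100th percentiles are the column minimum and maximum.
summary = np.percentile(rd, [0, 25, 50, 75, 100], axis=0)

# Pair each row with a legend label
labels = ['min', '25%', '50%', '75%', 'max']
lines = dict(zip(labels, summary))
```

Each entry can then be drawn the same way as before, e.g. `for label, q in lines.items(): plt.plot(q, label=label)`.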

Now, I would try to convince you to try the Pandas library :)

import numpy as np
import pandas as pd
# Create random data
rd = pd.DataFrame(np.random.randn(100, 10))
# Calculate all the desired values
df = pd.DataFrame({'mean': rd.mean(), 'median': rd.median(),
                   '25%': rd.quantile(0.25), '50%': rd.quantile(0.5),
                   '75%': rd.quantile(0.75)})
# And plot it
df.plot()

You'll get:

[Plot: mean, median, 25%, 50% and 75% lines across the 10 columns]

Or you can get all the stats in just one line:

rd.describe().T.drop('count', axis=1).plot()

[Plot: the describe() statistics (without count) across the 10 columns]

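Note: I dropped the count since it's not a part of the "5 number summary" (min, 25%, 50%, 75%, max). The plot still includes the mean and std columns that describe() also returns.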
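The examples above use a rectangular array, but the question mentions around 200 variable-length vectors, which don't fit into one 2-D array. A minimal sketch of one way to handle that: summarize each vector on its own with `np.percentile`, then draw the result as a "continuous box plot" with shaded bands. The function names `summarize` and `plot_summary` and the random input are made up for illustration; `plot_summary` expects a matplotlib `Axes` object.

```python
import numpy as np

def summarize(vectors):
    """Return a (5, n) array: min, 25%, median, 75%, max for each vector."""
    # Each vector is reduced separately, so the lengths may differ
    probs = [0, 25, 50, 75, 100]
    return np.array([np.percentile(v, probs) for v in vectors]).T

def plot_summary(ax, summary):
    """Draw the median line plus shaded quartile and min-max bands on ax."""
    x = np.arange(summary.shape[1])
    lo, q1, med, q3, hi = summary
    ax.fill_between(x, lo, hi, alpha=0.2, label='min-max')
    ax.fill_between(x, q1, q3, alpha=0.4, label='25%-75%')
    ax.plot(x, med, color='k', label='median')
    ax.legend()

# Hypothetical input: 200 vectors with random lengths between 5 and 50
rng = np.random.default_rng(0)
vectors = [rng.standard_normal(rng.integers(5, 51)) for _ in range(200)]
summary = summarize(vectors)
```

To display it, create an axes with `fig, ax = plt.subplots()`, call `plot_summary(ax, summary)`, and then `plt.show()`. Shaded bands are a design choice here; plain `ax.plot` lines for each row would work just as well.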

Viktor Kerkez
  • I would have done something similar and I like that you point out how great pandas is, but you should spend a word on why you only imported numpy and pandas instead of "scipy and matplotlib" as requested. – Dr. Jan-Philip Gehrcke Aug 19 '13 at 12:34
  • I really like pandas and I want to seduce more people into using it :) But you're right, so I added a `scipy`- and `matplotlib`-only example. – Viktor Kerkez Aug 19 '13 at 12:52
  • Oh, I was unclear obviously, sorry :-) -- I like your choice and also really support it. It's fine now, but I just wanted you to spend some words on convincing the questioner and telling him that you selected pandas/numpy on purpose. – Dr. Jan-Philip Gehrcke Aug 19 '13 at 13:35
  • Thanks @ViktorKerkez for the answer and intro to pandas! – Sahil M Aug 20 '13 at 08:12