I have two samples of cookies: Sample 1 has 51 cookies and Sample 2 has 47. They have different mass distributions, D1 and D2, and I have fit Gaussians to them in Python. The plot below shows the samples' mass distributions, as well as the weighted distribution, which is defined as D2/D1.

I'm interested in creating a subsample of Sample 1 that takes the weighted distribution into account, to allow a fairer comparison between the two samples given their different masses. The Gaussians I've fit are evaluated at 1000 points, so the weighted distribution has a different size than the sample arrays.

How would one go about doing this in Python?

Image of mass distributions of two samples of cookies

Neil Lunn
Kelly Ford
  • Is this using native Python, or are you using libraries such as numpy, scipy or pandas etc...? – Jon Clements Sep 03 '16 at 20:48
  • I don't use pandas, but I do use numpy and scipy. I fit the Gaussians using Model from "lmfit" – Kelly Ford Sep 03 '16 at 20:53
  • Have you looked at [numpy.random.choice](http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.choice.html) and using `p` to provide weights? – Jon Clements Sep 03 '16 at 20:55
  • One issue is that the Gaussians I've fit have 1000 points, so the weighted distribution is a different size than the sample arrays (51 and 47). I'm not sure of the appropriate way to account for this – Kelly Ford Sep 03 '16 at 21:00
  • Right... make sure to [edit] your question with that necessary information... – Jon Clements Sep 03 '16 at 21:04
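
The `numpy.random.choice` suggestion from the comments, combined with the size mismatch raised afterwards, could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a confirmed solution: the data, the Gaussian parameters, the grid range, and the subsample size `n_sub` are all hypothetical stand-ins, and the fits are replaced by a plain `gaussian` helper rather than the actual lmfit results. The idea is to interpolate the 1000-point D2/D1 curve at the 51 measured masses with `np.interp`, so that each cookie in Sample 1 gets one weight, and then draw without replacement using those weights as probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the 51 measured masses of Sample 1.
mass1 = rng.normal(loc=10.0, scale=2.0, size=51)

def gaussian(x, amp, mu, sigma):
    """Plain Gaussian, standing in for the fitted models (assumption)."""
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Hypothetical fitted curves evaluated on a 1000-point mass grid.
grid = np.linspace(0.0, 20.0, 1000)
d1 = gaussian(grid, 1.0, 10.0, 2.0)   # fit to Sample 1 (made-up parameters)
d2 = gaussian(grid, 1.0, 11.0, 1.5)   # fit to Sample 2 (made-up parameters)
weight_grid = d2 / d1                 # the "weighted distribution" D2/D1

# Interpolate the 1000-point curve at the 51 sample masses so that the
# weights array has the same length as the sample array.
weights = np.interp(mass1, grid, weight_grid)

# Probabilities passed to choice() must sum to 1.
p = weights / weights.sum()

# Draw a weighted subsample of Sample 1 without replacement.
n_sub = 25  # hypothetical subsample size
idx = rng.choice(mass1.size, size=n_sub, replace=False, p=p)
subsample = mass1[idx]
```

If the fitted models can be evaluated at arbitrary masses, evaluating D1 and D2 directly at `mass1` would avoid the interpolation step entirely. Note also that D2/D1 can become very large wherever D1 is close to zero, so the weights may need to be capped or the mass range restricted in the tails.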

0 Answers