
I am trying to do time series analysis on all the fracking wells in Pennsylvania, and naturally a lot of these are dry wells with 0 production. I want to create a histogram of each array in the list with the zeros removed, so each array will end up a little shorter.

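P = [data3P, data4P, data5P, data6P, data7P, data8P, data9P, data10P]

# for i in P:   <- not sure how to wrap the loop below over every array in P
# This part works for a single array (data3P):
N = []
for i in data3P:
    if i > 0:
        N.append(i)
N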

I think I should do it in a for loop, but I'm not sure how to do that for all the arrays in the list. Should I use a nested for loop?

Pang
Bowen Liu

2 Answers


If you are dealing with large amounts of data, NumPy is your friend. You can create a masked array (with the zeros masked) and apply the regular histogram function to it; see this answer for an example.
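As a rough sketch of the masked-array idea (the `production` values below are made up for illustration, and this is not necessarily what the linked answer does): `np.ma.masked_equal` masks the zeros, and `.compressed()` returns only the unmasked values as a plain array, which can then go to `np.histogram`. Compressing first is the cautious route, since I would not rely on `np.histogram` honouring the mask by itself.

```python
import numpy as np

# Hypothetical production values for one well group; zeros are dry wells
production = np.array([0.0, 12.5, 0.0, 3.2, 7.8, 0.0, 12.5])

# Mask every entry equal to 0
masked = np.ma.masked_equal(production, 0)

# Keep only the unmasked (non-zero) values as an ordinary ndarray
nonzero = masked.compressed()

# Histogram of the non-zero values; bins=3 is an arbitrary choice
counts, edges = np.histogram(nonzero, bins=3)
print(nonzero)
print(counts, edges)
```

The same `nonzero` array can be handed to a plotting function instead of `np.histogram` if you want a figure rather than the bin counts.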

Community
Benjamin
  • Thanks. I got "hist is not defined". Which library should I import to make it work? – Bowen Liu Dec 15 '15 at 05:51
  • @BowenLiu You need to use `pyplot.hist` – erip Dec 15 '15 at 13:00
  • @BowenLiu, you need to import numpy as np. See: http://docs.scipy.org/doc/numpy-1.10.0/reference/maskedarray.html. pyplot.hist is a reference to matplotlib, not the tool I was talking about. – Benjamin Dec 15 '15 at 16:18

I'm not 100% sure if this is what you need, but if you want to gather all the NumPy arrays datanP but without any zeros they might contain, you can do this:

[a[a!=0] for a in P]

It would help if you showed what one of those input arrays looks like, and what you'd like to get out of the processing you're trying to do.
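Here is a minimal, self-contained sketch of that list comprehension, assuming each entry of `P` is a 1-D NumPy array (the small arrays below are made-up stand-ins for `data3P`, `data4P`, and so on):

```python
import numpy as np

# Hypothetical stand-ins for data3P, data4P, ...
P = [
    np.array([0, 5, 0, 8, 2]),
    np.array([0, 0, 0]),       # a group with only dry wells
    np.array([4, 0, 9]),
]

# Boolean indexing: a != 0 is a True/False mask, a[mask] keeps the True entries
filtered = [a[a != 0] for a in P]

# Each array shrinks by the number of zeros it contained
for original, kept in zip(P, filtered):
    print(len(original), "->", len(kept))
```

No explicit nested loop is needed, because the comparison and the indexing are applied to each whole array at once. If negative values should be dropped as well, `a[a > 0]` matches the `i > 0` test in the question. Each filtered array can then be passed to a histogram function such as `matplotlib.pyplot.hist`.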

Matt Hall