I have 10 txt files named A_1, A_2, ..., A_10 and I want to compare each of them with a txt file named A. My goal is to find the sum of the differences between the values of a specific column. The problem is that both the 10 files (A_1, A_2, ..., A_10) and A contain some bad values in that column, equal to -1.00000E+31. I am stuck on how to change the code so that Python skips any value equal to -1.00000E+31 and moves on to the next one. I thought of replacing the bad values with np.nan, but that doesn't work: the total sum comes out as nan. Any suggestion would be really helpful.

import numpy as np
filelist=[]
for i in range(1,11):
    filelist.append("/Users/Hrihaan/Desktop/A_%s.txt" %i)
for fname in filelist:
    data=np.loadtxt(fname)
    data1=np.loadtxt('/Users/Hrihaan/Desktop/A.txt')
    x=data[:,1]
    x1=data1[:,1]
    bad = np.where(data[:,1] == -1E31)
    data[bad,1] = np.nan
    bad1 = np.where(data1[:,1] == -1E31)
    data1[bad1,1] = np.nan
    x2=(x-x1)
    x3=sum(x2)
    print(fname)
    print(x3)
Hrihaan
  • Since you "thought of using `np.nan` but that's not working as it's giving the total sum to be equal to `nan`" you should realize that [`numpy.nansum`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.nansum.html) exists. It returns "the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero." – Steven Rumbalski Jul 28 '17 at 16:23
  • That worked, thanks a lot, you made my work a whole lot easier @StevenRumbalski. The issue is that Python still counts the NaNs as array elements. I may have to divide the sum of differences by the total number of values, so how can I make sure that wherever a value is NaN, the difference is skipped and not counted as an element? – Hrihaan Jul 28 '17 at 16:55
  • Then skip `numpy.nansum` and just go straight for `numpy.nanmean`. – Steven Rumbalski Jul 28 '17 at 18:02
  • That was what I needed. Thanks again @StevenRumbalski – Hrihaan Jul 28 '17 at 18:22
  • I went ahead and posted my comments as an answer. (I was previously hesitant to do so since I'm not a numpy user. I just searched the docs.) – Steven Rumbalski Jul 28 '17 at 18:33

1 Answer

Since you "thought of using `np.nan` but that's not working as it's giving the total sum to be equal to `nan`", `numpy.nansum` should do the job. It returns "the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero."
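As a rough illustration of that idea (not tested against your files), here is a minimal sketch on made-up arrays. The `BAD` constant, the `nan_diff_sum` helper, and the sample values are all hypothetical names and data chosen for the example; only the sentinel value comes from the question.

```python
import numpy as np

BAD = -1e31  # sentinel for bad values, as described in the question


def nan_diff_sum(x, x1, bad=BAD):
    """Sum of x - x1, ignoring positions where either input holds the sentinel."""
    # Replace the sentinel with NaN without modifying the input arrays.
    x = np.where(x == bad, np.nan, x)
    x1 = np.where(x1 == bad, np.nan, x1)
    # NaN propagates through the subtraction; nansum treats those entries as zero.
    return np.nansum(x - x1)


# Made-up sample data standing in for column 1 of A_i.txt and A.txt.
a = np.array([1.0, BAD, 3.0, 5.0])
b = np.array([0.5, 2.0, BAD, 1.0])
print(nan_diff_sum(a, b))  # only the first and last pairs contribute
```

In your loop, `a` and `b` would presumably be `data[:,1]` and `data1[:,1]`.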

If you need the average, skip `numpy.nansum` and go straight for `numpy.nanmean`, which divides by the number of non-NaN values only.
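A small sketch of the averaging case, again on made-up data. `BAD`, `nan_diff_mean`, and the sample arrays are hypothetical; the helper also returns how many pairs were actually counted, in case you want to check the denominator yourself.

```python
import numpy as np

BAD = -1e31  # sentinel for bad values, as described in the question


def nan_diff_mean(x, x1, bad=BAD):
    """Mean of x - x1 over positions where neither input holds the sentinel."""
    # Mark a difference as NaN if either side is a bad value.
    diff = np.where((x == bad) | (x1 == bad), np.nan, x - x1)
    # Number of differences that nanmean will actually average over.
    n_valid = np.count_nonzero(~np.isnan(diff))
    return np.nanmean(diff), n_valid


# Made-up sample data standing in for column 1 of A_i.txt and A.txt.
a = np.array([1.0, BAD, 3.0, 5.0])
b = np.array([0.5, 2.0, BAD, 1.0])
mean, n_valid = nan_diff_mean(a, b)
print(mean, n_valid)
```

Note that if every pair is bad, `numpy.nanmean` emits a RuntimeWarning and returns `nan`, so you may want to check `n_valid` first.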

Steven Rumbalski