I have a file that looks like this:

row  column  layer value1  value2      
8  454  1  0.000e+0 1.002e+4
8  455  1  0.000e+0 1.001e+4
8  456  1  0.000e+0 1.016e+4
8  457  1  0.000e+0 1.016e+4
.
.
.

I want to do some calculations on the last column (for example multiply by 10) and save it (in-place or as a new file) without changing the format. I know how to load it but I don't know how to continue. I do the following to load the data:

import numpy as np

ic = np.genfromtxt("file.dat",skip_header=1, usecols=(0,1,2,4), 
                    dtype=None, names = ['row', 'column', 'layer', 'value2'])

The file is about 150 MB, so fast execution would be helpful.

Bob
    Did you try printing the result to see what ended up where? That's what Python's interactive console is for. If it's large, you can print just a few rows with a slice such as `ic[:5]`. – ivan_pozdeev Apr 29 '17 at 02:28

1 Answer

Your example has columns indexed only 0 to 4, so usecols=(0,1,2,5) produces an error with the file in your example. Assuming usecols=(0,1,2,4):

You can modify the array in-place with a single vectorized operation, which avoids a slow Python-level loop over a large file:

ic['value2'] *= 10

and save it to space-delimited text (with a multi-column fmt string, savetxt ignores delimiter, so the separators come from the format string itself) with

np.savetxt("mul.dat", ic, fmt="%d %d %d %e", header=" ".join(ic.dtype.names))

producing

# row column layer value2
8 454 1 1.002000e+05
8 455 1 1.001000e+05
8 456 1 1.016000e+05
8 457 1 1.016000e+05

But you won't be able to write out the value1 column, because usecols=(0,1,2,4) means it is never read in.
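
If value1 has to survive the round trip, one option is to read all five columns and write the original header line back unchanged. The following is only a minimal sketch under a few assumptions: the function name scale_last_column and its parameters are made up for illustration, the file is assumed to have exactly the five columns shown in the question, and value2 is assumed to be parsed as a float. Note that Python's %e formatting writes a two-digit exponent (1.002e+05), so the output will not be byte-for-byte identical to a file that uses e+4.

```python
import numpy as np

def scale_last_column(src, dst, factor=10):
    """Hypothetical helper: scale value2 and write all five columns to dst."""
    # Keep the original header line so it can be written back as-is.
    with open(src) as f:
        header = f.readline().rstrip("\n")

    # Read every column (no usecols), so value1 is preserved.
    data = np.genfromtxt(
        src, skip_header=1, dtype=None,
        names=["row", "column", "layer", "value1", "value2"],
    )
    data = np.atleast_1d(data)  # a one-row file would otherwise give a 0-d array

    # Vectorized in-place multiply on the whole field; no Python-level loop.
    data["value2"] *= factor

    # One format specifier per column; the spacing mimics the input layout.
    # comments="" stops savetxt from prefixing the header with "# ".
    np.savetxt(dst, data, fmt="%d  %d  %d  %.3e %.3e",
               header=header, comments="")
    return data
```

Calling scale_last_column("file.dat", "mul.dat") would then write the scaled table to a new file and leave the input untouched. For a 150 MB file the parsing step is likely to dominate the run time, not the multiplication.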

MassPikeMike
    It works great. I corrected the index in the main post. I am an old FORTRAN programmer, and Python's 0-indexing still confuses me. – Bob Apr 29 '17 at 02:46