I am trying to export a numpy matrix to ASCII format, but I want to add a header to it first.

My code concept is this:

  1. Import ASCII file as np.ndarray, say matrix A
  2. Take the header of A (first 6 rows). The header contains both float values and characters
  3. Take the rows of A that are not header (from rows 6 to last row), giving array B
  4. Apply some functions on B
  5. Save as ASCII in this form: header(A) + B

I tried the following:

Try 1:

import numpy as np
A = np.genfromtxt('......Input\chm_plot_1.txt', dtype=None, delimiter='\t')
header = A[0:6]
B = A[6:]
mat_out = np.concatenate([A,B])
np.savetxt('........out.txt', mat_out, delimiter='\t')

but this gives the error:

TypeError: Mismatch between array dtype ('|S3973') and format specifier ('%.18e')

Try 2:

import numpy as np
A = np.genfromtxt('......Input\chm_plot_1.txt', dtype=None, delimiter='\t')
header = A[0:6]
headers = np.vstack(header)
head_list = headers.tolist()
head_str = ''.join(str(v) for v in head_list)
B = A[6:]
np.savetxt('\out.txt', B, header = head_str,  delimiter='\t')

which gives the same error:

TypeError: Mismatch between array dtype ('|S3973') and format specifier ('%.18e')

Try 3:

import numpy as np
import linecache

A = np.genfromtxt('.............\Input\chm_plot_1.txt', dtype=None, delimiter='\t')
line1 = linecache.getline('.............Input\chm_plot_1.txt', 1)
line2 = linecache.getline('.............Input\chm_plot_1.txt', 2)
line3 = linecache.getline('.............Input\chm_plot_1.txt', 3)
line4 = linecache.getline('.............Input\chm_plot_1.txt', 4)
line5 = linecache.getline('.............Input\chm_plot_1.txt', 5)
line6 = linecache.getline('.............Input\chm_plot_1.txt', 6)
header2 = line1
header2 += line2
header2 += line3
header2 += line4
header2 += line5
header2 += line6

B = A[6:]
np.savetxt('........\out.txt', B , header = header2,  delimiter='\t')

which gives me the same error:

TypeError: Mismatch between array dtype ('|S3973') and format specifier ('%.18e')

The A array has the first lines like this:

print A[0:8] #starting from row 6, the rows have 100+ values, header is first 6 rows

['ncols         371' 'nrows         435' 'xllcorner     520298.0053'
 'yllcorner     436731.3065' 'cellsize      1' 'NODATA_value  -9999'
'16.52002 15.90161 15.96692 20.32922 20.59827 18.28137 18.83533 17.66 .......
'13.16687 17.09497 7.309204 20.83655 19.05078 17.68591 17.88464 ...... ']

Any help would be greatly appreciated! Thanks :)

Edit: I uploaded a sample from the input data (chm_plot_1.txt). The link is below: http://we.tl/mjgBe4QIRM

Edit 2: Following the answer, the problem is that it inserts the "#" character at the beginning of the header lines, as in the image below. Also, there is one extra line, the 7th one, that should be removed.

[image: output file with "#" at the start of each header line]

Edit 3: I think the error

ValueError: invalid literal for float()

is due to the different formats of data in the sample vs full files. Although both are .txt, they are arranged differently, as in the picture below.

[images: chm_plot_1, chm_plot_1_sample]

Litwos

1 Answer

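The problem is that your header does not have the same format as the data: the header lines are text, so the whole array is read as strings, and savetxt's default numeric format cannot write them.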
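To illustrate the mismatch, here is a minimal sketch that reproduces the same kind of TypeError. The array contents and the names `rows`, `buf` and `error_message` are made up for the demonstration; they only stand in for what `genfromtxt(..., dtype=None)` returned in the question.

```python
import io
import numpy as np

# Hypothetical stand-in for the array in the question: every element is text,
# because the header lines cannot be parsed as numbers.
rows = np.array(['ncols         371', '16.52002 15.90161'])

buf = io.StringIO()
try:
    # savetxt uses fmt='%.18e' by default, which only works for numeric data.
    np.savetxt(buf, rows, delimiter='\t')
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(rows.dtype.kind)  # string kind, not a float kind
print(error_message)
```

Passing `fmt='%s'` would let savetxt write the strings as they are, but then the data rows could not be processed numerically, which is why splitting header and data is the more useful route here.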

One way to solve that: treat the header as plain text, and only the data as numeric.

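import numpy as np

# Read the 6 header lines as plain text; [:-1] drops the final newline
with open('chm_plot_1_sample.txt') as f:
    header = "".join([f.readline() for i in range(6)])[:-1]
# Load only the numeric rows
a = np.loadtxt('chm_plot_1_sample.txt', delimiter='\t', skiprows=6)
a = a / 2  # some treatment
# comments='' keeps savetxt from prefixing header lines with '# '
np.savetxt('out.txt', a, delimiter='\t', header=header, comments='')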
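For anyone who wants to check the behaviour without the original file, here is a self-contained sketch of the same approach using in-memory text. The miniature `sample` grid and the `fmt='%g'` argument are assumptions added for readability, not part of the question's data.

```python
import io
import numpy as np

# Made-up miniature file: 6 header lines followed by tab-separated numbers.
sample = (
    "ncols         2\n"
    "nrows         2\n"
    "xllcorner     520298.0053\n"
    "yllcorner     436731.3065\n"
    "cellsize      1\n"
    "NODATA_value  -9999\n"
    "1.0\t2.0\n"
    "3.0\t4.0\n"
)

# Header as text: read 6 lines and strip only the final newline,
# because savetxt appends its own newline after the header.
src = io.StringIO(sample)
header = "".join(src.readline() for _ in range(6)).rstrip("\n")

# Data as numbers: skip the 6 header lines.
data = np.loadtxt(io.StringIO(sample), delimiter="\t", skiprows=6)

out = io.StringIO()
# comments='' stops savetxt from prefixing each header line with '# '.
# fmt='%g' is an illustrative choice to keep the output short.
np.savetxt(out, data / 2, delimiter="\t", header=header, comments="", fmt="%g")
result = out.getvalue()
print(result)
```

Leaving the trailing newline on the header is what produces a blank line between the header and the matrix, and leaving `comments` at its default is what produces the "#" prefix.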
B. M.
  • This is helpful, but there's one problem. It puts the character "#" before the header lines. I edited the main post. – Litwos Jan 16 '16 at 14:53
  • That works, but I get a blank line between header and the matrix. – Litwos Jan 16 '16 at 15:00
  • That solves it, but if I try to run it on the full set of data, it gives the error: 'ValueError: invalid literal for float()' – Litwos Jan 16 '16 at 15:19
  • So look at the end of the file: there is probably a footer, which can be handled in the same way. – B. M. Jan 16 '16 at 17:06
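A rough way to locate whatever triggers `ValueError: invalid literal for float()` is to scan the file line by line and report the lines that do not parse. The helper below is a hypothetical sketch, not part of NumPy; the name `find_unparseable_lines` and the made-up `example` lines are assumptions for illustration.

```python
def find_unparseable_lines(lines, skiprows=6, delimiter=None):
    """Return (line_number, text) pairs for data lines containing a token
    that float() cannot parse. Line numbers are 1-based."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if number <= skiprows or not line.strip():
            continue  # skip the header and blank lines
        try:
            # delimiter=None splits on any whitespace; pass '\t' for tabs only
            for token in line.split(delimiter):
                float(token)
        except ValueError:
            bad.append((number, line.rstrip("\n")))
    return bad


# Made-up example: 6 header lines, two data rows, and a stray footer line.
example = ["h\n"] * 6 + ["1.0 2.0\n", "3.0 4.0\n", "end of file\n"]
print(find_unparseable_lines(example))
```

An open file object can be passed in place of the list, since it also yields lines. If every data line is reported when using `delimiter='\t'` but none with the default, the full file is probably separated by spaces rather than tabs, which would match the different layout mentioned in Edit 3.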