I am trying to export a numpy matrix to ASCII format, but I want to add a header to it first.
My code concept is this:
- Import ASCII file as np.ndarray, say matrix A
- Take the header of A (the first 6 rows). The header contains both float values and characters
- Take the rows of A that are not part of the header (row index 6 through the last row), giving array B
- Apply some functions on B
- Save as ASCII in this form: header(A) + B
I tried the following:
Try 1:
import numpy as np
A = np.genfromtxt('......Input\chm_plot_1.txt', dtype=None, delimiter='\t')
header = A[0:6]
B = A[6:]
mat_out = np.concatenate([header, B])
np.savetxt('........out.txt', mat_out, delimiter='\t')
but this gives the error:
TypeError: Mismatch between array dtype ('|S3973') and format specifier ('%.18e')
Try 2:
import numpy as np
A = np.genfromtxt('......Input\chm_plot_1.txt', dtype=None, delimiter='\t')
header = A[0:6]
headers = np.vstack(header)
head_list = headers.tolist()
head_str = ''.join(str(v) for v in head_list)
B = A[6:]
np.savetxt('\out.txt', B, header = head_str, delimiter='\t')
which gives the same error:
TypeError: Mismatch between array dtype ('|S3973') and format specifier ('%.18e')
Try 3:
import numpy as np
import linecache
A = np.genfromtxt('.............\Input\chm_plot_1.txt', dtype=None, delimiter='\t')
line1 = linecache.getline('.............Input\chm_plot_1.txt', 1)
line2 = linecache.getline('.............Input\chm_plot_1.txt', 2)
line3 = linecache.getline('.............Input\chm_plot_1.txt', 3)
line4 = linecache.getline('.............Input\chm_plot_1.txt', 4)
line5 = linecache.getline('.............Input\chm_plot_1.txt', 5)
line6 = linecache.getline('.............Input\chm_plot_1.txt', 6)
header2 = line1
header2 += line2
header2 += line3
header2 += line4
header2 += line5
header2 += line6
B = A[6:]
np.savetxt('........\out.txt', B , header = header2, delimiter='\t')
which gives me the same error:
TypeError: Mismatch between array dtype ('|S3973') and format specifier ('%.18e')
The A array has the first lines like this:
print A[0:8] #starting from row 6, the rows have 100+ values, header is first 6 rows
['ncols 371' 'nrows 435' 'xllcorner 520298.0053'
'yllcorner 436731.3065' 'cellsize 1' 'NODATA_value -9999'
'16.52002 15.90161 15.96692 20.32922 20.59827 18.28137 18.83533 17.66 .......
'13.16687 17.09497 7.309204 20.83655 19.05078 17.68591 17.88464 ...... ']
Any help would be greatly appreciated! Thanks :)
Edit: I uploaded a sample from the input data (chm_plot_1.txt). The link is below: http://we.tl/mjgBe4QIRM
Edit 2: After following the answer, the problem is that it inserts the "#" character at the beginning of each header line, as in the image below. Also, there is one extra line, the 7th one, that should be removed.
Edit 3: I think the error
ValueError: invalid literal for float()
is due to the different formats of data in the sample vs full files. Although both are .txt, they are arranged differently, as in the picture below.
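For reference, the steps listed at the top (keep the 6 header lines, process the rest, write header + B) could be sketched roughly as below. This is only a sketch under assumptions: the function name rewrite_grid, its parameters, and the '%.7g' output format are made up for illustration, and it assumes every row after the header is purely numeric and separated by tabs or spaces. It reads the header as plain text instead of through genfromtxt, so B is a float array rather than an array of strings, and it passes comments='' to np.savetxt so that no "#" is put in front of the header lines. Stripping the line endings from the header should also avoid an extra blank 7th line.

```python
import numpy as np

def rewrite_grid(in_path, out_path, n_header=6, transform=None):
    """Copy the first n_header lines unchanged and rewrite the numeric block.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Read the header as plain text, dropping the line endings so that
    # savetxt does not end up writing an extra empty line after it.
    with open(in_path) as f:
        header_lines = [next(f).rstrip('\r\n') for _ in range(n_header)]

    # Load only the numeric rows. With no delimiter given, loadtxt splits
    # on any whitespace, so both tabs and spaces are accepted.
    B = np.loadtxt(in_path, skiprows=n_header, ndmin=2)

    # Apply the processing step, if one was supplied.
    if transform is not None:
        B = transform(B)

    # comments='' stops savetxt from prefixing header lines with '# '.
    # fmt='%.7g' is an assumed precision; adjust it to the data.
    np.savetxt(out_path, B, fmt='%.7g', delimiter='\t',
               header='\n'.join(header_lines), comments='')
    return B

# Example call (paths are placeholders):
# rewrite_grid('chm_plot_1.txt', 'out.txt', transform=lambda a: a * 2)
```

If the full files are arranged differently from the sample (as suspected in Edit 3), the n_header value or the whitespace assumption may need to change.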