I am trying to import a text file (.xyz). The file looks something like this:

1 9 1 6 "Thu Feb 13 13:12:30 2014     "
0 0 0 0 0 0
38 38 915 915
"CJE                                                                              "
"2                                      "
"110321-025-01D-1ST                    
0 0 1 .1 73.7972 17 50
1 0 7 1 60 0 0 0 0
0 "                           "
1 0
#
38 38 No Data
39 38 No Data
40 38 No Data
41 38 3
42 38 No Data
43 38 4
44 38 4
45 38 5
#

The text file has a header (the first 11 lines, shown above) that contains some numerical values. The data is arranged in three columns; one of them holds numerical values but also the text "No Data". I would also like to replace "No Data" with the numerical value 0.

I can skip the header, but my main problem is telling the code that there are three columns and that "No Data" means 0. This is what I have used so far:

import numpy as np
data = np.genfromtxt('180228_Test V2-4_0grad.xyz',
                 skip_header=11,
                 skip_footer=1,
                 names=True,

                 dtype=None,
                 delimiter=' ')
print(data)
nekomatic
Lpng

2 Answers

You could pass invalid_raise=False to skip the offending lines, or usecols=np.arange(0, 3); however, I would go with the following approach:

list.txt:

1 9 1 6 "Thu Feb 13 13:12:30 2014     "
0 0 0 0 0 0
38 38 915 915
"CJE                                                                              "
"2                                      "
"110321-025-01D-1ST                    
0 0 1 .1 73.7972 17 50
1 0 7 1 60 0 0 0 0
0 "                           "
1 0
#
38 38 No Data
39 38 No Data
40 38 No Data
41 38 3
42 38 No Data
43 38 4
44 38 4
45 38 5

and then:

logFile = "list.txt"

# opening the file
with open(logFile) as f:

    # read all lines, then drop the first 11 (the header)
    content = f.readlines()[11:]

# you may also want to remove empty lines
content = [l.strip() for l in content if l.strip()]

# for each line in content
for line in content:

     # if the line has No Data in it
     if "No Data" in line:

         # Replacing the No Data with 0 using replace() method
         line = line.replace("No Data", "0")
     print(line)

OUTPUT:

38 38 0
39 38 0
40 38 0
41 38 3
42 38 0
43 38 4
44 38 4
45 38 5

EDIT:

To store them in a 3-column matrix (a list of rows):

_list = []
# for each line in content
for line in content:

     # if the line has No Data in it
     if "No Data" in line:

         # Replacing the No Data with 0 using replace() method
         line = line.replace("No Data", "0")
     # print(line)
     # list comprehension for splitting on the basis of space and appending to the list
     _list.append([e for e in line.split(' ') if e])

print(_list)

OUTPUT:

[['38', '38', '0'], ['39', '38', '0'], ['40', '38', '0'], ['41', '38', '3'],
 ['42', '38', '0'], ['43', '38', '4'], ['44', '38', '4'], ['45', '38', '5']]
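
The rows above still hold strings. A minimal sketch, assuming every field is an integer once "No Data" has been replaced, of turning them into numbers (the names rows, matrix and columns are illustrative and not part of the code above):

```python
# Sample rows in the same form as the output above: three strings per row
rows = [['38', '38', '0'], ['41', '38', '3'], ['45', '38', '5']]

# Convert each field to int so the matrix holds numbers rather than text
matrix = [[int(value) for value in row] for row in rows]

# zip(*matrix) transposes the rows, which gives access to whole columns
columns = [list(col) for col in zip(*matrix)]

print(matrix)      # [[38, 38, 0], [41, 38, 3], [45, 38, 5]]
print(columns[2])  # [0, 3, 5]
```

If a field could be a decimal value, float(value) would be the safer conversion.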

EDIT 2:

To drop the last line of the file (the trailing #), slice with content[:-1]:

logFile = "list.txt"

# opening the file
with open(logFile) as f:

    # read all lines, then drop the first 11 (the header)
    content = f.readlines()[11:]

_list = []
# for each line in content
for line in content[:-1]:

     # if the line has No Data in it
     if "No Data" in line:
         # Replacing the No Data with 0 using replace() method
         line = line.replace("No Data", "0")
     # list comprehension for splitting on the basis of space and appending to the list
     _list.append([e for e in line.strip().split(' ') if e])


print(_list)

OUTPUT:

[['38', '38', '0'], ['39', '38', '0'], ['40', '38', '0'], ['41', '38', '3'],
 ['42', '38', '0'], ['43', '38', '4'], ['44', '38', '4'], ['45', '38', '5']]
DirtyBit
  • Thank you very much. I'm kind of new to Python, so could you explain what each part of the code does? Also, how could I get this data stored in a 3-column matrix? – Lpng Feb 06 '19 at 12:57
  • @Lpng Sure, I'll add some more comments in the code. – DirtyBit Feb 06 '19 at 12:59
  • @Lpng you may accept this answer by clicking on the tick mark beside the answer if this helped, thank you! – DirtyBit Feb 07 '19 at 05:48
  • Sorry, I asked it again because I didn't know if you would be available right now. I have two problems: the program says that '_list' is not defined, and at the end of the file there is a row that contains only the symbol #, which I don't know how to get rid of. – Lpng Feb 07 '19 at 09:54
  • @Lpng you should have come back here and asked for it, I have added those in my second edit. PS. if it all works, you may accept the answer, cheers! – DirtyBit Feb 07 '19 at 11:19

Here is a different approach. First, readlines() reads the whole file and returns a list with one element per line. Then the first 11 lines are discarded. Next, "No Data" is replaced with 0 in each remaining line, and the lines are joined into a single string. Finally, a numpy array is built from this string and reshaped into three columns.

import numpy as np

#Open the file and read the lines as a list of lines
with open('/home/we4sea/PycharmProjects/Noonreport-processing/GUI/test.txt','r') as f:
    file = f.readlines()

#Skip the first 11 lines
file = file[11:]

#Create new list where the replaced lines are placed
replaced = []

#Replace "No Data" with 0
for line in file:
    replaced.append(line.replace('No Data', '0'))

#Concatenate list to a single string
file = ''.join(replaced)

#Create numpy array from it and reshape to the correct format
data = np.fromstring(file, sep=' ').reshape(-1,3)

#Print the data
print(data)

Output:

[[38. 38.  0.]
 [39. 38.  0.]
 [40. 38.  0.]
 [41. 38.  3.]
 [42. 38.  0.]
 [43. 38.  4.]
 [44. 38.  4.]
 [45. 38.  5.]]
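
If the file also ends with a lone # row, one possible variant is to hand the cleaned lines to np.loadtxt instead of np.fromstring. This is only a sketch under stated assumptions: sample_lines is a made-up miniature of the file with a 2-line header (the real file has 11), and header_lines is an illustrative name.

```python
import numpy as np

# Hypothetical miniature of the file: 2 header lines, data rows, trailing "#"
sample_lines = [
    '1 9 1 6 "Thu Feb 13 13:12:30 2014     "\n',
    '#\n',
    '38 38 No Data\n',
    '41 38 3\n',
    '45 38 5\n',
    '#\n',
]
header_lines = 2  # would be 11 for the file in the question

# Replace the placeholder first so every data row has three numeric fields
cleaned = [line.replace('No Data', '0') for line in sample_lines[header_lines:]]

# np.loadtxt accepts a list of strings; with its default comments='#' the
# trailing "#" row is treated as a comment and skipped
data = np.loadtxt(cleaned, ndmin=2)

print(data.shape)  # (3, 3)
```

Because the # row is skipped as a comment, the number of values stays a multiple of three and no reshape is needed.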
  • Hi, thank you for your answer. I have only one more problem: the last row is a #. Sorry that I didn't put it in the question before, but now it is there. I need to get rid of that row before reshaping because otherwise it doesn't work. How could I do that? – Lpng Feb 07 '19 at 10:29
  • When removing the first 11 lines, you could also remove the last row using a similar operation. After the skip first 11 lines part add: file = file[:-1] – Christiaan van Vliet Feb 08 '19 at 08:21