I am attempting to read in a large data file and convert it to a format that my other scripts can better handle.
Each data file has a series of header lines followed by two columns containing the relevant data points. This is followed by another series of header lines (in the same column) and then the next set of data points. For example:
I need to sort through the lines and write them to a file made up of multiple columns. The first column of every data set is the same (the frequency), so what I'm trying to get should look like the following:
I'm new to Python and have yet to find any even half-successful way of managing this. I've tried a basic if statement:
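import re

def LoadData(filename):
    # Read the whole file into a list of lines
    with open(filename, 'r') as Datafile:
        data = Datafile.readlines()
    # Define empty lists to read the values into
    a1 = []
    index = 1
    # Start at line index 14, i.e. skip the first 14 lines
    for line in range(14, len(data)):
        w = data[line].split()
        if type(w[0]) == float:
            a1.append(w[index])
        # A token containing "THz" is taken to mark the next data set
        if re.findall(r'[\w.]THz', w[0]):
            index = index + 1
    return a1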
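A note on the check in the attempt above: `str.split()` always returns strings, so `type(w[0]) == float` is never true and nothing is ever appended to `a1`. Below is a minimal sketch of one alternative, assuming a data row is any row whose first token can be parsed as a number. `is_number` is a hypothetical helper name, not part of the original script.

```python
def is_number(token):
    """Return True if the string token can be parsed as a float."""
    try:
        float(token)
    except ValueError:
        return False
    return True


# split() yields strings, so a type comparison against float always fails
print(type("1.5") == float)   # False
print(is_number("1.5"))       # True
print(is_number("0.1THz"))    # False
```

Be aware that `float()` also accepts strings such as `"nan"` and `"inf"`, so this only works if no header line starts with such a token.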
But since I can't define a list to be multidimensional, I don't know how to make it assign the next series of data values to another column. Defining a NumPy array doesn't help either, since I would need to know its exact dimensions up front.
I'm certain there must be a relatively straightforward way to do this, but I've failed to find it. I'd appreciate any help!
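For what it's worth, a Python list can hold other lists, so the dimensions do not have to be known in advance. The following is only a sketch under stated assumptions: every data row starts with two numeric tokens, every header line does not, and all data sets share the same frequency column. The sample text and the names `load_blocks` and `to_rows` are hypothetical, since the real file layout is not shown here.

```python
def load_blocks(lines):
    """Group consecutive numeric rows into blocks.

    A new block starts at the first numeric row after any
    non-numeric (header or blank) line.
    """
    blocks = []
    previous_was_data = False
    for line in lines:
        parts = line.split()
        try:
            row = (float(parts[0]), float(parts[1]))
        except (ValueError, IndexError):
            # Header or blank line: the next numeric row opens a new block
            previous_was_data = False
            continue
        if not previous_was_data:
            blocks.append([])
        blocks[-1].append(row)
        previous_was_data = True
    return blocks


def to_rows(blocks):
    """Build rows of [frequency, value_1, value_2, ...].

    Assumes every block shares the first block's frequency column;
    zip() silently truncates to the shortest block.
    """
    freqs = [x for x, _ in blocks[0]]
    value_columns = [[y for _, y in block] for block in blocks]
    return [list(row) for row in zip(freqs, *value_columns)]


# Hypothetical sample standing in for the real file contents
sample = """\
some header text
Parameter = 0.1THz
1.0 10.0
2.0 20.0
some header text
Parameter = 0.2THz
1.0 11.0
2.0 21.0
""".splitlines()

blocks = load_blocks(sample)
rows = to_rows(blocks)
for row in rows:
    print(" ".join(str(value) for value in row))
```

Once the nested list is complete, its size is known, so it could be converted with `numpy.array(rows)` at that point if an array is needed.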
This is the data opened with notepad as requested in a comment: