There is a similar question to mine, but the data has a different structure and I run into errors. I have multiple `.dat` files that contain tables for different arbitrary times `t=1,3,9,10,12`, etc. The tables in the different `.dat` files have the same columns `M_star`, `M_planet`, `separation`, and `M_star` can be viewed as an index in steps of 0.5. Nevertheless, the length of the tables and the values of `M_star` vary from file to file, e.g. for time `t=1` I have
M_star M_planet separation
10.0 0.022 7.11
10.5 0.019 2.30
11.0 0.008 14.01
while for `t=3` I have
M_star M_planet separation
9.5 0.308 1.32
10.0 0.522 4.18
10.5 0.019 3.40
11.0 0.338 0.91
11.5 0.150 1.20
What I would like to do is to load all the `.dat` files into an xarray Dataset (at least I think this would be useful), so that I can access the data in the columns `M_planet` and `separation` by providing precise values for `t` and `M_star`, e.g. I would like to do something like `ds.sel(t=9, M_star=10.5)['M_planet']` to get the value of `M_planet` at the given `t` and `M_star` coordinates. What I have tried so far, unsuccessfully, is:
fnames = glob('table_t=*.dat')
fnames.sort()
kw = dict(delim_whitespace=True, names=['M_star', 'M_planet', 'separation'], skiprows=1)
# first I load all the tables into a list of dataframes
dfs = [pd.read_csv(fname, **kw) for fname in fnames]
# then I add the time as a column to each dataframe; all t-entries are the same within a dataframe
dfs2 = [df_i.assign(t=t_i) for df_i, t_i in zip(dfs, [1, 2, 3, 4, 9, 10, 12])]
# I try to make an xarray Dataset, but I run into an error
d = xr.concat([df_i.to_xarray() for df_i in dfs2], dim='t')
The last line throws an error: `t already exists as coordinate or variable name`. How can I load my `.dat` files into xarray and make `t` and `M_star` the dimensions/coordinates? Thanks!