Extracting your lists from your txt file
I would first extract your lists from your text file into some sort of dictionary structure, something along the lines of:
d = {}
with open("Input2010_5a.txt", "r") as file:
    counter = 0
    for line in file:
        # split the tab-separated fields and convert each one to float
        date, long, lat, depth, temp, sal = line.split("\t")
        line_data = []
        line_data.append(float(date))
        line_data.append(float(long))
        line_data.append(float(lat))
        line_data.append(float(depth))
        line_data.append(float(temp))
        line_data.append(float(sal))
        d['list' + str(counter)] = line_data
        counter += 1
And d will be a dictionary looking something like this:

{'list0': [2010.36, 23.2628, 59.7768, 1.0, 4.1, 6.04],
 'list1': [more, list, values, here],
 ...}
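The same dictionary can also be built more compactly with a dict comprehension; a minimal sketch, assuming the same file name and tab-separated columns:

with open("Input2010_5a.txt", "r") as file:
    # one 'listN' entry per line, with every field cast to float
    d = {'list' + str(i): [float(x) for x in line.split("\t")]
         for i, line in enumerate(file)}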
Covariance matrix method 1: numpy
You can stack the 41 lists contained in your dictionary d and then use np.cov.
import numpy as np

# stack the lists into a 2D array: one row per list
all_ls = np.vstack(list(d.values()))
# np.cov treats each row as one variable by default
cov_mat = np.cov(all_ls)
This will return your covariance matrix.
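Note that np.cov treats each row as one variable by default, so the matrix above describes the covariance between your 41 lists. If you instead wanted the covariance between the six measured quantities (date, long, lat, depth, temp, sal), you could pass rowvar=False; a minimal sketch:

# treat columns rather than rows as variables -> 6x6 matrix
var_cov = np.cov(all_ls, rowvar=False)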
Covariance matrix method 2: pandas
You can also use pandas.DataFrame.cov to get the same covariance matrix, if you prefer to have it in pandas tabular format for later:
import pandas as pd

# each 'listN' entry becomes a column of the DataFrame
df = pd.DataFrame(d)
# DataFrame.cov() computes the covariance between columns
cov_mat = df.cov()
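Since pd.DataFrame(d) places each list in its own column, df.cov() again gives the list-by-list covariance. For the covariance between the six measured quantities instead, transpose first; a minimal sketch, where the column labels are my own names for your six fields:

# transpose so each measured quantity becomes a column
df_vars = df.T
df_vars.columns = ["date", "long", "lat", "depth", "temp", "sal"]  # assumed labels
var_cov = df_vars.cov()  # 6x6 covariance between the quantities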
Minimal example
If you had a tab-separated txt file that looked like:
2010.36 23.2628 59.7768 1.0 4.1 6.04
2018.36 29.2 84 2.0 8.1 6.24
2022.36 33.8 99 3.0 16.2 6.5
then method 1 would give you:
array([[ 661506.97804414, 662002.706604 , 661506.6953528 ],
[ 662002.706604 , 662576.37510667, 662123.94745333],
[ 661506.6953528 , 662123.94745333, 661701.07526667]])
and method 2 would give you:
list0 list1 list2
list0 661506.978044 662002.706604 661506.695353
list1 662002.706604 662576.375107 662123.947453
list2 661506.695353 662123.947453 661701.075267
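As a quick sanity check, both methods agree; a minimal sketch, assuming d holds the three example lists above:

import numpy as np
import pandas as pd

# the pandas result, as a plain array, matches the numpy result
assert np.allclose(np.cov(np.vstack(list(d.values()))),
                   pd.DataFrame(d).cov().to_numpy())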