I am working on importing a rather large HDF5 file of Illustris galaxy simulation data using h5py. The file is linked here if anyone wants to look at it; it is 1.96 GB.
https://drive.google.com/file/d/0B1Kj475OJBnuaFBIS2FhTFpvNkk/view?usp=sharing
I want to use h5py and numpy to display tables of data, use numpy.sum to sum columns and output vectors, and extract only the galaxies I want. The data is a 2D table of size 16 x 56 for each galaxy. It is a large dataset (almost 2 GB) that contains data for millions of galaxies, with over a million rows.
Along the dimension of size 56, each bin represents an age range. Summing along the dimension of size 16 gives a 1-dimensional vector of size 56 for each galaxy, which represents the stellar mass (in units of 10^10 M_sun) formed within each age bin.
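A minimal sketch of the summing step, under assumptions that would need checking against the actual file: that the dataset is stored with shape (n_galaxies, 16, 56), and that it is read in chunks so the whole 2 GB does not have to sit in memory at once. The names sum_over_16, load_summed, chunk_rows and dataset_name are placeholders, not anything taken from the Illustris file. If the table is instead stored flattened, a reshape would be needed first and the axis order would have to be confirmed.

```python
import numpy as np

def sum_over_16(table, chunk_rows=100_000):
    """Sum an array of shape (n_galaxies, 16, 56) over the size-16 axis.

    `table` can be a numpy array or an h5py dataset; both support slicing.
    The (n_galaxies, 16, 56) layout is an assumption about the file.
    """
    shape = tuple(table.shape)
    if len(shape) != 3 or shape[1:] != (16, 56):
        raise ValueError(f"expected shape (n, 16, 56), got {shape}")
    n = shape[0]
    out = np.empty((n, 56), dtype=np.float64)
    for start in range(0, n, chunk_rows):
        stop = min(start + chunk_rows, n)
        # For an h5py dataset, this slice reads only these rows from disk.
        block = np.asarray(table[start:stop])
        out[start:stop] = block.sum(axis=1)  # axis 1 is the size-16 axis
    return out

def load_summed(path, dataset_name):
    """Open the file read-only and sum one dataset (dataset_name is a placeholder)."""
    import h5py  # imported here so sum_over_16 can be tried without h5py
    with h5py.File(path, "r") as f:
        print(list(f.keys()))          # top-level names stored in the file
        dset = f[dataset_name]
        print(dset.shape, dset.dtype)  # check the layout before summing
        return sum_over_16(dset)

# Small synthetic check: three fake galaxies filled with ones.
demo = np.ones((3, 16, 56))
sfh = sum_over_16(demo, chunk_rows=2)
print(sfh.shape)  # (3, 56)
```

Printing the keys and the dataset shape first is a cheap way to confirm the layout before committing to a particular axis.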
I am aiming to use python and h5py to:
1- use numpy to view the array of data and sum along the dimension of size 16 to get the 56-element vector for each galaxy, displayed in Python
2- have numpy eliminate galaxies with steady star formation rates and extract only those that had a sudden starburst between 1 Gyr and 2 Gyr ago and then stopped. Is there a way to do this? It would eliminate a huge number of galaxies that I would otherwise have to look through. This relates to E+A galaxies, which experienced a sudden starburst and then stopped forming stars.
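One possible way to express the second step is a boolean mask over the summed (n_galaxies, 56) array. The sketch below is only an illustration: the selection criteria (a minimum fraction of the total mass formed in the burst window and a maximum fraction formed more recently) and both threshold values are arbitrary placeholders, not an established E+A definition. It also assumes the 56 columns are ordered youngest to oldest as in the bin list, so that columns 24 to 26 (0-based) cover 1.125 to 1.875 Gyr and columns 0 to 22 cover ages below 0.95 Gyr.

```python
import numpy as np

# Column ranges, assuming columns follow the listed bin order (youngest first).
BURST = slice(24, 27)   # bins 1.125 - 1.875 Gyr, fully inside 1 - 2 Gyr
RECENT = slice(0, 23)   # bins younger than 0.95 Gyr

def select_post_starburst(sfh, min_burst_frac=0.10, max_recent_frac=0.01):
    """Return row indices of galaxies with a burst 1-2 Gyr ago and little since.

    sfh: array of shape (n_galaxies, 56), mass formed per age bin.
    Both thresholds are placeholder values to be tuned.
    """
    sfh = np.asarray(sfh, dtype=np.float64)
    total = sfh.sum(axis=1)
    safe_total = np.where(total > 0, total, 1.0)  # avoid dividing by zero
    burst_frac = sfh[:, BURST].sum(axis=1) / safe_total
    recent_frac = sfh[:, RECENT].sum(axis=1) / safe_total
    mask = (
        (total > 0)
        & (burst_frac >= min_burst_frac)
        & (recent_frac <= max_recent_frac)
    )
    return np.flatnonzero(mask)

# Synthetic example: a steady galaxy, a burst-then-quiet galaxy,
# and a galaxy with the same burst plus very recent star formation.
steady = np.ones(56)
burst = np.zeros(56)
burst[25] = 5.0      # burst in the 1.375 - 1.625 Gyr bin
burst[30:] = 1.0     # older background population
rejuvenated = burst.copy()
rejuvenated[0] = 5.0
picked = select_post_starburst(np.vstack([steady, burst, rejuvenated]))
print(picked)  # [1]
```

Because the bins have different widths, mass fractions are not the same thing as star formation rates; dividing each column by its bin width before comparing would be a reasonable variation.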
The age bins of the vectors, once they are summed through numpy, are as follows (in Gyr):
<0.005,
0.005 - 0.015,
0.015 - 0.025,
0.025 - 0.035,
0.035 - 0.045,
0.045 - 0.055,
0.055 - 0.065,
0.065 - 0.075,
0.075 - 0.085,
0.085 - 0.095,
0.095 - 0.125,
0.125 - 0.175,
0.175 - 0.225,
0.225 - 0.275,
0.275 - 0.325,
0.325 - 0.375,
0.375 - 0.425,
0.425 - 0.475,
0.475 - 0.55,
0.55 - 0.65,
0.65 - 0.75,
0.75 - 0.85,
0.85 - 0.95,
0.95 - 1.125,
1.125 - 1.375,
1.375 - 1.625,
1.625 - 1.875,
1.875 - 2.125,
2.125 - 2.375,
2.375 - 2.625,
2.625 - 2.875,
2.875 - 3.125,
3.125 - 3.375,
3.375 - 3.625,
3.625 - 3.875,
3.875 - 4.25,
4.25 - 4.75,
4.75 - 5.25,
5.25 - 5.75,
5.75 - 6.25,
6.25 - 6.75,
6.75 - 7.25,
7.25 - 7.75,
7.75 - 8.25,
8.25 - 8.75,
8.75 - 9.25,
9.25 - 9.75,
9.75 - 10.25,
10.25 - 10.75,
10.75 - 11.25,
11.25 - 11.75,
11.75 - 12.25,
12.25 - 12.75,
12.75 - 13.25,
13.25 - 13.75,
>13.75.
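The bin list can be turned into an array of edges, which makes it possible to look up which columns fall in a given age range instead of counting by hand. This is a sketch with two assumptions: the first bin is taken to start at 0 and the last bin is left open-ended (upper edge set to infinity). The helper names bins_inside and bins_overlapping are made up for illustration.

```python
import numpy as np

# Finite bin edges in Gyr, built from the regular runs in the list.
finite_edges = np.concatenate([
    np.linspace(0.005, 0.095, 10),   # 0.01 Gyr steps
    np.linspace(0.125, 0.475, 8),    # 0.05 Gyr steps
    np.linspace(0.55, 0.95, 5),      # 0.1 Gyr steps
    np.linspace(1.125, 3.875, 12),   # 0.25 Gyr steps
    np.linspace(4.25, 13.75, 20),    # 0.5 Gyr steps
])
# Assumed outer edges: 0 for the "<0.005" bin, infinity for the ">13.75" bin.
edges = np.concatenate([[0.0], finite_edges, [np.inf]])

def bins_inside(lo, hi):
    """0-based indices of bins lying entirely within [lo, hi] Gyr."""
    return np.flatnonzero((edges[:-1] >= lo) & (edges[1:] <= hi))

def bins_overlapping(lo, hi):
    """0-based indices of bins that overlap the interval (lo, hi) Gyr at all."""
    return np.flatnonzero((edges[:-1] < hi) & (edges[1:] > lo))

widths = np.diff(edges)          # bin widths in Gyr (last one is infinite)

print(len(edges) - 1)            # 56
print(bins_inside(1.0, 2.0))     # [24 25 26]
print(bins_overlapping(1.0, 2.0))  # [23 24 25 26 27]
```

The 0.95 - 1.125 and 1.875 - 2.125 bins straddle the 1 Gyr and 2 Gyr boundaries, so whether to include them is a choice that depends on how strict the selection should be.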
I know how to interpret the data, but since I'm new to working with HDF5 files and to coding in general, I'm having trouble figuring out the specific commands to get h5py and numpy to sum along the dimensions I want, display the vectors, etc.
Is there anyone with experience that knows how to do this?
Thank you,
Winonah