I currently have seismic data for 175 events, with 3 traces per event (the traces are numpy arrays of seismic data). For each of those 175 events I have a classification label saying whether it is an earthquake or not. I'm looking to format my data into numpy arrays for modelling. I've tried placing the traces into a dataframe of numpy arrays with one column per trace, so the columns would be 'Trace one', 'Trace two' and 'Trace three'. This did not work. I have tried lots of different ways of arranging the data for use with Keras.
I'm now looking to create a numpy matrix for the data to go into and to then use for modelling.
I had thought that the shape might be (175, 3, 7501), as (number of events, number of traces, number of samples per trace). However, when I iterate through the events and try to add the three traces to the numpy matrix, it fails. I'm used to using dataframes, not numpy, for feeding data to Keras.
newrow = np.array([[trace_copy_1], [trace_copy_2], [trace_copy_3]])
data = np.vstack([data, newrow])
The data shape is (175, 3, 7510). The newrow shape is (3, 1, 7510), and this does not allow me to add newrow to data.
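For reference, the shape mismatch can be reproduced with synthetic arrays. This is only a minimal sketch: the traces here are made-up constant arrays, `data` is assumed to start as an empty 3-D array, and `bad_row` / `new_row` are illustrative names that are not in my real code. It suggests the extra axis comes from wrapping each trace in its own list.

```python
import numpy as np

n_samples = 7510  # samples per trace, matching the shapes reported above

# Synthetic stand-ins for the three trace arrays (not real seismic data)
trace_copy_1 = np.zeros(n_samples)
trace_copy_2 = np.ones(n_samples)
trace_copy_3 = np.full(n_samples, 2.0)

# Assumption: data starts empty, with the trace and sample axes already fixed
data = np.empty((0, 3, n_samples))

# Wrapping each trace in its own list adds an axis: shape (3, 1, n_samples)
bad_row = np.array([[trace_copy_1], [trace_copy_2], [trace_copy_3]])

# Stacking the 1-D traces gives (3, n_samples); a new leading axis
# turns that into a single event of shape (1, 3, n_samples)
new_row = np.stack([trace_copy_1, trace_copy_2, trace_copy_3])[np.newaxis, ...]

# vstack joins along axis 0, so the remaining axes (3, n_samples) must match
data = np.vstack([data, new_row])
```

With these stand-ins, `np.vstack([data, bad_row])` raises a `ValueError` because the second axis is 1 instead of 3, while `new_row` can be appended.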
I receive the data as obspy streams, and each stream holds 3 trace objects. Each trace object holds its data as a numpy array, so I have to access those arrays and append them to a dataframe for modelling, since I can't feed a stream or trace object to a Keras model.
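The layout I am aiming for could be sketched roughly as follows. This is a hedged illustration, not working code from my project: `streams_to_array` is a hypothetical helper, the streams are faked with `SimpleNamespace` objects that only mimic the `.data` attribute of an obspy trace, and it assumes every trace has at least `n_samples` samples.

```python
from types import SimpleNamespace

import numpy as np


def streams_to_array(streams, n_samples):
    """Return an array of shape (n_events, n_traces, n_samples).

    Hypothetical helper: each stream is iterated trace by trace and
    each trace is trimmed to its first n_samples samples.
    """
    events = []
    for stream in streams:
        # Each trace exposes its samples as a 1-D NumPy array via .data
        traces = [
            np.asarray(trace.data[:n_samples], dtype=np.float32)
            for trace in stream
        ]
        # np.stack raises ValueError if a trace is shorter than n_samples
        events.append(np.stack(traces))  # (n_traces, n_samples)
    return np.stack(events)


# Synthetic stand-ins for 175 streams of 3 traces each (random noise)
rng = np.random.default_rng(0)
fake_streams = [
    [SimpleNamespace(data=rng.normal(size=7510)) for _ in range(3)]
    for _ in range(175)
]

X = streams_to_array(fake_streams, n_samples=7510)

# Keras 1-D layers default to channels-last input: (batch, steps, channels)
X_channels_last = np.transpose(X, (0, 2, 1))
```

Collecting the per-event arrays in a list and stacking once at the end avoids growing a numpy array inside the loop. Whether the traces axis should come last depends on the layers used in the model.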