I got a waveform object, define as following:
class wfm:
"""Class defining a waveform characterized by:
- A name
- An electrode configuration
- An amplitude (mA)
- A pulse width (microseconds)"""
def __init__(self, name, config, amp, width=300):
self.name = name
self.config = config
self.amp = amp
self.width = width
def __eq__(self, other):
return type(other) is self.__class__ and other.name == self.name and other.config == self.config and other.amp == self.amp and other.width == self.width
def __ne__(self, other):
return not self.__eq__(other)
Through parsing, I get a list called waveforms with 770 instance of wfm in it. There is a lot of duplicate, and I need to delete them.
My idea was to get the ID of equivalent object, store the largest ID in a list, and then loop on all the waveforms from the end while popping out each duplicate.
Code:
duplicate_ID = []
for i in range(len(waveforms)):
for j in range(i+1, len(waveforms)):
if waveforms[i] == waveforms[j]:
duplicate_ID.append(waveforms.index(waveforms[j]))
print ('{} eq {}'.format(i, j))
duplicate_ID = list(set(duplicate_ID)) # If I don't do that; 17k IDs
Turns out (thx to the print) that I have duplicates that don't appear into the ID list, for instance 750 is a duplicate of 763 (print says it; test too) and yet none of this 2 IDs appears in my duplicate list.
I'm quite sure there is a better solution that this method (which doesn't yet work), and I would be glad to hear it. Thanks for the help!
EDIT: More complicated scenario
I've got a more complicated scenario. I got 2 classes, wfm (see above) and stim:
class stim:
"""Class defining the waveform used for a stimultion by:
- Duration (milliseconds)
- Frequence Hz)
- Pattern of waveforms"""
def __init__(self, dur, f, pattern):
self.duration = dur
self.fq = f
self.pattern = pattern
def __eq__(self, other):
return type(other) is self.__class__ and other.duration == self.duration and other.fq == self.fq and other.pattern == self.pattern
def __ne__(self, other):
return not self.__eq__(other)
I parse my files to fill a dict: paradigm. It looks like that:
paradigm[file name STR] = (list of waveforms, list of stimulations)
# example:
paradigm('myfile.xml') = ([wfm1, ..., wfm10], [stim1, ..., stim5])
Once again, I want to delete the duplicates, i.e. I want to keep only the data where:
- Waveforms are the same
- And stim is the same
Example:
file1 has 10 waveforms and file2 has the same 10 waveforms.
file1 has stim1 and stim2 ; file2 has stim3, sitm 4 and stim 5.
stim1 and stim3 are the same; so since the waveforms are also the same, I want to keep:
file1: 10 waveforms and stim1 and stim2
file2: 10 waveforms and stim 4 and stim5
The correlation is kinda messy in my head, so I got a few difficulties finding the right storage solution for wave forms and stimulation in order to compare them easly. If you got any idea, I'd be glad to hear it. Thanks!