I have prepared a dataset for recognising a certain type of object (about 2240 negative examples and only about 90 positive examples). However, after computing 10 features for each object in the dataset, the number of unique training instances dropped to about 130 negative and about 30 positive.
Since the identical training instances actually represent different objects, can I say that this duplication carries relevant information (e.g. the distribution of object feature values) that may be useful in one way or another?
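
To make the question concrete, here is a minimal sketch of one way the duplication could be quantified: collapse identical (feature vector, label) pairs and keep the multiplicity of each as a count. The toy data and the helper name `collapse_duplicates` are hypothetical and not taken from the actual dataset; the sketch also assumes the features are discrete enough for exact equality to be meaningful.

```python
import numpy as np


def collapse_duplicates(X, y):
    """Return unique (features, label) rows and how often each one occurred.

    Hypothetical helper for illustration only.
    """
    # Append the label as an extra column so that identical feature vectors
    # with different labels are kept as separate rows.
    Xy = np.column_stack([X, y])
    unique_rows, counts = np.unique(Xy, axis=0, return_counts=True)
    return unique_rows[:, :-1], unique_rows[:, -1].astype(int), counts


# Made-up toy data: 6 objects, 3 features, several identical feature vectors.
X = np.array([
    [0, 1, 2],
    [0, 1, 2],
    [0, 1, 2],
    [3, 1, 0],
    [3, 1, 0],
    [5, 5, 5],
])
y = np.array([0, 0, 0, 0, 1, 1])

X_u, y_u, w = collapse_duplicates(X, y)

# Each unique instance together with its multiplicity.
for features, label, count in zip(X_u, y_u, w):
    print(features, label, count)
```

The counts in `w` are an empirical estimate of how often each feature vector occurs per class, which is the distributional information the question refers to. If the duplicates are dropped, such counts could still be retained as per-instance weights, for example with learners that accept a sample-weight argument.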