What would be the fastest way to check for existing entries, and prevent adding duplicates, in a list like this:
{'contract_id': 514750, 'value_at': datetime.datetime(2021, 12, 4, 21, 59, 13, 586785), 'value': Decimal('42362.34'), 'metric': 'LastPrice'}
{'contract_id': 514750, 'value_at': datetime.datetime(2021, 12, 4, 21, 59, 17, 827149), 'value': Decimal('42362.35'), 'metric': 'LastPrice'}
{'contract_id': 514750, 'value_at': datetime.datetime(2021, 12, 4, 21, 59, 24, 291007), 'value': Decimal('42362.35'), 'metric': 'LastPrice'}
{'contract_id': 514750, 'value_at': datetime.datetime(2021, 12, 4, 21, 59, 33, 767991), 'value': Decimal('42362.45'), 'metric': 'LastPrice'}
{'contract_id': 514750, 'value_at': datetime.datetime(2021, 12, 4, 21, 59, 35, 880753), 'value': Decimal('42362.60'), 'metric': 'LastPrice'}
{'contract_id': 514750, 'value_at': datetime.datetime(2021, 12, 4, 21, 59, 40, 135887), 'value': Decimal('42362.60'), 'metric': 'LastPrice'}
{'contract_id': 514750, 'value_at': datetime.datetime(2021, 12, 4, 21, 59, 44, 481802), 'value': Decimal('42362.75'), 'metric': 'LastPrice'}
{'contract_id': 514750, 'value_at': datetime.datetime(2021, 12, 4, 21, 59, 46, 618369), 'value': Decimal('42362.95'), 'metric': 'LastPrice'}
{'contract_id': 514750, 'value_at': datetime.datetime(2021, 12, 4, 21, 59, 50, 894044), 'value': Decimal('42362.98'), 'metric': 'LastPrice'}
I was thinking about checking only the last 25 items of the list for an existing dictionary, since the list is updated every second (or faster). Because of DB write lag, a query may not return everything that was registered within an iteration, so on the next query I need to 'roll back' a few seconds (which may produce duplicates). Scanning the whole list on every iteration (it may hold an entry for every third of a second since midnight) could be a performance killer. Or are there more sophisticated methods?
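To make the idea concrete, here is a minimal sketch of the tail check described above, next to a set-based alternative. The function names, the `TAIL` constant and the choice of key fields are all hypothetical; in particular, it is only an assumption that `contract_id`, `metric` and `value_at` together identify a row.

```python
import datetime
from decimal import Decimal

TAIL = 25  # hypothetical window size, taken from the "last 25 items" idea


def append_if_new(rows, row, tail=TAIL):
    """Append row unless an equal dict is among the last `tail` items."""
    if row in rows[-tail:]:  # dict equality compares all keys and values
        return False
    rows.append(row)
    return True


def row_key(row):
    # Assumption: these three fields identify a row uniquely.
    return (row["contract_id"], row["metric"], row["value_at"])


def append_if_unseen(rows, seen, row):
    """Set-based alternative: average O(1) lookup, whatever the list length."""
    key = row_key(row)
    if key in seen:
        return False
    seen.add(key)
    rows.append(row)
    return True
```

The tail check costs at most 25 dict comparisons per insert but misses a duplicate that lies further back than the window. The set costs one extra tuple per row in memory and catches duplicates anywhere in the list.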
I have read that appending to a DataFrame is slow (and I don't think calling the drop_duplicates method every time is a solution either), so collecting the rows in a list and passing that list to the DataFrame constructor seems the better approach. Here are some benchmarks/examples: "Remove duplicate dict in list in Python" and "python remove duplicate dictionaries from a list". But these mostly target the list as a whole (not just a part of it).
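For reference, a small sketch of the "list first, DataFrame once" approach, assuming pandas is installed. The sample rows reuse values from the data above, with one row repeated on purpose, and the `subset` columns are an assumed key rather than something confirmed by the data.

```python
import datetime
from decimal import Decimal

import pandas as pd

rows = [
    {"contract_id": 514750, "value_at": datetime.datetime(2021, 12, 4, 21, 59, 13, 586785),
     "value": Decimal("42362.34"), "metric": "LastPrice"},
    {"contract_id": 514750, "value_at": datetime.datetime(2021, 12, 4, 21, 59, 17, 827149),
     "value": Decimal("42362.35"), "metric": "LastPrice"},
    # The same row again, as an overlapping re-query might deliver it.
    {"contract_id": 514750, "value_at": datetime.datetime(2021, 12, 4, 21, 59, 17, 827149),
     "value": Decimal("42362.35"), "metric": "LastPrice"},
]

# Build the frame once from the finished list instead of appending row by row.
df = pd.DataFrame(rows)

# A single drop_duplicates call at the end; the subset columns are an assumed key.
df = df.drop_duplicates(subset=["contract_id", "metric", "value_at"]).reset_index(drop=True)
```

This removes duplicates in one pass when the frame is built, so nothing has to be checked per insert, but the list itself still holds the duplicates until then.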