0

I want to map a timestamp t and an identifier id to a certain state of an object. I can do so by mapping a tuple (t,id) -> state_of_id_in_t. I can use this mapping to access one specific (t,id) combination.

However, sometimes I want to know all states (with matching timestamps t) of a specific id (i.e. id -> a set of (t, state_of_id_in_t)) and sometimes all states (with matching identifiers id) of a specific timestamp t (i.e. t -> a set of (id, state_of_id_in_t)). The problem is that I can't just put all of these in a single large matrix and do linear search based on what I want. The amount of (t,id) tuples for which I have states is very large (1m +) and very sparse (some timestamps have many states, others none etc.). How can I make such a dict, which can deal with accessing its contents by partial keys?

I created two distinct dicts dict_by_time an dict_by_id, which are dicts of dicts. dict_by_time maps a timestamp t to a dict of ids, which each point to a state. Similiarly, dict_by_id maps an id to a dict of timestamps, which each point to a state. This way I can access a state or a set of states however I like. Notice that the 'leafs' of both dicts (dict_by_time an dict_by_id) point to the same objects, so its just the way I access the states that's different, the states themselves however are the same python objects.

dict_by_time = {'t_1': {'id_1': 'some_state_object_1',
                        'id_2': 'some_state_object_2'},
                't_2': {'id_1': 'some_state_object_3',
                        'id_2': 'some_state_object_4'}


dict_by_id = {'id_1': {'t_1': 'some_state_object_1',
                       't_2': 'some_state_object_3'},
              'id_2': {'t_1': 'some_state_object_2',
                       't_2': 'some_state_object_4'}

Again, notice the leafs are shared across both dicts.

I don't think it is good to do it using two dicts, simply because maintaining both of them when adding new timestamps or identifiers result in double work and could easily lead to inconsistencies when I do something wrong. Is there a better way to solve this? Complexity is very important, which is why I can't just do manual searching and need to use some sort of HashMap magic.

Aydo
  • 415
  • 4
  • 12
  • 1
    Sounds like you need a database. This would be basic SQL. – ibonyun Nov 07 '19 at 05:53
  • Yes, I am basically looking for a way to 'index' two columns. Unfortunately, SQL is out of scope. These calculations are done in memory for real-time visualization. – Aydo Nov 07 '19 at 06:03

1 Answers1

1

You can always trade add complexity with lookup complexity. Instead of using a single dict, you can create a Class with an add method and a lookup method. Internally, you can keep track of the data using 3 different dictionaries. One uses the (t,id) tuple as key, one uses t as the key and one uses id as the key. Depending on the arguments given to lookup, you can return the result from one of the dictionaries.

hevangel
  • 95
  • 5