I am currently working with Q-learning and I have a dictionary Q[state, action] where each state can be anything (a string, a number, a list, ...) depending on the application. Each state has either 3 or 4 possible actions. For each state I need to find the action with the highest Q value. The problem is that I do not know how to get all the possible actions for a given state directly from a dictionary whose keys are (state, action) pairs, so I tried a for loop over every key:

# y is a dict-like object that provides argMax() (shown below)
for statex, actionx in self.array:
    if statex == state and actionx is not None:
        y[actionx] = self.array[statex, actionx]
y.argMax()

where argMax() is defined as:

def argMax(self):
    """
    Returns the key with the highest value.
    """
    if len(self) == 0:
        return None
    # Copy the items into a list so they can be indexed (items() is a view in Python 3).
    items = list(self.items())
    values = [value for _, value in items]
    maxIndex = values.index(max(values))
    return items[maxIndex][0]

The problem is that this takes too long to compute. Any ideas how I can make it faster, perhaps by eliminating the for loop?

Juan Leni
  • Try using `iter(self.array)` or `iter(self.items())` – rassa45 Oct 31 '15 at 17:03
  • Get the list of tuples with key as first element and value as second element and use an iterator on it. Normally, that works faster memory-wise – rassa45 Oct 31 '15 at 17:32

1 Answer

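It will be much faster if you use a dictionary of dictionaries, because you can then look up the actions of one state directly instead of scanning every (state, action) key: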

    self.array[state][action]
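As a rough sketch of what that could look like (the helper names `to_nested` and `best_action` and the sample Q values are made up for illustration, they are not from the question), you could convert the existing tuple-keyed dictionary once and then pick the best action with `max`:

```python
from collections import defaultdict


def to_nested(flat_q):
    """Convert a {(state, action): value} dict into {state: {action: value}}."""
    nested = defaultdict(dict)
    for (state, action), value in flat_q.items():
        # Skip entries with no action, as the loop in the question does.
        if action is not None:
            nested[state][action] = value
    return nested


def best_action(nested_q, state):
    """Return the action with the highest Q value for state, or None if unknown."""
    actions = nested_q.get(state)
    if not actions:
        return None
    # max over the 3 or 4 actions of this state only, not over the whole table.
    return max(actions, key=actions.get)


# Hypothetical Q table keyed by (state, action) tuples, as in the question.
flat_q = {
    ("s0", "left"): 0.1,
    ("s0", "right"): 0.7,
    ("s0", "up"): 0.3,
    ("s1", "left"): 0.5,
    ("s1", "right"): 0.2,
    ("s1", "up"): 0.4,
}

q = to_nested(flat_q)
print(best_action(q, "s0"))
print(best_action(q, "s1"))
```

The conversion still walks the whole table, but only once. After that each lookup touches just the handful of actions stored under one state. Note that states must be hashable to be used as keys, so a list state would need to be turned into a tuple first.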
Juan Leni