I am simulating an inventory management system for a retail shop, so I have a (15, 15) matrix of zeros in which rows are states and columns are actions:
import numpy as np
Q = np.matrix(np.zeros([15, 15]))
Specifically, 0 is the minimum and 14 the maximum inventory level; states are the current inventory level and actions are stock orders (quantities).
Consequently, I would like to replace the zeros with -1 wherever the sum of state and action is greater than 14:
print(final_Q)
#First row, from which I can order everything (since 0 + 14 == 14)
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
#Second row, from which I can order max. 13 products (1 + 14 > 14)
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1]
#Third row, from which the max is 12
 [0 0 0 0 0 0 0 0 0 0 0 0 0 -1 -1]
(...)
I tried implementing that manually, but how can I get the final matrix automatically?
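One way this could be done automatically is with NumPy broadcasting: add a column of states to a row of actions and mark every pair whose sum exceeds 14. The following is only a minimal sketch under some assumptions: it uses a plain integer ndarray instead of np.matrix, and the names N_LEVELS, MAX_INVENTORY, states, actions, and infeasible are illustrative, not part of the original code.

```python
import numpy as np

N_LEVELS = 15        # inventory levels 0..14, as described above
MAX_INVENTORY = 14   # highest inventory level allowed

# Column vector of states (current inventory) and row vector of actions (order quantity).
states = np.arange(N_LEVELS).reshape(-1, 1)
actions = np.arange(N_LEVELS).reshape(1, -1)

# Broadcasting (15, 1) + (1, 15) gives a (15, 15) grid of state + action sums;
# the comparison turns it into a boolean mask of infeasible orders.
infeasible = (states + actions) > MAX_INVENTORY

# Start from zeros and write -1 wherever the order would exceed the maximum.
final_Q = np.zeros((N_LEVELS, N_LEVELS), dtype=int)
final_Q[infeasible] = -1

# Show the first three rows for comparison with the expected output.
print(final_Q[:3])
```

If the zero matrix already exists as an ndarray, the same mask can be applied to it directly with Q[infeasible] = -1. The NumPy documentation no longer recommends np.matrix for new code, which is why the sketch uses a regular array.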