
I'm trying to solve a zero-sum game by finding the optimal probability distribution for player I. To do so I'm using SciPy's `linprog` with the simplex method.

Following an example I've seen, I need to transform this game:

G = np.array([
    [ 0,  2, -3,  0],
    [-2,  0,  0,  3],
    [ 3,  0,  0, -4],
    [ 0, -3,  4,  0]])

into this linear optimization problem:

Maximize          z
Subject to:                2*x2 - 3*x3        + z <= 0
                   -2*x1               + 3*x4 + z <= 0
                    3*x1               - 4*x4 + z <= 0
                         - 3*x2 + 4*x3        + z <= 0
with              x1 + x2 + x3 + x4 = 1
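For reference, the expected solution can be checked against these constraints directly (a quick sanity check, not part of the transformation itself):

```python
import numpy as np

G = np.array([
    [ 0,  2, -3,  0],
    [-2,  0,  0,  3],
    [ 3,  0,  0, -4],
    [ 0, -3,  4,  0]])

x = np.array([0.0, 0.6, 0.4, 0.0])
z = 0.0

# The left-hand sides above are the rows of G times x, each plus z;
# all four should be <= 0, and x should sum to 1.
lhs = G @ x + z
print(lhs)                     # [ 0.   0.   0.  -0.2]
print(np.isclose(x.sum(), 1))  # True
```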

Here's my current code:

import numpy as np
from scipy.optimize import linprog

def simplex(G):
    (n, m) = np.shape(G)

    A_ub = np.transpose(G)
    # we add an artificial variable to maximize, present in all inequalities
    A_ub = np.append(A_ub, np.ones((m, 1)), axis=1)
    # all inequalities should be at most 0
    b_ub = np.zeros(m)

    # the sum of all variables except the artificial one should be equal to one
    A_eq = np.ones((1, n+1))
    A_eq[0][n] = 0
    b_eq = np.ones(1)

    c = np.zeros(n + 1)
    # -1 because linprog minimizes; this maximizes the artificial variable
    c[n] = -1

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))

    return (res.x[:-1], res.fun)

Here's the distribution I get: [5.87042987e-01, 1.77606350e-10, 2.79082859e-10, 4.12957014e-01], which does sum to 1, but I expect [0, 0.6, 0.4, 0].

I'm also trying a larger game with 6 or 7 rows (and so that many variables), and there the result doesn't even sum to 1. What did I do wrong?

Thanks for any help you could provide.

Harkan

2 Answers


(I suppose Player 1 (row player) is maximizing and Player 2 (column player) is minimizing.)

Player 1's strategy in a Nash equilibrium of this game is any [0, x2, x3, 0] with 4/7 <= x2 <= 3/5 and x2 + x3 = 1.
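This can be verified numerically: since G is skew-symmetric, the value of the game is 0, and any mixed strategy in that range guarantees player 1 at least 0 against every pure column strategy (a quick NumPy check, not needed for the fix below):

```python
import numpy as np

G = np.array([
    [ 0,  2, -3,  0],
    [-2,  0,  0,  3],
    [ 3,  0,  0, -4],
    [ 0, -3,  4,  0]])

for x2 in np.linspace(4/7, 3/5, 5):
    x = np.array([0.0, x2, 1.0 - x2, 0.0])
    # payoff to player 1 against each pure column strategy of player 2
    payoffs = x @ G
    # an optimal strategy guarantees at least the game value, which is 0 here
    assert payoffs.min() >= -1e-12
```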

In your code, you are missing a negative sign for the inequality constraint -G.T x + z <= 0. Try the following code:

import numpy as np
from scipy.optimize import linprog

def simplex(G, method='simplex'):
    (n, m) = np.shape(G)

    A_ub = -np.transpose(G)  # negative sign added
    # we add an artificial variable to maximize, present in all inequalities
    A_ub = np.append(A_ub, np.ones((m, 1)), axis=1)
    # all inequalities should be at most 0
    b_ub = np.zeros(m)

    # the sum of all variables except the artificial one should be equal to one
    A_eq = np.ones((1, n+1))
    A_eq[0][n] = 0
    b_eq = np.ones(1)

    c = np.zeros(n + 1)
    # -1 because linprog minimizes; this maximizes the artificial variable
    c[n] = -1

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None),
                  method=method)  # `method` option added

    return (res.x[:-1], res.fun)

With the simplex method:

simplex(G, method='simplex')
(array([0.        , 0.57142857, 0.42857143, 0.        ]), 0.0)
# 4/7 = 0.5714285...

With the interior point method:

simplex(G, method='interior-point')
(array([1.77606350e-10, 5.87042987e-01, 4.12957014e-01, 2.79082859e-10]),
 -9.369597151936987e-10)
# 4/7 < 5.87042987e-01 < 3/5

With the revised simplex method:

simplex(G, method='revised simplex')
(array([0. , 0.6, 0.4, 0. ]), 0.0)
# 3/5 = 0.6

(Run with SciPy v1.3.0)

oyamad
  • Thanks for your answer; I updated my post with the solution I found months ago. I remember switching to a negative sign and using several different methods, but every time the solution looked right, I tried it on several other examples and got something completely wrong. Maybe `revised simplex` was a good method, I'm not sure I tried it, but as I said I couldn't get it to work despite trying for several hours with the documentation, so I highly recommend using a specialized, properly documented, and perhaps more beginner-friendly library rather than tweaking `linprog` – Harkan Aug 17 '19 at 15:46

I hadn't updated the post since I found a solution. I recommend not using SciPy's `linprog` function: it's poorly documented if you don't know much about linear programming, and I found it imprecise and inconsistent on numerous examples (and I did try adding a negative sign at the time, as suggested by oyamad).

I switched to the PuLP Python library and had no trouble getting consistent results from the get-go.
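For reference, here is roughly what the same LP looks like in PuLP (a sketch of my own, not the exact code I used; it relies on PuLP's bundled CBC solver and names the variables arbitrarily):

```python
import numpy as np
import pulp

G = np.array([
    [ 0,  2, -3,  0],
    [-2,  0,  0,  3],
    [ 3,  0,  0, -4],
    [ 0, -3,  4,  0]])
n, m = G.shape

prob = pulp.LpProblem("zero_sum_game", pulp.LpMaximize)
z = pulp.LpVariable("z")  # value of the game, unbounded
x = [pulp.LpVariable(f"x{i}", lowBound=0) for i in range(n)]

prob += z  # objective: maximize the guaranteed payoff
for j in range(m):
    # against each pure column strategy j, player 1 must get at least z
    prob += pulp.lpSum(float(G[i, j]) * x[i] for i in range(n)) >= z
prob += pulp.lpSum(x) == 1  # x is a probability distribution

prob.solve(pulp.PULP_CBC_CMD(msg=0))
print([v.value() for v in x], z.value())
```

Writing each constraint explicitly like this made it much easier for me to see that the signs matched the formulation on paper.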

Harkan