8-puzzle using blind search (brute force) and Manhattan distance heuristic

Question

I developed my own Python program for solving the 8-puzzle. Initially I used "blind" (uninformed) search, essentially brute force: generating and exploring all possible successors with breadth-first search. When it finds the goal state, it backtracks to the initial state and delivers what I believe is the shortest sequence of moves. Of course, for some initial states the search took a long time and generated over 100,000 states before finding the goal.
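
For comparison, here is a minimal sketch of the blind breadth-first search described above, written with plain strings and a `deque` instead of NumPy arrays. The names `slides` and `bfs` are made up for this illustration and are not taken from my program; the move names follow the direction in which the blank travels.

```python
from collections import deque

GOAL = "123456780"  # goal layout from the question; "0" is the blank

def slides(state):
    """Yield (move, new_state) for each legal move of the blank."""
    blank = state.index("0")
    row, col = divmod(blank, 3)
    options = (("up", -3, row > 0), ("down", 3, row < 2),
               ("left", -1, col > 0), ("right", 1, col < 2))
    for move, delta, legal in options:
        if legal:
            cells = list(state)
            cells[blank], cells[blank + delta] = cells[blank + delta], cells[blank]
            yield move, "".join(cells)

def bfs(start, goal=GOAL):
    """Return a shortest list of moves from start to goal, or None."""
    parent = {start: None}   # state -> (previous state, move); also the visited set
    queue = deque([start])
    while queue:
        state = queue.popleft()          # FIFO: finish one depth before the next
        if state == goal:
            moves = []
            while parent[state] is not None:   # walk back to the initial state
                state, move = parent[state]
                moves.append(move)
            return moves[::-1]
        for move, nxt in slides(state):
            if nxt not in parent:        # ignore repeated states
                parent[nxt] = (state, move)
                queue.append(nxt)
    return None
```

Calling `bfs("102468735")` returns the list of moves; the dictionary lookup for repeated states is constant time, which is likely the main saving over scanning an array of all earlier states.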

Then I added a heuristic, Manhattan distance. Solutions started arriving far more quickly and with far fewer explored states. My confusion is that sometimes the resulting sequence was longer than the one found by blind (uninformed) search.

What I am doing is basically this:

For each state, look for all possible moves (up, down, left and right) and generate the successor states.
Check whether each successor is a repeated state. If so, ignore it.
Calculate the Manhattan distance for each remaining successor.
Pick out the successor(s) with the lowest Manhattan distance and add them at the end of the list.
Check whether a successor is the goal state. If so, break the loop.
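
The scoring and selection steps above can be sketched roughly as follows. This is only an illustration under assumptions: boards are plain strings, the function names `manhattan`, `successors` and `keep_lowest` are hypothetical, and the repeat-state check is left out for brevity.

```python
GOAL = "123456780"  # same goal layout as in the question; "0" is the blank

def manhattan(state, goal=GOAL):
    """Sum of row and column offsets of tiles 1-8 from their goal cells."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == "0":              # the blank is not counted
            continue
        target = goal.index(tile)
        total += abs(idx // 3 - target // 3) + abs(idx % 3 - target % 3)
    return total

def successors(state):
    """Yield (move, new_state) for every legal move of the blank."""
    blank = state.index("0")
    row, col = divmod(blank, 3)
    options = (("up", -3, row > 0), ("down", 3, row < 2),
               ("left", -1, col > 0), ("right", 1, col < 2))
    for move, delta, legal in options:
        if legal:
            cells = list(state)
            cells[blank], cells[blank + delta] = cells[blank + delta], cells[blank]
            yield move, "".join(cells)

def keep_lowest(state):
    """Keep only the successors with the smallest Manhattan distance."""
    scored = [(manhattan(s), move, s) for move, s in successors(state)]
    best = min(h for h, _, _ in scored)
    return [(move, s) for h, move, s in scored if h == best]
```

For the start state `102468735` the three successors score 10, 10 and 8, so only the "right" move survives. The other two are discarded for good at that point, which is where a shorter path could be lost.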

I am not sure whether this would qualify as greedy best-first search or A*.

My question is: is it an inherent flaw of the Manhattan distance heuristic that it sometimes does not give the optimal solution, or am I doing something wrong?

Below is the code. I apologize that it is not very clean, but since it is mostly sequential it should be simple to follow. I also apologize for its length; I know I need to optimize it. I would also appreciate any suggestions for cleaning up the code. Here it is:

import numpy as np
from copy import deepcopy
import sys

# calculate Manhattan distance for each digit as per goal
def mhd(s, g):
    m = abs(s // 3 - g // 3) + abs(s % 3 - g % 3)
    return sum(m[1:])

# assign each digit the coordinate to calculate Manhattan distance
def coor(s):
    c = np.array(range(9))
    for x, y in enumerate(s):
        c[y] = x
    return c

#################################################
def main():
    goal    =  np.array( [1, 2, 3, 4, 5, 6, 7, 8, 0] )
    rel     = np.array([-1])
    mov     = np.array([' '])
    string = '102468735'
    inf = 'B'
    pos = 0
    yes = 0
    goalc = coor(goal)
    puzzle = np.array([int(k) for k in string]).reshape(1, 9)
    rnk = np.array([mhd(coor(puzzle[0]), goalc)])
    while True:
        # locate '0' (blank) on the board; index down to a scalar before int()
        loc = int(np.where(puzzle[pos] == 0)[0][0])
        child = np.array([], int).reshape(-1, 9)
        cmove = []
        crank = []
        # generate successors on possible moves - new states no repeats
        if loc > 2:         # if 'up' move is possible
            succ = deepcopy(puzzle[pos])
            succ[loc], succ[loc - 3] = succ[loc - 3], succ[loc]
            if ~(np.all(puzzle == succ, 1)).any():  # repeat state?
                child = np.append(child, [succ], 0)
                cmove.append('up')
                crank.append(mhd(coor(succ), goalc)) # manhattan distance
        if loc < 6:         # if 'down' move is possible
            succ = deepcopy(puzzle[pos])
            succ[loc], succ[loc + 3] = succ[loc + 3], succ[loc]
            if ~(np.all(puzzle == succ, 1)).any():  # repeat state?
                child = np.append(child, [succ], 0)
                cmove.append('down')
                crank.append(mhd(coor(succ), goalc))
        if loc % 3 != 0:    # if 'left' move is possible
            succ = deepcopy(puzzle[pos])
            succ[loc], succ[loc - 1] = succ[loc - 1], succ[loc]
            if ~(np.all(puzzle == succ, 1)).any():  # repeat state?
                child = np.append(child, [succ], 0)
                cmove.append('left')
                crank.append(mhd(coor(succ), goalc))
        if loc % 3 != 2:    # if 'right' move is possible
            succ = deepcopy(puzzle[pos])
            succ[loc], succ[loc + 1] = succ[loc + 1], succ[loc]
            if ~(np.all(puzzle == succ, 1)).any():  # repeat state?
                child = np.append(child, [succ], 0)
                cmove.append('right')
                crank.append(mhd(coor(succ), goalc))
        for s in range(len(child)):
            if (inf in 'Ii' and crank[s] == min(crank)) \
            or (inf in 'Bb'):
                puzzle = np.append(puzzle, [child[s]], 0)
                rel = np.append(rel, pos)
                mov = np.append(mov, cmove[s])
                rnk = np.append(rnk, crank[s])
                if np.array_equal(child[s], goal):
                    print()
                    print('Goal achieved! Successors generated:', len(puzzle) - 1)
                    yes = 1
                    break
        if yes == 1:
            break
        pos += 1
    # generate optimized steps by back-tracking the steps to the initial state
    optimal = np.array([], int).reshape(-1, 9)
    last = len(puzzle) - 1
    optmov = []
    rank = []
    while last != -1:
        optimal = np.insert(optimal, 0, puzzle[last], 0)
        optmov.insert(0, mov[last])
        rank.insert(0, rnk[last])
        last = int(rel[last])

    # show optimized steps
    optimal = optimal.reshape(-1, 3, 3)
    print('Total optimized steps:', len(optimal) - 1)
    print()
    for s in range(len(optimal)):
        print('Move:', optmov[s])
        print(optimal[s])
        print('Manhattan Distance:', rank[s])
        print()
    print()

################################################################
# Main Program

if __name__ == '__main__':
    main()

Here are some initial states and the number of steps calculated for each, in case you would like to check (the code above lets you choose between blind and informed search):

Initial states
- 283164507 Blind: 19 Manhattan: 21
- 243780615 Blind: 15 Manhattan: 21
- 102468735 Blind: 11 Manhattan: 17
- 481520763 Blind: 13 Manhattan: 23
- 723156480 Blind: 16 Manhattan: 20

I have deliberately chosen examples where the results come quickly (within seconds or a few minutes).

Your help and guidance would be much appreciated.

Edit: I have made some quick changes and managed to trim 30+ lines. Unfortunately I can't do much more at this time.
Note: I have hardcoded the initial state and the blind vs. informed choice. Please change the value of the variable "string" for the initial state and the variable "inf" [I/B] for Informed/Blind. Thanks!

I'm not sure anyone has the time to read all that code. Is there no way to reduce it? — JasTonAChair, Nov 03 '15 at 08:39
@JasTonAChair I have tried. not much success i am afraid. I think if you run it with different values and note the outcome you will get the sense of what i am talking about. maybe then you will need to see only the specific sections of code and not the whole. thanks. — Abdul Qadir, Nov 03 '15 at 09:25
That's not A*, and offhand I don't know whether it would produce optimal results but I assume not. Doing actual A* would likely give you better results. — Sneftel, Nov 03 '15 at 09:47
@JasTonAChair, done some more reduction from approx 200 lines to approx 100 lines now. maybe more readable. — Abdul Qadir, Nov 03 '15 at 09:51
@Sneftel, why do you say this is not A*, what would need to be changed. please elaborate. — Abdul Qadir, Nov 03 '15 at 09:52
This is just a greedy search, there's no guarantee that it will work. — interjay, Nov 03 '15 at 10:00
if you are referring to f(n) = g(n) + h(n) for the priority queue here is what i am considering. although i am not storing the states in tree style, the concept is that of a tree. using breadth-first, i am exploring all the states in a level before moving to the next level. each new state being generated from states at same level, will have same cost (taking cost of each move as 1), therefore, the value of g(n) for each successor generated from states at the same level will be the same and ignored as not adding any value. the only variable is h(n) which i am considering. — Abdul Qadir, Nov 03 '15 at 10:16
@Sneftel, have i managed to explain myself or should i elaborate further? please guide — Abdul Qadir, Nov 03 '15 at 11:03
The nature of A* is that a successor node (the destination of an edge of an expanded node) may be expanded soon after its parent is expanded (if it looks really promising), or may be expanded much later (if it initially looked like a bad idea, but the more promising-looking successors turned out to lead to dead ends). That is, it must be willing to expand, say, a level-2 node after expanding a level-5 node. But this is not possible if the expansion is FIFO. Put simply, breadth-first searches cannot incorporate heuristics while maintaining correctness. — Sneftel, Nov 03 '15 at 15:33
@Sneftel thank you for the explanation. it makes sense. i am just wondering in case of 8-puzzle, how to determine what is looking promising and what is a bad idea and at which point to back-track since the only point we have is the manhattan heuristic and if i look at some of the optimized solutions reached without heuristic, the heuristics do sometimes increase and then decrease. let me research it further. if you can point me to site/page where 8-puzzle specific guidance is available i would appreciate it. thank you. — Abdul Qadir, Nov 04 '15 at 05:39
There's nothing special about the 8-puzzle. All you need to understand is what it means for an A* heuristic to be "admissible". Once you get that, and you get *why* a heuristic needs to be admissible, you'll have everything you need. — Sneftel, Nov 04 '15 at 10:54
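
The ordering by f(n) = g(n) + h(n) discussed in the comments above can be sketched as follows. This is a minimal A* illustration under assumptions, not the code from the question: boards are plain strings, the frontier is a `heapq` priority queue rather than a FIFO list, and the names `manhattan`, `neighbours` and `astar` are hypothetical.

```python
import heapq

GOAL = "123456780"  # goal layout from the question; "0" is the blank

def manhattan(state, goal=GOAL):
    """h(n): summed row and column offsets of tiles 1-8 from their goal cells."""
    return sum(abs(i // 3 - goal.index(t) // 3) + abs(i % 3 - goal.index(t) % 3)
               for i, t in enumerate(state) if t != "0")

def neighbours(state):
    """Yield (move, new_state) for every legal move of the blank."""
    blank = state.index("0")
    row, col = divmod(blank, 3)
    options = (("up", -3, row > 0), ("down", 3, row < 2),
               ("left", -1, col > 0), ("right", 1, col < 2))
    for move, delta, legal in options:
        if legal:
            cells = list(state)
            cells[blank], cells[blank + delta] = cells[blank + delta], cells[blank]
            yield move, "".join(cells)

def astar(start, goal=GOAL):
    """Return a shortest list of moves from start to goal, or None."""
    frontier = [(manhattan(start, goal), 0, start)]   # entries are (f, g, state)
    best_g = {start: 0}                               # cheapest known g per state
    parent = {start: (None, None)}                    # state -> (previous, move)
    while frontier:
        f, g, state = heapq.heappop(frontier)         # lowest f first, any depth
        if g > best_g[state]:
            continue                                  # stale queue entry
        if state == goal:
            moves = []
            while parent[state][0] is not None:       # walk back to the start
                state, move = parent[state]
                moves.append(move)
            return moves[::-1]
        for move, nxt in neighbours(state):
            new_g = g + 1                             # every move costs 1
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = (state, move)
                heapq.heappush(frontier, (new_g + manhattan(nxt, goal), new_g, nxt))
    return None
```

Unlike the keep-only-the-lowest approach, nothing is discarded here: successors that look worse stay in the queue and are expanded later if the promising ones stall. Manhattan distance never overestimates the remaining moves, because each move shifts one tile by one cell, so the length returned by `astar` should match the blind-search length for the same start state.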
