
(All references to code can be found at https://github.com/EXJUSTICE/Doom_DQN_GC/blob/master/TF2_Doom_GC_CNN.ipynb)

Background

I apologize for the length of this post; I wanted it to be as clear as possible.

I've been adapting some of my Atari OpenAI Gym code to work with the VizDoom package in order to build a DQN model in Doom, using 480 x 640 input frames. While running some initial demos (cell marked 8) with a completely random policy, I noticed that the model would always skip the first episode. When this happens, requesting the state returns None. As a result, I adapted my original code to check whether an episode is done before performing any training (cell marked 24).

Stacking

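One of the general approaches in building agents for reinforcement learning is the use of stacked frames. The idea is to capture motion by feeding the network a few consecutive frames stacked together, optionally after taking the element-wise maximum of neighbouring frames (that step is commented out in my code). My stacking code is shown below for reference: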

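from collections import deque
import numpy as np

# np.int was removed in NumPy 1.24; the builtin int is equivalent
stacked_frames  =  deque([np.zeros((84,84), dtype=int) for i in range(stack_size)], maxlen=4)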

def stack_frames(stacked_frames, state, is_new_episode):
    # Preprocess frame
    frame = preprocess_observation(state)

    if is_new_episode:
        # Clear our stacked_frames
        stacked_frames = deque([np.zeros((84,84), dtype=int) for i in range(stack_size)], maxlen=4)

        # Because we're in a new episode, fill the deque with 4 copies of the same frame
        stacked_frames.append(frame)
        stacked_frames.append(frame)
        stacked_frames.append(frame)
        stacked_frames.append(frame)

        # Stack the frames along a new last axis
        stacked_state = np.stack(stacked_frames, axis=2)

    else:
        # Since deque.append adds to the right, the rightmost element is the newest frame
        #maxframe=np.maximum(stacked_frames[-1],frame)
        # Append frame to deque, automatically removes the oldest frame
        stacked_frames.append(frame)

        # Build the stacked state (last dimension indexes the different frames)
        stacked_state = np.stack(stacked_frames, axis=2)

    return stacked_state, stacked_frames
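
To illustrate what the stacking is expected to produce, here is a minimal sketch that leaves out the preprocessing step. It assumes frames are already 84x84 arrays and that the stack size is 4 (matching maxlen=4 above); push_frame is a hypothetical, simplified stand-in for stack_frames, not the function from the notebook.

```python
from collections import deque

import numpy as np

STACK_SIZE = 4  # assumed, matching maxlen=4 in the post


def push_frame(frames, frame, is_new_episode):
    """Hypothetical stand-in for stack_frames; expects a preprocessed 84x84 frame."""
    if is_new_episode:
        # Start the episode with STACK_SIZE copies of the first frame
        frames = deque([frame] * STACK_SIZE, maxlen=STACK_SIZE)
    else:
        # Appending to a full deque drops the oldest frame on the left
        frames.append(frame)
    # Frames become the last axis of the state: (84, 84, STACK_SIZE)
    return np.stack(frames, axis=2), frames


frames = deque(maxlen=STACK_SIZE)
first = np.full((84, 84), 0.1)
second = np.full((84, 84), 0.9)

state, frames = push_frame(frames, first, True)
next_state, frames = push_frame(frames, second, False)
print(state.shape, next_state.shape)
```

Every state built this way has the same shape, with the newest frame in the last channel and the oldest in the first.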

Problem

All of my attempts to run my agent resulted in the following error:

/usr/local/lib/python3.6/dist-packages/skimage/transform/_warps.py in warp(image, inverse_map, map_args, output_shape, order, mode, cval, clip, preserve_range)
    805 
    806     if image.size == 0:
--> 807         raise ValueError("Cannot warp empty image with dimensions", image.shape)
    808 
    809     image = convert_to_float(image, preserve_range)

ValueError: ('Cannot warp empty image with dimensions', (0, 24))

On closer inspection, this error comes from the preprocessing function, which calls scikit-image's transform.resize to convert a cropped greyscale frame to the input shape (84,84). In my original OpenAI Gym code I called .reshape() instead of transform.resize, but that gave me errors with the VizDoom frames, so I stuck with transform.resize.

from skimage import transform
from skimage.color import rgb2gray

def preprocess_observation(frame):

    # Crop the frame, as we don't need the excess information
    cropped = frame[60:-60,30:-30]

    # Scale pixel values to [0, 1]
    normalized = cropped/255.0

    img_gray = rgb2gray(normalized)

    # Resize the greyscale image to the 84x84 network input
    preprocessed_frame = transform.resize(img_gray, [84,84])

    return preprocessed_frame

Since the method appeared to be trying to resize an empty image, I inspected the done section of the agent:

          next_obs=np.zeros((84,84), dtype=int)
          next_obs,stacked_frames= stack_frames(stacked_frames,next_obs,False)

          exp_buffer.append([obs, action, next_obs, reward, done])

          step = max_steps
          history.append(episodic_reward)
          print('Episode: {}'.format(len(history)),
                          'Total reward: {}'.format(episodic_reward))

          game.new_episode()

I believe the stacking function is causing the problem, so I ran some experiments to confirm this.
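
As a quick check of that suspicion, here is a minimal sketch (not taken from the notebook) that applies the same crop used in preprocess_observation to two arrays: a raw frame, assumed here to be 480 x 640 x 3 with channels last, and the 84x84 zero placeholder from the done section. The helper name crop_like_preprocess is hypothetical.

```python
import numpy as np


def crop_like_preprocess(frame):
    # Same slice as in preprocess_observation
    return frame[60:-60, 30:-30]


raw_frame = np.zeros((480, 640, 3))  # assumed raw frame layout (height, width, channels)
placeholder = np.zeros((84, 84))     # terminal placeholder from the done section

raw_shape = crop_like_preprocess(raw_frame).shape
placeholder_shape = crop_like_preprocess(placeholder).shape
print(raw_shape, placeholder_shape)  # (360, 580, 3) (0, 24)
```

The placeholder ends up with shape (0, 24), the same dimensions reported in the ValueError. That would be consistent with the already-resized placeholder, rather than a real frame, being cropped a second time when it is passed through stack_frames.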

Solutions Attempted

  1. Shifting the stacking function and attempting to invoke an observation from the environment itself results in a NoneType error, which is understandable given that the environment is dead.

  2. Increasing the deque memory buffer length seems to allow for slightly longer training time (from 5 to 10 episodes).

  3. If one removes all stacking from the done section, some training does take place.

      step = max_steps
      history.append(episodic_reward)
      print('Episode: {}'.format(len(history)),
                      'Total reward: {}'.format(episodic_reward))

      game.new_episode()

This results in around 10 episodes of training, before we observe the series of errors in cell 24 (shortened below).

ValueError                                Traceback (most recent call last)
ValueError: setting an array element with a sequence.

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<ipython-input-24-fa4adb5665e3> in <module>()
    158 
    159                 # merge all summaries and write to the file
--> 160                 mrg_summary = merge_summary.eval(feed_dict={X:o_obs, y:np.expand_dims(y_batch, axis=-1), X_action:o_act, in_training_mode:False})
    161                 file_writer.add_summary(mrg_summary, global_step)
    162 

4 frames
/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

ValueError: setting an array element with a sequence.

Searching on Stack Overflow suggests that the observation array fed to the model ends up with an inconsistent shape somewhere in the sampling process, which makes me think that the lack of stacking is the problem.

Any advice is welcome to solve this headache. Thank you for your time!

Y. Xu

1 Answer

This happens because the inner lists have different lengths, so NumPy cannot give the array a regular shape. For example:

# On older NumPy this returns an object array without a regular shape; if you print the shape you'll get (4,).
# This may be your problem. (NumPy 1.24 and later raise a ValueError here unless dtype=object is passed.)
np.array( [ [1,2,3],[1,2,3,4],[1,2,3],[1,2] ] )
# If you run this you'll get the "setting an array element with a sequence" error
np.array( [ [1,2,3],[1,2,3,4],[1,2,3],[1,2] ], dtype='float32')
# This works correctly, and if you print its shape it'll be (4,3)
np.array( [ [1,2,3],[1,2,3],[1,2,3],[1,2,3] ], dtype='float32')
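
One way to find where this happens in a training loop is to check the shape of every sampled observation before it is fed to the model. The sketch below is only an illustration: find_shape_mismatches is a hypothetical helper, the sample batch is made up, and the expected shape (84, 84, 4) is assumed from the stacking code in the question.

```python
import numpy as np


def find_shape_mismatches(observations, expected_shape):
    """Return (index, shape) for every entry that is missing or has an unexpected shape."""
    bad = []
    for i, obs in enumerate(observations):
        # None entries (e.g. a state read after the episode ended) are reported too
        shape = None if obs is None else np.shape(obs)
        if shape != expected_shape:
            bad.append((i, shape))
    return bad


# Made-up batch: two stacked states, one unstacked frame, one missing state
batch = [np.zeros((84, 84, 4)), np.zeros((84, 84)), None, np.zeros((84, 84, 4))]
mismatches = find_shape_mismatches(batch, (84, 84, 4))
print(mismatches)
```

Running a check like this on the observations sampled from the experience buffer should point at the entries that were stored unstacked or as None.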
Ali
    Can you elaborate? I understand the theory, but I'm not sure where in the code I am triggering this. – Y. Xu Apr 01 '20 at 14:06