
I am trying to formulate and solve the following image mutation problem. Suppose I want to insert an object image into a "background" image that already contains several objects, and I need to find a "sweet spot" at which to insert it:

[figure]

I am tentatively trying to formulate the problem as a reinforcement learning process, with the following elements:

0. initial stage:

  • a background image where the location of objects within the image has been marked (let's suppose we have a perfect object detector)

  • another image of a new object, let's say, a human

1. action space:

  • the location (x, y) at which the object image is inserted; since every pixel position is a candidate, the action space is quite large.

2. environment:

  • At each step I will have a new image to "learn from".

  • An oracle function F returns 1 or 0 (one evaluation of F takes roughly 30 seconds). It tells me whether the latest synthesized image hits the "sweet spot" (1 means a hit). If it does, I stop the search and return the image.

3. constraint:

  • the newly inserted object must not overlap with the original objects in the image.
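To make the constraint concrete, here is a minimal sketch of how the set of admissible insertion locations could be precomputed from the detector output. It assumes axis-aligned bounding boxes given as `(x0, y0, x1, y1)` with exclusive right/bottom edges, and that the action `(x, y)` is the top-left corner of the inserted object; the function name `valid_position_mask` and these conventions are my own assumptions, not part of any library.

```python
import numpy as np

def valid_position_mask(bg_h, bg_w, obj_h, obj_w, boxes):
    """Return a boolean array mask[y, x] that is True where the object's
    top-left corner can be placed without overlapping any existing box.

    boxes: iterable of (x0, y0, x1, y1), with x1 and y1 exclusive (assumed).
    """
    rows = bg_h - obj_h + 1  # number of admissible top-left y positions
    cols = bg_w - obj_w + 1  # number of admissible top-left x positions
    if rows <= 0 or cols <= 0:
        # The object does not fit into the background at all.
        return np.zeros((0, 0), dtype=bool)

    mask = np.ones((rows, cols), dtype=bool)
    for x0, y0, x1, y1 in boxes:
        # A placement at (x, y) overlaps the box iff
        # x < x1 and x + obj_w > x0 and y < y1 and y + obj_h > y0.
        xs, xe = max(0, x0 - obj_w + 1), min(cols, x1)
        ys, ye = max(0, y0 - obj_h + 1), min(rows, y1)
        if xs < xe and ys < ye:
            mask[ys:ye, xs:xe] = False
    return mask

# Example: a 10x10 background with one 2x2 detected object,
# and a 2x2 object to insert.
mask = valid_position_mask(10, 10, 2, 2, [(4, 4, 6, 6)])
valid_actions = np.argwhere(mask)  # rows of (y, x) that satisfy the constraint
```

With such a mask the search only ever proposes locations that already satisfy the constraint, which also shrinks the effective action space.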

My gut feeling is that this problem is somewhat similar to the classic "maze escape" problem, which reinforcement learning solves well, but the action space here seems much larger.

So here are my questions:

  1. If I formulate this "beautify" image problem as a "deep" reinforcement learning problem, how can I learn from such a large action space? Or is the problem even suitable for reinforcement learning?

  2. Can I somehow subsume the "non-overlapping" constraint into the oracle function F? If so, how should I choose the reward score? Is there a principled or empirical way to decide this?
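To illustrate what I mean in question 2, here is a minimal sketch of one way the constraint could be folded into the reward. Everything in it is hypothetical: `shaped_reward`, the `is_valid` callback, and the penalty value of -1.0 are placeholders I made up, and `oracle` stands in for F.

```python
OVERLAP_PENALTY = -1.0  # arbitrary assumption; choosing this value is exactly my question

def shaped_reward(x, y, is_valid, oracle, penalty=OVERLAP_PENALTY):
    """Reward for inserting the object at (x, y).

    is_valid(x, y) -> bool : True if the placement overlaps no existing object.
    oracle(x, y)   -> 0/1  : stand-in for F (about 30 seconds per call).
    """
    if not is_valid(x, y):
        # Constraint violated: return the penalty and skip the expensive oracle.
        return penalty
    return float(oracle(x, y))
```

In this variant an overlapping placement never reaches F, so the 30-second cost is only paid for admissible locations.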

