12

Given a sprite sheet like this:

Sprite Sheet Example

I would like to write an algorithm that can loop through the pixel data and determine the bounding rectangle of each discrete sprite.

If we assume that for each pixel X, Y that I can pull either true (pixel is not totally transparent) or false (pixel is totally transparent), how would I go about automatically generating the bounding rectangles for each sprite?

The resulting data should be an array of rectangle objects with {x, y, width, height}.

Here's the same image but with the bounds of the first four sprites marked in light blue:

Sprite Sheet With Bounds

Can anyone give a step-by-step on how to detect these bounds as described above?

nathancy
Rob Evans
  • Are sprites always fully connected? – Nick Johnson Nov 27 '12 at 16:30
  • Not always, but the important bit is that there will be a straight line of transparency between them. – Rob Evans Nov 27 '12 at 17:45
  • For anyone still searching for an algorithm to cut sprites from a sprite sheet automatically, I'd like to point out this paper: https://pdfs.semanticscholar.org/9f78/4d991f5902c84c2181c6c573661abdc228b1.pdf It uses a Blob Detection Algorithm and is often successful in detecting unconnected sprites (e.g. explosions with particles flying in all directions). An example implementation source code is on GitHub: https://github.com/marcelomesmo/MuSSE – Martin Sep 10 '19 at 08:52

3 Answers

9

Here's an approach

  • Convert image to grayscale
  • Otsu's threshold to obtain binary image
  • Perform morphological transformations to smooth image
  • Find contours
  • Iterate through contours to draw bounding rectangle and extract ROI

After converting to grayscale, we apply Otsu's threshold to obtain a binary image
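For readers unfamiliar with Otsu's method, here is a minimal pure-Python sketch of what the thresholding step computes conceptually: it picks the gray level that maximizes the between-class variance of the histogram. The function name `otsu_threshold` and the flat list input are assumptions for illustration only; in the answer's code, `cv2.threshold` with `cv2.THRESH_OTSU` does this work.

```python
def otsu_threshold(gray_values):
    """Return a threshold t; values <= t form one class, values > t the other.

    gray_values: iterable of ints in 0..255 (hypothetical flat pixel list).
    """
    # Build the 256-bin histogram of gray levels.
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the "low" class so far
    sum0 = 0  # sum of gray levels in the "low" class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # low class still empty
        w1 = total - w0
        if w1 == 0:
            break             # high class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor).
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Usage: a clearly bimodal set of pixels splits between the two clusters.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
binary = [v > t for v in pixels]
```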


Next we perform morphological transformations to merge each sprite into a single contour
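To see why dilation merges nearby fragments into one blob, here is a small pure-Python sketch on a boolean grid with a 3x3 square structuring element. The helper name `dilate` and the list-of-lists mask are assumptions for illustration; the answer's code uses `cv2.dilate` on a NumPy array instead.

```python
def dilate(mask, iterations=1):
    """Binary dilation with a 3x3 square element.

    mask: non-empty list of equal-length rows of bools (hypothetical format).
    """
    height, width = len(mask), len(mask[0])
    for _ in range(iterations):
        out = [[False] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    # Turn on the pixel and its in-bounds 8 neighbours.
                    for ny in range(max(0, y - 1), min(height, y + 2)):
                        for nx in range(max(0, x - 1), min(width, x + 2)):
                            out[ny][nx] = True
        mask = out
    return mask


# Two fragments separated by a one-pixel gap become connected after one pass.
row = [[True, False, True, False, False, False]]
print(dilate(row))  # [[True, True, True, True, False, False]]
```

The trade-off is that the same growth can also fuse two distinct sprites that sit very close together, so the kernel size and iteration count need tuning per sheet.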


From here we find contours, iterate through each contour, draw the bounding rectangle, and extract each ROI. Here's the result


and here's each saved sprite ROI


I've implemented this method using OpenCV and Python but you can adapt the strategy to any language

import cv2

image = cv2.imread('1.jpg')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=2)
dilate = cv2.dilate(close, kernel, iterations=1)

cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

sprite_number = 0
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    # Crop from the untouched copy so rectangles drawn earlier don't leak into a sprite
    ROI = original[y:y+h, x:x+w]
    cv2.imwrite('sprite_{}.png'.format(sprite_number), ROI)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
    sprite_number += 1

cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()
chriscauley
nathancy
3

How about this? The only downside is that you'll need a writable version of your image to mark visited pixels in, or the floodfill will never terminate.

Process each* scan line in turn
  For each scanline, walk from left to right, until you find a non-transparent pixel P.
    If the location of P is already inside a known bounded box
      Continue to the right of the bounded box
    Else
      BBox = ExploreBoundedBox(P)
      Add BBox to the collection of known bounded boxes

Function ExploreBoundedBox(pStart)
  Q = new Queue(pStart)
  B = new BoundingBox(pStart)

  While Q is not empty
    Dequeue the front element as P
    Expand B to include P

    For each of the four neighbouring pixels N
      If N is not transparent and N is not marked
        Mark N
        Enqueue N at the back of Q

  return B

* You don't need to process every scanline; every 10th or every 30th is enough, as long as the step does not exceed the minimum sprite height.
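The pseudocode above could be sketched in Python roughly as follows. This is only one possible reading of it: it assumes the image is given as a list of rows of booleans (`True` = not fully transparent), keeps marked pixels in a set instead of writing into the image, and returns rectangles as `{x, y, width, height}` dicts as requested in the question. The name `find_sprite_bounds` and the `eight_connected` flag are illustrative choices, not part of the original answer.

```python
from collections import deque


def find_sprite_bounds(opaque, eight_connected=True):
    """Return a list of {x, y, width, height} dicts, one per detected sprite."""
    height = len(opaque)
    width = len(opaque[0]) if height else 0
    if eight_connected:
        offsets = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dx, dy) != (0, 0)]
    else:
        offsets = [(0, -1), (-1, 0), (1, 0), (0, 1)]

    def explore(start):
        # Breadth-first flood fill that grows a bounding box around `start`.
        queue = deque([start])
        marked = {start}
        min_x = max_x = start[0]
        min_y = max_y = start[1]
        while queue:
            x, y = queue.popleft()
            min_x, max_x = min(min_x, x), max(max_x, x)
            min_y, max_y = min(min_y, y), max(max_y, y)
            for dx, dy in offsets:
                nx, ny = x + dx, y + dy
                if (0 <= nx < width and 0 <= ny < height
                        and opaque[ny][nx] and (nx, ny) not in marked):
                    marked.add((nx, ny))
                    queue.append((nx, ny))
        return {"x": min_x, "y": min_y,
                "width": max_x - min_x + 1, "height": max_y - min_y + 1}

    boxes = []
    for y in range(height):
        x = 0
        while x < width:
            if not opaque[y][x]:
                x += 1
                continue
            # Is this pixel already inside a known bounding box?
            box = next((b for b in boxes
                        if b["x"] <= x < b["x"] + b["width"]
                        and b["y"] <= y < b["y"] + b["height"]), None)
            if box is None:
                box = explore((x, y))
                boxes.append(box)
            x = box["x"] + box["width"]  # continue to the right of the box
    return boxes


rows = ["X.X..",
        "XX...",
        "....X"]
grid = [[c == "X" for c in r] for r in rows]
print(find_sprite_bounds(grid))
```

With eight-neighbour connectivity the pixel at (2, 0) joins the first sprite through its diagonal; with `eight_connected=False` it is reported as a separate 1x1 box.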

Leon Bouquiet
  • Good answer. I have two comments: 1. If you don't have a writable version of the image, you can use a Set to keep track of the marked pixels. 2. When finding the neighboring pixels in `ExploreBoundedBox`, it might be better to get the _eight_ neighboring pixels (i.e. the diagonals too). See the pine tree in the first row; its rightmost pixel is only diagonally connected to the main body. – Kevin Nov 27 '12 at 13:41
  • Thanks @Astrotrain and Kevin. I am going to implement this and let you know the result! – Rob Evans Nov 27 '12 at 13:48
  • @Kevin: You're right, good points. But I'd suggest a Hashtable rather than a Set, because its performance over large collections is better (and the number of pixels in a sprite is pretty large). Drawing inside the image is still the fastest, so if that's an option... – Leon Bouquiet Nov 27 '12 at 14:13
  • Great, this works perfectly... and it is fast too. ~59 milliseconds using that image above in JavaScript and using HTML canvas :) – Rob Evans Nov 27 '12 at 19:40
-1

Here is an implementation in Python with Pillow:

URL: https://gist.github.com/tuaplicacionpropia/f5bd6b0f69a11141767387eb789f5093


#!/usr/bin/env python
#coding:utf-8

from __future__ import print_function
from PIL import Image

class Sprite:
  def __init__(self):
    self.start_x = -1
    self.start_y = -1
    self.end_x = -1
    self.end_y = -1

  def expand (self, point):
    if (self.start_x < 0 and self.start_y < 0 and self.end_x < 0 and self.end_y < 0):
      self.start_x = point[0]
      self.start_y = point[1]
      self.end_x = point[0]
      self.end_y = point[1]
    else:
      if (point[0] < self.start_x):
        self.start_x = point[0]
      if (point[0] > self.end_x):
        self.end_x = point[0]
      if (point[1] < self.start_y):
        self.start_y = point[1]
      if (point[1] > self.end_y):
        self.end_y = point[1]

  def belongs (self, point):
    result = False
    result = True
    result = result and point[0] >= self.start_x and point[0] <= self.end_x
    result = result and point[1] >= self.start_y and point[1] <= self.end_y
    return result

  def __str__(self):
    result = ""
    result = result + "("
    result = result + str(self.start_x)
    result = result + ", "
    result = result + str(self.start_y)
    result = result + ", "
    result = result + str(self.end_x)
    result = result + ", "
    result = result + str(self.end_y)
    result = result + ")"
    return result

def loadSprite (pos, sprites):
  result = None
  for sprite in sprites:
    if sprite.belongs(pos):
      result = sprite
      break
  return result


def exploreBoundedBox (pStart, img):
  result = None
  q = []
  q.append(pStart)
  result = Sprite()
  result.expand(pStart)
  # A set gives O(1) membership tests; the start pixel is marked up front
  marks = set([pStart])
  while (len(q) > 0):
    p = q.pop(0)
    result.expand(p)
    neighbouring = loadEightNeighbouringPixels(p, img)
    for n in neighbouring:
      if img.getpixel(n)[3] > 0 and n not in marks:
        marks.add(n)
        q.append(n)
  return result

def loadFourNeighbouringPixels (point, img):
  result = None
  result = []

  newPoint = (point[0], point[1] - 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] - 1, point[1])
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] + 1, point[1])
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0], point[1] + 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  return result

def loadEightNeighbouringPixels (point, img):
  result = None
  result = []

  newPoint = (point[0], point[1] - 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] - 1, point[1])
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] + 1, point[1])
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0], point[1] + 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] - 1, point[1] - 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] + 1, point[1] - 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] - 1, point[1] + 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] + 1, point[1] + 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  return result

im = Image.open("test2.png")
print(im.format, im.size, im.mode)
#PNG (640, 252) RGBA
#im.show()
print("width = " + str(im.width))
print("height = " + str(im.height))



sprites = []
for y in range(im.height):
  # A while loop is used so that x can actually jump past a known box;
  # reassigning x inside "for x in range(...)" has no effect on the iteration.
  x = 0
  while x < im.width:
    pixel = im.getpixel((x, y))
    haycolor = pixel[3] > 0  # True if the pixel is not fully transparent
    if (haycolor):
      pos = (x, y)
      sprite = loadSprite(pos, sprites)
      if (sprite is not None):
        x = sprite.end_x  # skip to the right edge of the known box
      else:
        sprite = exploreBoundedBox(pos, im)
        sprites.append(sprite)
    x += 1

print("sprites")
print(str(sprites))
idx = 1
for sprite in sprites:
  print("sprite " + str(idx) + ". -> " + str(sprite))
  imSprite = im.crop((sprite.start_x, sprite.start_y, sprite.end_x + 1, sprite.end_y + 1))
  #imSprite.show()
  imSprite.save("sprite" + str(idx) + ".png")
  idx += 1
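
The question asks for rectangles as `{x, y, width, height}`, while the `Sprite` class above stores inclusive start/end coordinates. A small conversion helper might look like this (the name `to_rect` is hypothetical and not part of the gist):

```python
def to_rect(start_x, start_y, end_x, end_y):
    """Convert inclusive corner coordinates to an x/y/width/height dict."""
    return {
        "x": start_x,
        "y": start_y,
        # Both ends are inclusive, hence the + 1.
        "width": end_x - start_x + 1,
        "height": end_y - start_y + 1,
    }


print(to_rect(3, 5, 10, 5))  # {'x': 3, 'y': 5, 'width': 8, 'height': 1}
```

With the script above it could be applied as `[to_rect(s.start_x, s.start_y, s.end_x, s.end_y) for s in sprites]`.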

To avoid leaving small fragments behind as separate sprites, merge every box narrower or shorter than MINIMUM_SPRITE into its nearest neighbour:

import math  # needed by distancePoints below

MINIMUM_SPRITE = 8

def firstNonSprites (sprites):
  result = None
  for sprite in sprites:
    if (sprite.end_x - sprite.start_x + 1) < MINIMUM_SPRITE or (sprite.end_y - sprite.start_y + 1) < MINIMUM_SPRITE:
      result = sprite
      break
  return result

def mergeSprites (sprite1, sprite2):
  result = None
  if (sprite1 != None and sprite2 != None):
    result = Sprite()
    result.start_x = min(sprite1.start_x, sprite2.start_x)
    result.start_y = min(sprite1.start_y, sprite2.start_y)
    result.end_x = max(sprite1.end_x, sprite2.end_x)
    result.end_y = max(sprite1.end_y, sprite2.end_y)
  return result

def findNextSprite (pivot, sprites):
  result = None
  distance = 99999999
  for sprite in sprites:
    if sprite != pivot:
      itemDistance = distanceSprites(pivot, sprite)
      if (itemDistance < distance):
        distance = itemDistance
        result = sprite
  return result

# Pythagoras
def distancePoints (point1, point2):
  result = 99999999
  if (point1 != None and point2 != None):
    a = abs(point2[0] - point1[0])
    b = abs(point2[1] - point1[1])
    result = math.sqrt(math.pow(a, 2) + math.pow(b, 2))
  return result

def distancePointSprite (point, sprite):
  result = 99999999
  if (point != None and sprite != None):
    distance = distancePoints(point, (sprite.start_x, sprite.start_y))
    if (distance < result):
      result = distance
    distance = distancePoints(point, (sprite.end_x, sprite.start_y))
    if (distance < result):
      result = distance
    distance = distancePoints(point, (sprite.start_x, sprite.end_y))
    if (distance < result):
      result = distance
    distance = distancePoints(point, (sprite.end_x, sprite.end_y))
    if (distance < result):
      result = distance
  return result


def distanceSprites (sprite1, sprite2):
  result = 99999999
  if (sprite1 != None and sprite2 != None):
    distance = distancePointSprite((sprite1.start_x, sprite1.start_y), sprite2)
    if (distance < result):
      result = distance
    distance = distancePointSprite((sprite1.end_x, sprite1.start_y), sprite2)
    if (distance < result):
      result = distance
    distance = distancePointSprite((sprite1.start_x, sprite1.end_y), sprite2)
    if (distance < result):
      result = distance
    distance = distancePointSprite((sprite1.end_x, sprite1.end_y), sprite2)
    if (distance < result):
      result = distance
  return result

def fixMergeSprites (sprites):
  result = []
  pivotNonSprite = firstNonSprites(sprites)
  while (pivotNonSprite != None):
    nextSprite = findNextSprite(pivotNonSprite, sprites)
    if nextSprite == None:
      break
    mergeSprite = mergeSprites(pivotNonSprite, nextSprite)
    sprites.remove(nextSprite)
    sprites.remove(pivotNonSprite)
    sprites.append(mergeSprite)
    pivotNonSprite = firstNonSprites(sprites)
  result = sprites
  return result

#BEFORE CROP
sprites = fixMergeSprites(sprites)
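
Note that `distanceSprites` above compares only the corners of the two boxes, which can overestimate the distance when, for example, a small fragment sits beside the middle of a tall sprite. One possible alternative, sketched here under the assumption that boxes are inclusive `(start_x, start_y, end_x, end_y)` tuples, is to measure the gap between the rectangles themselves. `box_gap` is a hypothetical name and a swapped-in technique, not part of the original code.

```python
import math


def box_gap(a, b):
    """Euclidean gap between two axis-aligned boxes; 0 if adjacent or overlapping.

    Each box is an inclusive (start_x, start_y, end_x, end_y) tuple.
    """
    # Count of empty pixel columns/rows between the boxes along each axis.
    dx = max(b[0] - a[2] - 1, a[0] - b[2] - 1, 0)
    dy = max(b[1] - a[3] - 1, a[1] - b[3] - 1, 0)
    return math.hypot(dx, dy)


tall = (0, 0, 9, 99)         # a tall sprite
fragment = (12, 50, 13, 51)  # a small piece beside its middle
print(box_gap(tall, fragment))  # 2.0
```

For the same pair, the nearest corner-to-corner distance is about 50 pixels, so a corner-based search could pick an unrelated sprite as the "nearest" one.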

Full code:

#!/usr/bin/env python
#coding:utf-8

from __future__ import print_function
from PIL import Image
import math

#https://stackoverflow.com/questions/13584586/sprite-sheet-detect-individual-sprite-bounds-automatically?rq=1
'''
Process each* scan line in turn
  For each scanline, walk from left to right, until you find a non-transparent pixel P.
    If the location of P is already inside a known bounded box
      Continue to the right of the bounded box
    Else
      BBox = ExploreBoundedBox(P)
      Add BBox to the collection of known bounded boxes

Function ExploreBoundedBox(pStart)
  Q = new Queue(pStart)
  B = new BoundingBox(pStart)

  While Q is not empty
    Dequeue the front element as P
    Expand B to include P

    For each of the four neighbouring pixels N
      If N is not transparent and N is not marked
        Mark N
        Enqueue N at the back of Q

  return B
'''

class Sprite:
  def __init__(self):
    self.start_x = -1
    self.start_y = -1
    self.end_x = -1
    self.end_y = -1

  def expand (self, point):
    if (self.start_x < 0 and self.start_y < 0 and self.end_x < 0 and self.end_y < 0):
      self.start_x = point[0]
      self.start_y = point[1]
      self.end_x = point[0]
      self.end_y = point[1]
    else:
      if (point[0] < self.start_x):
        self.start_x = point[0]
      if (point[0] > self.end_x):
        self.end_x = point[0]
      if (point[1] < self.start_y):
        self.start_y = point[1]
      if (point[1] > self.end_y):
        self.end_y = point[1]

  def belongs (self, point):
    result = False
    result = True
    result = result and point[0] >= self.start_x and point[0] <= self.end_x
    result = result and point[1] >= self.start_y and point[1] <= self.end_y
    return result

  def __str__(self):
    result = ""
    result = result + "("
    result = result + str(self.start_x)
    result = result + ", "
    result = result + str(self.start_y)
    result = result + ", "
    result = result + str(self.end_x)
    result = result + ", "
    result = result + str(self.end_y)
    result = result + ")"
    return result

def loadSprite (pos, sprites):
  result = None
  for sprite in sprites:
    if sprite.belongs(pos):
      result = sprite
      break
  return result


def exploreBoundedBox (pStart, img):
  result = None
  q = []
  q.append(pStart)
  result = Sprite()
  result.expand(pStart)
  # A set gives O(1) membership tests; the start pixel is marked up front
  marks = set([pStart])
  while (len(q) > 0):
    p = q.pop(0)
    result.expand(p)
    neighbouring = loadEightNeighbouringPixels(p, img)
    for n in neighbouring:
      if img.getpixel(n)[3] > 0 and n not in marks:
        marks.add(n)
        q.append(n)
  return result

def loadFourNeighbouringPixels (point, img):
  result = None
  result = []

  newPoint = (point[0], point[1] - 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] - 1, point[1])
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] + 1, point[1])
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0], point[1] + 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  return result

def loadEightNeighbouringPixels (point, img):
  result = None
  result = []

  newPoint = (point[0], point[1] - 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] - 1, point[1])
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] + 1, point[1])
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0], point[1] + 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] - 1, point[1] - 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] + 1, point[1] - 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] - 1, point[1] + 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  newPoint = (point[0] + 1, point[1] + 1)
  if (newPoint[0] >= 0 and newPoint[1] >= 0 and newPoint[0] < img.width and newPoint[1] < img.height):
    result.append(newPoint)

  return result

MINIMUM_SPRITE = 8

def firstNonSprites (sprites):
  result = None
  for sprite in sprites:
    if (sprite.end_x - sprite.start_x + 1) < MINIMUM_SPRITE or (sprite.end_y - sprite.start_y + 1) < MINIMUM_SPRITE:
      result = sprite
      break
  return result

def mergeSprites (sprite1, sprite2):
  result = None
  if (sprite1 != None and sprite2 != None):
    result = Sprite()
    result.start_x = min(sprite1.start_x, sprite2.start_x)
    result.start_y = min(sprite1.start_y, sprite2.start_y)
    result.end_x = max(sprite1.end_x, sprite2.end_x)
    result.end_y = max(sprite1.end_y, sprite2.end_y)
  return result

def findNextSprite (pivot, sprites):
  result = None
  distance = 99999999
  for sprite in sprites:
    if sprite != pivot:
      itemDistance = distanceSprites(pivot, sprite)
      if (itemDistance < distance):
        distance = itemDistance
        result = sprite
  return result

# Pythagoras
def distancePoints (point1, point2):
  result = 99999999
  if (point1 != None and point2 != None):
    a = abs(point2[0] - point1[0])
    b = abs(point2[1] - point1[1])
    result = math.sqrt(math.pow(a, 2) + math.pow(b, 2))
  return result

def distancePointSprite (point, sprite):
  result = 99999999
  if (point != None and sprite != None):
    distance = distancePoints(point, (sprite.start_x, sprite.start_y))
    if (distance < result):
      result = distance
    distance = distancePoints(point, (sprite.end_x, sprite.start_y))
    if (distance < result):
      result = distance
    distance = distancePoints(point, (sprite.start_x, sprite.end_y))
    if (distance < result):
      result = distance
    distance = distancePoints(point, (sprite.end_x, sprite.end_y))
    if (distance < result):
      result = distance
  return result


def distanceSprites (sprite1, sprite2):
  result = 99999999
  if (sprite1 != None and sprite2 != None):
    distance = distancePointSprite((sprite1.start_x, sprite1.start_y), sprite2)
    if (distance < result):
      result = distance
    distance = distancePointSprite((sprite1.end_x, sprite1.start_y), sprite2)
    if (distance < result):
      result = distance
    distance = distancePointSprite((sprite1.start_x, sprite1.end_y), sprite2)
    if (distance < result):
      result = distance
    distance = distancePointSprite((sprite1.end_x, sprite1.end_y), sprite2)
    if (distance < result):
      result = distance
  return result

def fixMergeSprites (sprites):
  result = []
  pivotNonSprite = firstNonSprites(sprites)
  while (pivotNonSprite != None):
    nextSprite = findNextSprite(pivotNonSprite, sprites)
    if nextSprite == None:
      break
    mergeSprite = mergeSprites(pivotNonSprite, nextSprite)
    sprites.remove(nextSprite)
    sprites.remove(pivotNonSprite)
    sprites.append(mergeSprite)
    pivotNonSprite = firstNonSprites(sprites)
  result = sprites
  return result

im = Image.open("test.png")
print(im.format, im.size, im.mode)
#PNG (640, 252) RGBA
#im.show()
print("width = " + str(im.width))
print("height = " + str(im.height))



sprites = []
for y in range(im.height):
  # A while loop is used so that x can actually jump past a known box;
  # reassigning x inside "for x in range(...)" has no effect on the iteration.
  x = 0
  while x < im.width:
    pixel = im.getpixel((x, y))
    haycolor = pixel[3] > 0  # True if the pixel is not fully transparent
    if (haycolor):
      pos = (x, y)
      sprite = loadSprite(pos, sprites)
      if (sprite is not None):
        x = sprite.end_x  # skip to the right edge of the known box
      else:
        sprite = exploreBoundedBox(pos, im)
        sprites.append(sprite)
    x += 1

sprites = fixMergeSprites(sprites)

print("sprites")
print(str(sprites))
idx = 1
for sprite in sprites:
  print("sprite " + str(idx) + ". -> " + str(sprite))
  imSprite = im.crop((sprite.start_x, sprite.start_y, sprite.end_x + 1, sprite.end_y + 1))
  #imSprite.show()
  imSprite.save("sprite" + str(idx) + ".png")
  idx += 1