Kitti has a benchmark for optical flow. It requires flow estimates to be submitted as 48-bit PNG files, matching the format of its ground truth files.

Ground Truth PNG Image is available for download here

Kitti have a Matlab DevKit for the estimate versus ground truth comparison.

I want to output the flow from my network as 48 bit integer PNG files, so that my flow estimates can be compared with other Kitti benchmarked flow estimates.

The numpy scaled flow file from the network is downloadable from here

However, I'm having trouble converting the float32 3D flow array to 3-channel 48-bit files (16 bits per channel) in Python, either because the image libraries don't seem to support this or because I am doing something wrong in my code. Can anyone help?

I have tried a bunch of different libraries and read lots of posts.

Scipy outputs a png that is only 24bit unfortunately. Output flow estimate png generated using scipy available here

# Numpy Flow to 48bit PNG with 16bits per channel

import scipy as sp
from scipy import misc
import numpy as np
import png
import imageio
import cv2
from PIL import Image
from matplotlib import image

"""From Kitti DevKit:-

Optical flow maps are saved as 3-channel uint16 PNG images: The first channel
contains the u-component, the second channel the v-component and the third
channel denotes if the pixel is valid or not (1 if true, 0 otherwise). To convert
the u-/v-flow into floating point values, convert the value to float,
subtract 2^15 and divide the result by 64.0:"""

Scaled_Flow = np.load('Scaled_Flow.npy') # This is a 32bit float
# This is the very first Kitti Test Flow Output from image_2 testing folder  
# passed through DVF
# The network that produced this flow is only trained to 51 steps, so it 
# won't provide an accurate correspondence
# But the Estimated Flow PNG should look green

# Kitti devkit readme says that the third channel is 1 if the flow is valid for that pixel.
# Shape: 2 for batch size, 375 for height, 1242 for width, 1 for this extra layer of ones.
ones = np.float32(np.ones((2,375,1242,1)))
with_ones = np.concatenate((Scaled_Flow, ones), axis=3)

im = sp.misc.toimage(with_ones[-1,:,:,:], cmin=-1.0, cmax=1.0) # saves image object
im.save("Scipy_24bit.png", dtype="uint48") # Outputs 24bit only.

# An attempt at converting the format from float32 to 16-bit integers
Flow = np.int16(with_ones)
f512 = Flow * 512 # Kitti instructs that the flows are scaled by 512.

x = np.array(Scaled_Flow)
# Another attempt at converting it to unsigned 16-bit integers.
# Note: astype returns a new array; the result is not assigned, so x stays float32.
x.astype(np.uint16)

try: # try PyPNG
    with open('PyPNGuint48bit.png', 'wb') as f:
        writer = png.Writer(width=1242, height=375, bitdepth=16)  # the image is 1242 wide and 375 high
        # Convert z to the Python list of lists expected by
        # the png writer.
        #z2list = x.reshape(-1, x.shape[1]*x.shape[2]).tolist()
        writer.write(f, x)
except:
    print("png lib approach didn't work, it might be to do with the sizing")

try: # try imageio
    imageio.imwrite('imageio_Flow_48bit.png', x, format='PNG-FI')
except:
    print("imageio approach didn't work, it probably couldn't handle the datatype")

try: # try OpenCV
    cv2.imwrite('OpenCVFlow_48bit_.png',x )
except:
    print("OpenCV approach didn't work, it probably couldn't handle the datatype")

try: # try PIL
    im = Image.fromarray(x)
    im.save("PILLOW_Flow_48bit.png", format="PNG")
except:
    print("PILLOW approach didn't work, it probably couldn't handle the datatype")

try: # try Matplotlib
    image.imsave('MatplotLib_Flow_48bit.png', x)
except:
    print("Matplotlib approach didn't work, ValueError: object too deep for desired array")

I want to get a 48bit png file the same as the Kitti Ground truth, that looks green. Currently Scipy outputs a 24bit png file that is blue and white looking.

CogT

1 Answer

Here is my understanding of what you want to do:

  1. Load the data from Scaled_Flow.npy. This is a 32 bit floating point numpy array with shape (2, 375, 1242, 2).
  2. Convert Scaled_Flow[1] (an array with shape (375, 1242, 2)) to 16 bit unsigned integers by:

    • multiplying by 64,
    • adding 2**15, and
    • casting the values to np.uint16.

    That is the inverse of this description that you quoted: "To convert the u-/v-flow into floating point values, convert the value to float, subtract 2^15 and divide the result by 64.0".

  3. Increase the length of the third dimension from 2 to 3 by concatenating an array of all 1s.
  4. Save the result to a PNG file.
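To make the arithmetic in step 2 concrete, here is a small sketch on a few made-up flow values (the array `flow` is illustrative and not taken from Scaled_Flow.npy). It also shows why casting to an integer type before scaling, as the question's code does, loses the fractional part of the flow.

```python
import numpy as np

# Illustrative flow values in pixels (hypothetical, not from Scaled_Flow.npy).
flow = np.array([-1.5, 0.0, 0.25, 3.0], dtype=np.float32)

# Scale first, offset by 2**15, and only then cast to unsigned 16-bit.
encoded = (64 * flow + 2**15).astype(np.uint16)

# The devkit's decoding rule: convert to float, subtract 2**15, divide by 64.
decoded = (encoded.astype(np.float32) - 2**15) / 64.0

# Casting to an integer type *before* scaling truncates toward zero,
# so sub-pixel motion such as 0.25 collapses to 0.
truncated = flow.astype(np.int16)

print(encoded)
print(decoded)
print(truncated)
```

Because these sample values are all multiples of 1/64, the decoded array matches the input exactly; in general the encoding quantizes the flow to steps of 1/64 pixel.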

Here's one way you can do that. To create the PNG file, I'll use numpngw, a library that I wrote for creating PNG and animated PNG files from numpy arrays. If you give numpngw.write_png a numpy array with data type np.uint16, it will create a PNG file with 16 bits per channel (i.e. a 48 bit image in this case).

import numpy as np
from numpngw import write_png


Scaled_Flow = np.load('Scaled_Flow.npy')
sf16 = (64*Scaled_Flow[-1] + 2**15).astype(np.uint16)
imgdata = np.concatenate((sf16, np.ones(sf16.shape[:2] + (1,), dtype=sf16.dtype)), axis=2)

write_png('sf48.png', imgdata)

Here is the image that is created by that script.

png file
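One way to sanity-check the encoding without reading the PNG back is a round trip in numpy alone. The sketch below uses random data as a stand-in for the network output, and the helper names `encode_kitti_flow` and `decode_kitti_flow` are hypothetical. The `np.clip` call is an added precaution that is not part of the script above: it assumes you would rather saturate than wrap around if a flow value falls outside the range that 16 bits can represent (roughly -512 to 512 pixels).

```python
import numpy as np


def encode_kitti_flow(flow):
    """Hypothetical helper: (H, W, 2) float flow -> (H, W, 3) uint16 array."""
    scaled = 64.0 * flow + 2**15
    # Added precaution (an assumption, not in the original script):
    # saturate values that would not fit in 16 bits.
    scaled = np.clip(scaled, 0, 2**16 - 1)
    uv = scaled.astype(np.uint16)
    # Third channel: 1 marks every pixel as valid.
    valid = np.ones(uv.shape[:2] + (1,), dtype=np.uint16)
    return np.concatenate((uv, valid), axis=2)


def decode_kitti_flow(img):
    """Hypothetical helper: inverse of the encoding, per the devkit description."""
    uv = (img[..., :2].astype(np.float32) - 2**15) / 64.0
    valid = img[..., 2] == 1
    return uv, valid


# Random stand-in for one flow field with the Kitti image size.
rng = np.random.default_rng(0)
flow = rng.uniform(-20, 20, size=(375, 1242, 2)).astype(np.float32)

img = encode_kitti_flow(flow)
uv, valid = decode_kitti_flow(img)
max_err = float(np.abs(uv - flow).max())

print(img.dtype, img.shape, max_err)
```

The round-trip error should stay within one quantization step of 1/64 pixel, which is the precision the format allows.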

Warren Weckesser
  • Thank you Warren, this works like magic. Could you explain in more detail what .shape[:2] + (1,) does? – CogT Aug 07 '19 at 19:36
  • `sf16.shape` is a tuple with the value `(375, 1242, 2)`, so `sf16.shape[:2]` is the tuple `(375, 1242)`, and then `sf16.shape[:2] + (1,)` is the tuple `(375, 1242, 1)`. That is the desired shape of the array of 1s that is appended to `sf16` to make the 3-d result with shape `(375, 1242, 3)`. – Warren Weckesser Aug 07 '19 at 23:47
  • Thanks again Warren. PS: Do you have a solution for resizing this 48-bit flow to 256 x 256 pixels while keeping its 48 bits, i.e. without dropping to, say, 24 bits? – CogT Aug 09 '19 at 23:41
  • For resizing, you can look into Pillow's [`resize`](https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.resize) method, OpenCV's `resize` function, or the [`zoom`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.zoom.html) function in `scipy.ndimage`. If you have trouble figuring out how to do it, ask a new StackOverflow question. It is not something to be worked out here in the comments. – Warren Weckesser Aug 10 '19 at 01:00
  • https://stackoverflow.com/questions/57440861/resizing-a-48bit-png-retaining-its-48bits-without-dropping-it-to-a-24bit-file – CogT Aug 10 '19 at 09:17
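
The tuple arithmetic explained in the comments can be checked directly. This is a small sketch that uses a zero-filled array as a stand-in for `sf16`; only the shapes matter here.

```python
import numpy as np

# Zero-filled stand-in for sf16 (the real array holds the encoded u/v flow).
sf16 = np.zeros((375, 1242, 2), dtype=np.uint16)

# Slicing a shape tuple gives a tuple; "+" concatenates tuples.
ones_shape = sf16.shape[:2] + (1,)

# Build the validity channel and append it along the last axis.
ones = np.ones(ones_shape, dtype=sf16.dtype)
imgdata = np.concatenate((sf16, ones), axis=2)

print(ones_shape)
print(imgdata.shape)
```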