I am working on a project to analyze the rotation of cells. Each frame is stored in a .dat file that contains an image of a cell. I have to convert each file to TIFF and then concatenate all the images into an AVI video. The problem is that I couldn't find any documentation about the .dat and TIFF formats, and when I try to open a .dat file using Python I get this error message:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9f in position 1: invalid start byte

The camera is an Andor Neo 5.5 sCMOS (https://andor.oxinst.com/assets/uploads/products/andor/documents/andor-neo-scmos-specifications.pdf)

This is an example of a .dat file:

https://drive.google.com/file/d/180VuU7XO9suUK0v8G1mlQu_ZGRdUHD0z/view?usp=sharing

The color mode is grayscale.

Characteristics of the camera

  • The first thing is to get the make/model of the camera/device that created the image so you can find its dimensions and whether it's greyscale or colour. The next is to find the size of one of your `.dat` files in bytes. Then try to deduce how it is laid out. Click [edit] and update your question when you have details. – Mark Setchell Mar 07 '22 at 17:40
  • If it is an unknown file with data, then you should first open it in `bytes mode` - `open( ..., "rb")` - but you probably opened it in the standard `text mode`, which tries to convert bytes to Unicode. – furas Mar 07 '22 at 21:26
  • Thank you for your answers. @MarkSetchell I have updated my question. I hope that makes things clearer. Thanks again for your help. – Ata Krichene Mar 08 '22 at 18:21
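
To illustrate the comment about binary mode, here is a minimal sketch. It assumes nothing about the .dat layout; the helper name `read_raw` and the four demo bytes are made up for illustration. Reading with "rb" returns raw bytes, so no UTF-8 decoding is attempted and no UnicodeDecodeError can occur at this stage.

```python
import os
import tempfile


def read_raw(path):
    """Return the full contents of `path` as bytes, without text decoding."""
    with open(path, "rb") as fh:  # "rb" = read, binary
        return fh.read()


# Demo on a throwaway file whose second byte (0x9f) is not a valid UTF-8 start byte
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.dat")  # hypothetical file name
    with open(demo_path, "wb") as fh:
        fh.write(b"\x00\x9f\x7e\x00")
    raw = read_raw(demo_path)

print(len(raw), raw.hex())
```

Calling `raw.decode("utf-8")` on these bytes fails in the same way as opening the file in text mode, which is why the bytes have to be interpreted numerically instead.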

1 Answer

According to the Andor SDK documentation and some reverse engineering, the 38.dat file starts with a 4-byte integer specifying the length of the data that follows, followed by the Mono12Packed-encoded frame data and some additional metadata, all in little-endian byte order.
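
The layout described above can be checked with a few lines of standard-library Python. This is a sketch under assumptions: the helper name `read_dat_header` is hypothetical, the header is read as an unsigned integer (the signedness is not stated here), and the demo runs on a small synthetic file rather than a real camera frame.

```python
import os
import struct
import tempfile


def read_dat_header(path):
    """Return (declared_length, bytes_after_header) for one .dat file."""
    with open(path, "rb") as fh:
        # "<I" = little-endian, unsigned 32-bit integer (unsigned is an assumption)
        (declared,) = struct.unpack("<I", fh.read(4))
    return declared, os.path.getsize(path) - 4


# Synthetic demo file: 4-byte length header followed by a 12-byte payload
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.dat")  # hypothetical file name
    payload = bytes(range(12))
    with open(demo_path, "wb") as fh:
        fh.write(struct.pack("<I", len(payload)) + payload)
    declared, remaining = read_dat_header(demo_path)

print(declared, remaining)
```

If the two numbers agree for a real file, the header interpretation is at least consistent with the file size.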

The frame data from a sequence of .DAT files can be decoded and written to a multi-page TIFF file without precision loss. Some metadata from the DAT files is lost:

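import glob

import numpy
import tifffile

# Frame dimensions in pixels (full sensor resolution assumed)
width = 2560
height = 2160

# glob returns files in arbitrary order; sort them so the frames are written
# in a reproducible order (use a numeric sort key if names are not zero-padded)
datfiles = sorted(glob.glob('*.dat'))

with tifffile.TiffWriter('datfiles.tif', bigtiff=False) as tif:
    for datfile in datfiles:
        # Read the packed bytes after the header and widen them to 16 bit
        data = numpy.fromfile(
            datfile,
            count=width * height * 3 // 2,  # 12 bit packed
            offset=4,  # 4 byte integer header
            dtype=numpy.uint8,
        ).astype(numpy.uint16)
        # Every 3 bytes hold 2 pixels: bytes 0 and 2 carry the upper 8 bits,
        # byte 1 carries the lower 4 bits of both pixels
        image = numpy.zeros(width * height, numpy.uint16)
        image[0::2] = (data[1::3] & 15) | (data[0::3] << 4)
        image[1::2] = (data[1::3] >> 4) | (data[2::3] << 4)
        image.shape = height, width
        # Append the frame as one uncompressed grayscale page
        tif.write(
            image, photometric='minisblack', compression=None, metadata=None
        )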
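
The bit manipulation can be sanity-checked in isolation. The sketch below wraps the same unpacking expressions in a function and adds an inverse that exists only to build test data; both function names are hypothetical and not part of any library, and the pixel values are made up.

```python
import numpy


def unpack_mono12packed(packed):
    """Unpack uint8 Mono12Packed bytes (3 bytes per 2 pixels) into uint16 pixels."""
    data = numpy.asarray(packed, dtype=numpy.uint8).astype(numpy.uint16)
    pixels = numpy.zeros(data.size * 2 // 3, numpy.uint16)
    pixels[0::2] = (data[1::3] & 15) | (data[0::3] << 4)
    pixels[1::2] = (data[1::3] >> 4) | (data[2::3] << 4)
    return pixels


def pack_mono12packed(pixels):
    """Inverse of unpack_mono12packed; expects an even number of 12-bit values."""
    pixels = numpy.asarray(pixels, dtype=numpy.uint16)
    even, odd = pixels[0::2], pixels[1::2]
    packed = numpy.zeros(pixels.size * 3 // 2, numpy.uint8)
    packed[0::3] = (even >> 4).astype(numpy.uint8)
    packed[1::3] = ((even & 15) | ((odd & 15) << 4)).astype(numpy.uint8)
    packed[2::3] = (odd >> 4).astype(numpy.uint8)
    return packed


# Round trip on a few made-up 12-bit values
original = numpy.array([0, 4095, 0x123, 0xABC], numpy.uint16)
restored = unpack_mono12packed(pack_mono12packed(original))
print(numpy.array_equal(original, restored))
```

A successful round trip only shows that the two functions are consistent with each other; whether the nibble order matches the camera output still has to be judged from a decoded image.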

Depending on the total size of the frame data, use compression and/or the BigTIFF format. When saving the frame data as AVI, some data will be lost.
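
As a rough illustration of both points, the sketch below estimates whether the raw frame data exceeds the 4 GiB addressable with the 32-bit offsets of classic TIFF, and shows one possible reduction to 8 bits per pixel, a depth commonly used by video codecs. The helper names and the choice to drop the 4 least significant bits are assumptions; the estimate ignores TIFF overhead, so leave some margin.

```python
import numpy

CLASSIC_TIFF_LIMIT = 2**32  # classic TIFF uses 32-bit offsets (4 GiB)


def needs_bigtiff(n_frames, width=2560, height=2160, bytes_per_pixel=2):
    """Estimate whether uncompressed frame data alone exceeds the classic TIFF limit."""
    return n_frames * width * height * bytes_per_pixel >= CLASSIC_TIFF_LIMIT


def to_uint8(image12):
    """Reduce 12-bit data stored as uint16 to 8 bit by dropping the 4 lowest bits."""
    return (numpy.asarray(image12, numpy.uint16) >> 4).astype(numpy.uint8)


print(needs_bigtiff(300), needs_bigtiff(400))
print(to_uint8([0, 15, 16, 4095]).tolist())
```

In the script above, the estimate could be passed as `bigtiff=needs_bigtiff(len(datfiles))`. The second print shows the precision loss: the inputs 0 and 15 both map to 0 after the shift.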

cgohlke
  • Top sleuthing! I’ll delete my *best guess* answer now you’ve done the work. – Mark Setchell Mar 09 '22 at 20:11
  • Thank you very much, this was very useful. I have a question about the reverse engineering: how did you know that the file contains a 4-byte integer, and how did you identify the encoding? – Ata Krichene Mar 10 '22 at 08:46
  • The Mono12Packed encoding is described on page 35 of the SDK documentation. The file is not large enough to hold unpacked Mono12 data. Using offsets != 4 yields obviously wrong or shifted images. A hex editor reveals that the first 4 bytes hold an integer equal to the file size minus 4 bytes. – cgohlke Mar 10 '22 at 16:54
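
The size argument in the last comment can be reproduced with plain arithmetic. This sketch assumes a full 2560 x 2160 frame and that unpacked Mono12 would occupy 2 bytes per pixel; the helper name `can_hold_unpacked` is hypothetical.

```python
width, height = 2560, 2160
pixels = width * height

packed_bytes = pixels * 12 // 8  # 12 bits per pixel, two pixels per 3 bytes
unpacked_bytes = pixels * 2      # assumed: 16-bit storage per pixel


def can_hold_unpacked(file_size, header_bytes=4):
    """True if a file of `file_size` bytes could hold an unpacked 16-bit frame."""
    return file_size - header_bytes >= unpacked_bytes


print(pixels, packed_bytes, unpacked_bytes)
```

Comparing `os.path.getsize()` of a real .dat file against these two numbers indicates which encoding is plausible.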