I have a large 40 MB .dat file (about 173,397 lines) filled with binary data (random symbols). It is an astronomical photograph. I need to read and display it with Python. I am working with the binary file because I will need to extract pixel values from specific regions of the image, but for now I just need to read it into Python, something like the READU procedure in IDL. I tried numpy and matplotlib, but nothing worked. Any suggestions?

mgalloy
John
  • 1
    Well to start I'd suggest going into more detail. For instance, what did you try in numpy and matplotlib, and why didn't it work? Can you show the code that didn't work as well as the stacktrace of any errors you ran into? What else have you tried, if anything? And what do you mean by "display"? Just display it on the screen in any way possible? – Random Davis Dec 04 '18 at 17:54
  • 1
    I think regular old Python is capable of reading 40 MB worth of data into a variable: `with open("mydata.dat", "rb") as file: data = file.read()`. So "ingesting" is easy. _Interpreting_ that data may be hard. Do you know what file format it is in? – Kevin Dec 04 '18 at 17:58
  • @Kevin Yes I meant interpreting and visualizing. What do you mean by format? It is a .dat file. – John Dec 04 '18 at 18:17
  • Files with the .dat extension do not have a single well-documented file format. It's basically the "grab bag" extension name for data. If you're looking for a library that can take a .dat file and find e.g. the pixel value at column 23, row 42, no such library exists, because any two .dat files may store that value in completely different positions and encodings. I'm asking if you know what file format it is in, because if you don't, then nobody does. – Kevin Dec 04 '18 at 18:43
  • @Kevin So it is an unformatted binary file. I guess that would be really challenging to manipulate with Python? However, we have .xml files with all the header info, which I probably need to use. – John Dec 05 '18 at 12:25
  • Python is perfectly capable of reading binary data. That's exactly what the code in my first comment is doing. I'm not saying Python isn't up to the task. I'm saying _no_ programming language is up to the task if the only thing you know about the file is its name. – Kevin Dec 05 '18 at 20:51
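
To make the point in the comments concrete, here is a minimal standard-library sketch of reading a headerless raw file once the layout is known. The width, height, 32-bit float type and little-endian byte order are all assumptions made for the demo, and `read_raw_image` is a hypothetical helper name; the real values would have to come from the header information. The size check is useful on its own: if the file size does not equal width × height × bytes per pixel, the assumed layout is wrong.

```python
import os
import struct
import tempfile


def read_raw_image(path, width, height, fmt="f", byteorder="<"):
    """Read a headerless raw image as a list of rows.

    fmt is a struct type code ("f" = 32-bit float, "H" = unsigned 16-bit int, ...).
    All layout parameters are assumptions the caller must supply.
    """
    itemsize = struct.calcsize(byteorder + fmt)
    expected = width * height * itemsize
    actual = os.path.getsize(path)
    if actual != expected:
        # A mismatch means the assumed type or dimensions are wrong,
        # or the file carries a header that has to be skipped first.
        raise ValueError(f"{path}: expected {expected} bytes, found {actual}")
    with open(path, "rb") as fh:
        raw = fh.read()
    flat = struct.unpack(f"{byteorder}{width * height}{fmt}", raw)
    # Split the flat sequence into rows, assuming row-by-row storage.
    return [list(flat[r * width:(r + 1) * width]) for r in range(height)]


# Demo on a synthetic 3x2 image written to a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.dat")
with open(demo_path, "wb") as fh:
    fh.write(struct.pack("<6f", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0))

rows = read_raw_image(demo_path, width=3, height=2)
print(rows[1][2])  # pixel at row 1, column 2
```

For a 40 MB image a plain list of Python floats is slow and memory hungry, so this is only meant to show the logic; an array library does the same thing far more efficiently.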

1 Answer

You need to know the data type and dimensions of the binary file. For example, if the file contains float data, use numpy.fromfile like:

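    import numpy as np
    # filename is the path to the .dat file; dtype must match how the file
    # was written (plain float means 64-bit floats in NumPy)
    data = np.fromfile(filename, dtype=float)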

Then reshape the array to the dimensions of the image, dims, using numpy.reshape (the equivalent of REFORM in IDL):

    # dims is the image shape, e.g. (rows, columns); its product must equal data.size
    im = np.reshape(data, dims)
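
Putting the two steps together, a minimal end-to-end sketch might look like the following. The image size, the little-endian 32-bit float type and the temporary file are assumptions made so the example runs on its own; for the real file they would come from the header information. One thing to watch when porting from IDL: IDL lists the fastest-varying dimension first, so an IDL array declared as `[ncols, nrows]` corresponds to a NumPy shape of `(nrows, ncols)`.

```python
import os
import tempfile

import numpy as np

# Assumed layout, for illustration only.
nrows, ncols = 4, 6
pixel_dtype = np.dtype("<f4")  # little-endian 32-bit float

# Write a synthetic raw image so the example is self-contained.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_image.dat")
np.arange(nrows * ncols, dtype=pixel_dtype).tofile(demo_path)

# Read the raw values back as a flat 1-D array.
data = np.fromfile(demo_path, dtype=pixel_dtype)

# Check the element count before reshaping; a mismatch points to a
# wrong dtype, wrong dimensions, or a header that was not skipped.
if data.size != nrows * ncols:
    raise ValueError(f"expected {nrows * ncols} values, found {data.size}")

im = data.reshape((nrows, ncols))

# Extract pixel values from a specific region: rows 1-2, columns 2-4.
region = im[1:3, 2:5]
print(region.shape, region.mean())
```

Once `im` has the right shape, it can be displayed with `matplotlib.pyplot.imshow(im, cmap="gray", origin="lower")`. If the picture looks like noise or is sheared diagonally, the dtype, byte order, or dimensions are probably wrong.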
mgalloy