
I have a .dat file that looks like this when opened as text:

Example: АqMA ЅA Ђ‰ї HB HB MA @ЅA Е€ї HB HB ЂXLA ЂЅA U­‡ї HB HB АFA U5ЅA Е€ї HB HB @ю@A ЅA ё€ї HB HB [VA ЅA ±“‡ї HB HB @3MA ЅA U=‰ї HB HB А KA «JЅA Ђ‰ї HB HB ЂdJA ;1ЅA р‡ї HB HB АZA «jЅA `†ї HB HB АFA ±ЅA Uе†ї HB HB А¬XA ЅA bЗ€ї HB HB АHA OlЅA «2€ї HB HB А5WA UЅA vB‰ї HB HB АN>A ЅA Uu€ї HB HB >FA UuЅA Ы†ї HB HB А^A ±ЅA «2€ї HB HB А”\A UuЅA OL€ї HB HB ЂГOA OlЅA р‡ї HB HB @аEA UЅA Ђ‰ї HB HB @QHA OlЅA р‡ї HB HB АeOA ЅA vB‰ї HB HB цQA ЕNЅA Ђ‰ї HB HB @QHA ЂЅA Ђ‰ї HB HB ,IA ЂЅA Ђ‰ї HB HB @эUA «*ЅA Ђ‰ї HB HB DEA ЕNЅA Ђ‰ї HB HB ЂҐTA ЂЅA Ђ‰ї HB HB АоMA ЂЅA «ъ€ї HB HB @ TA ЂЅA Ђ‰ї HB HB А0CA ЂЅA Ђ‰ї HB HB @KIA ЂЅA Ђ‰ї HB HB ЂXA ЂЅA Ђ‰ї HB HB АYQA «jЅA Ђ‰ї HB HB @жDA ;1ЅA Ђ‰ї HB HB IYA ЂЅA Ђ‰ї HB HB @oCA ЂЅA Ђ‰ї HB HB ЂFOA ЂЅA Ђ‰ї HB HB ТWA Щ ЅA Ђ‰ї HB HB OA UuЅA Ђ‰ї HB HB @WGA etc...

It should be a matrix of shape (..., 10) containing valid float numbers, but I don't know how to read it. I tried to identify what kind of file format this is, but I couldn't find anything similar :(

Please help.

P.S. In MATLAB it magically works fine, but I can't reproduce it in Python:

FileID = fopen('sample.dat' ,'r');
Data = fread(FileID,[10,inf],'float')';

[Image: the data as displayed in MATLAB]

Kovgan
  • It's just binary data, and interpreting it as text is fairly meaningless. You should look into Python's [`struct`](https://docs.python.org/3/library/struct.html) library to unpack raw binary data into values, as well as [`numpy`](https://numpy.org/doc/stable/reference/generated/numpy.frombuffer.html) and [`array`](https://docs.python.org/3/library/array.html) – Aaron Jan 13 '22 at 05:01
  • Your picture doesn't open. – AI - 2821 Jan 13 '22 at 05:01
  • @Aaron , It shows up now, it was due to a DNS misconfiguration. – AI - 2821 Jan 13 '22 at 05:06
  • How was the .dat file created? If it was created with MATLAB (using .dat instead of .mat), you could use `from scipy.io import loadmat; data = loadmat('sample.dat')` – Niko Föhr Jan 13 '22 at 05:07
  • @np8 the [MATLAB example shows how the file can be interpreted](https://www.mathworks.com/help/matlab/ref/fread.html)... it's just packed binary floats reshaped into 10 columns by however many rows. – Aaron Jan 13 '22 at 05:09

1 Answer

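The data is simply a sequence of floats stored as raw binary rather than as text. NumPy is probably the fastest and easiest way to read it. Note that MATLAB's 'float' is a 32-bit float, so the dtype must be float32 (Python's built-in float would be read as 64-bit):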

numpy.fromfile(filepath, dtype=numpy.float32).reshape([-1, 10])
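To check the approach without the original file, here is a minimal sketch that writes a synthetic float32 file and reads it back. The file name and the 3x10 test data are made up for illustration; the MATLAB equivalence rests on `fread(..., [10, inf], 'float')'` filling a 10-row matrix column by column and then transposing it, which gives the same layout as a row-major `reshape(-1, 10)`.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for sample.dat: 3 records of 10 float32 values each.
expected = np.arange(30, dtype=np.float32).reshape(3, 10)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.dat")  # hypothetical file name
    expected.tofile(path)  # raw bytes only, no header

    # Counterpart of MATLAB's fread(FileID, [10, inf], 'float')'
    data = np.fromfile(path, dtype=np.float32).reshape(-1, 10)

    # Reading the same bytes as 64-bit floats fuses every two values
    # into one meaningless number, so only half as many come back.
    wrong = np.fromfile(path, dtype=np.float64)

print(data.shape)  # (3, 10)
print(wrong.size)  # 15
```

If the numbers read from a real file look wildly off, a wrong dtype width like the `float64` case above is one of the first things worth ruling out.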

It may be a good exercise to learn how binary data is represented by using the struct library. Here's a quick example to try and read through:

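import struct

with open(filepath, 'rb') as datafile:
    raw = datafile.read()  # iter_unpack needs a bytes-like object, not a file

my_array = []
# 'f' = 32-bit float in native byte order; each unpacked item is a 1-tuple
for i, (value,) in enumerate(struct.iter_unpack('f', raw)):
    if i % 10 == 0:  # start a new row every 10 values
        my_array.append([])
    my_array[-1].append(value)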
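To see why the file looks like gibberish in a text editor, it helps to pack a few floats by hand. This is only a sketch: it assumes little-endian byte order, and it assumes that the repeated "HB" in the question's dump is preceded by two NUL bytes that the text view did not display.

```python
import struct

# A 32-bit float occupies exactly 4 bytes.
packed = struct.pack('<f', 50.0)  # '<' = little-endian, 'f' = float32
print(packed)                     # b'\x00\x00HB'

# Assumption: "HB" in the dump is the visible half of these 4 bytes.
(value,) = struct.unpack('<f', b'\x00\x00HB')
print(value)                      # 50.0

# Several values back to back, as they would sit in the file.
raw = struct.pack('<3f', 1.0, 2.5, -0.5)
values = [v for (v,) in struct.iter_unpack('<f', raw)]
print(values)                     # [1.0, 2.5, -0.5]
```

Under that assumption, the "HB HB" pairs in the dump would simply be two columns holding the value 50.0.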
Aaron
  • Thank you, this got me something similar, but it's still far off: the numbers are not what they should be. I'll try the struct library as you suggested. – Kovgan Jan 13 '22 at 05:22
  • Try playing around with the data type, byte endianness, and starting offset. Once those match, there should be no difference from the MATLAB result. Also check that the data isn't just transposed... iirc MATLAB is column-first and numpy is row-first – Aaron Jan 13 '22 at 05:23
  • `import numpy as np; data = np.fromfile('sample3.dat', dtype='float32').reshape([-1, 10])` – Kovgan Jan 13 '22 at 07:41
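The knobs mentioned in the comments above (data type, byte order, starting offset, transposition) can be sketched as follows. Everything about the test file here is hypothetical: the 4-byte header and the big-endian layout are invented to show the parameters and are not properties of the asker's file. The `offset` argument of `np.fromfile` requires NumPy 1.17 or newer.

```python
import os
import tempfile

import numpy as np

values = np.arange(20, dtype=np.float32)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "with_header.dat")  # hypothetical file
    with open(path, "wb") as f:
        f.write(b"HDR!")                         # assumed 4-byte header
        f.write(values.astype(">f4").tobytes())  # big-endian float32

    # Explicit byte order ('>' big, '<' little) and offset in bytes.
    data = np.fromfile(path, dtype=">f4", offset=4).reshape(-1, 10)

    # Wrong byte order: same bytes, different numbers.
    swapped = np.fromfile(path, dtype="<f4", offset=4)

# MATLAB fills [10, inf] column by column and the trailing ' transposes;
# in NumPy that is order="F" followed by .T, which equals reshape(-1, 10).
matlab_style = data.ravel().reshape((10, -1), order="F").T
```

In the asker's case none of this turned out to be necessary: plain `float32` with `reshape([-1, 10])` matched the MATLAB output.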