Short Answer
Your Python program outputs 64-bit integers, not the 32-bit integers your C program is trying to read.
You can change the following line of code:
b = np.random.randint(100, size=(5,7), dtype=np.int32)
Now you will see 32-bit integers in the output file.
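A quick way to confirm what size numpy will write, sketched with the array from the question, is to check the array's dtype and itemsize:

```python
import numpy as np

# Without an explicit dtype, randint uses the platform default integer,
# which is 8 bytes on most 64-bit Linux/macOS systems
a = np.random.randint(100, size=(5, 7))
print(a.dtype, a.itemsize)

# Requesting np.int32 gives 4-byte elements, matching what the C side expects
b = np.random.randint(100, size=(5, 7), dtype=np.int32)
print(b.dtype, b.itemsize)  # int32 4
```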
How to Tell What Your Python Code Outputs
You can tell that your Python code dumps 64-bit integers from the following analysis of a hexdump
of your output file. Of course, you can examine the binary data file with any hex editor application.
$ hexdump random_from_python_int.dat
0000000 09 00 00 00 00 00 00 00 0f 00 00 00 00 00 00 00
0000010 40 00 00 00 00 00 00 00 1c 00 00 00 00 00 00 00
0000020 59 00 00 00 00 00 00 00 5d 00 00 00 00 00 00 00
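The write step itself is not shown in the question, so as an assumption, here is a minimal sketch that reproduces this kind of file content; tobytes() is used to inspect the same raw bytes that ndarray.tofile would write, without touching the filesystem:

```python
import numpy as np

# Assumption: the question's file was produced with ndarray.tofile, which
# writes the raw element bytes with no header or shape information.
# np.int64 is forced here to match the 8-bytes-per-value layout shown above.
b = np.random.randint(100, size=(5, 7), dtype=np.int64)
raw = b.tobytes()  # the same bytes tofile() would write

# Show the first 16 bytes hexdump-style
print(" ".join(f"{byte:02x}" for byte in raw[:16]))
```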
As @ndim points out in his answer, two's complement integer representation consists of three major elements: [storage] size, endianness and signedness. I will not repeat information which he provides in his answer except to show how to deduce those from the above output which was what I started to do in my original answer.
In your case of multi-dimensional arrays, you may also need to know the order of elements in linear storage.
Deducing Integer Storage Size
Since you indirectly specify a maximum (exclusive) random value of decimal 100 to np.random.randint(), your values will fall in the decimal range [0, 100), or [0x0, 0x64) in hexadecimal, all of which can be represented in a single "hex byte". Note that none of the non-00 hex bytes in the above hexdump output falls outside this range. As you can see, a total of 8 bytes is used to represent each integer value (1 non-00 byte and 7 00 bytes, given the range of numbers in this case).
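The per-value layout can be sketched for a single number; the chosen value here is illustrative:

```python
import numpy as np

# The largest possible sample, 99, is 0x63, so every value fits in one byte
assert 99 == 0x63

# Stored as a 64-bit integer, a small value is one significant byte padded
# with seven 00 bytes (the padding trails it on a little-endian machine)
value_bytes = np.int64(9).tobytes()
print(value_bytes.hex(" "))
```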
Deducing Endianness
Furthermore, you can now also deduce the endianness of the integer representation, which is little endian in this case: the least significant byte (LSB) of each value comes first in linear storage. (Note that LSB is also commonly used for the least significant bit; here it is the byte ordering that matters.)
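This deduction can be sketched with Python's int.from_bytes, decoding the same 8 bytes under both byte orders:

```python
# The first 8 bytes of the hexdump above
chunk = bytes([0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00])

# Little-endian decoding yields the plausible small value 9;
# big-endian decoding yields an implausibly huge number
little = int.from_bytes(chunk, byteorder="little")
big = int.from_bytes(chunk, byteorder="big")
print(little, big)  # 9 648518346341351424
```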
Deducing Signedness
In this case, you cannot deduce signedness, because there are no negative values in your sample. If there were, in two's complement representation you would see a value of 1 in the sign bit. I won't delve into the details of two's complement negative integer representation, which would be off-topic for this question.
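For illustration only (the question's data has no negatives), here is what a negative 64-bit value looks like in two's complement and how the assumed signedness changes the interpretation:

```python
import numpy as np

# -1 in 64-bit two's complement is all ff bytes; the sign bit is 1
neg = np.int64(-1).tobytes()
print(neg.hex(" "))  # ff ff ff ff ff ff ff ff

# The same bytes decode differently depending on assumed signedness
print(int.from_bytes(neg, "little", signed=True))   # -1
print(int.from_bytes(neg, "little", signed=False))  # 18446744073709551615
```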
Deducing Multi-dimensional Array Order
An examination of the first two 8-byte, little-endian integers in the above output, starting at file offsets (0x) 0000000 and 0000008 (the latter is not labeled), shows the hexadecimal values 0x00000000 00000009 and 0x00000000 0000000f, which are the decimal values 9 and 15 respectively. The decimal value 9 would be the first value in either row-major or column-major order, but the second decimal value in linear storage being 15 indicates row-major ordering, as the row elements are in contiguous storage.
The hexadecimal value of the third integer, located at file offset (0x) 0000010, is 0x00000000 00000040, which is the decimal value 64. This is the third value of your expected output in row-major order. For completeness, column-major order would have output the decimal value 8 as the second integer in linear storage.
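The same deduction can be sketched in numpy, which uses row-major (C) order by default; the values below are the first few from the output above:

```python
import numpy as np

a = np.array([[9, 15, 64], [8, 73, 0]])

# Row-major (C) order stores each row contiguously, matching the file
print(a.ravel(order="C"))  # row-major: 9, 15, 64, 8, 73, 0

# Column-major (Fortran) order would have put 8 second
print(a.ravel(order="F"))  # column-major: 9, 8, 15, 73, 64, 0
```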
How to Make Numpy Dump 32-bit Numbers in Your Python Code
To make your code dump 32-bit numbers, which is a common implementation size for int (although the size is "implementation defined" in the C standard, which only specifies a minimum range that int must represent), you can change the following line of code:
b = np.random.randint(100, size=(5,7), dtype=np.int32)
Now you will see 32-bit integers in the output file.
$ hexdump random_from_python_int.dat
0000000 09 00 00 00 0f 00 00 00 40 00 00 00 1c 00 00 00
0000010 59 00 00 00 5d 00 00 00 1d 00 00 00 08 00 00 00
0000020 49 00 00 00 00 00 00 00 28 00 00 00 24 00 00 00
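A quick size check, sketched under the assumption of a 5x7 array of 32-bit integers: the output should be exactly 5 * 7 * 4 = 140 bytes:

```python
import numpy as np

b = np.random.randint(100, size=(5, 7), dtype=np.int32)
raw = b.tobytes()  # the same bytes that tofile() would write

print(len(raw))  # 140 = 5 * 7 * 4
```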
NOTE: The actual storage size (precision) of C int variables is "implementation defined", which means you may need to adjust the numpy array's integer storage size before output for maximum compatibility with C. See @ndim's excellent answer, which provides more detail on this.
Changes to Your C Code
Your C code must be updated to reflect the change in data types for the two-dimensional array: in your code, double randn[5][7] should become int randn[5][7]. You could also use int32_t, as @ndim pointed out (be sure to #include <stdint.h> for it), but your compiler may issue an error and suggest the data type __int32_t (which is a typedef for int on my system). After making that change and compiling, I get the following output:
9 15 64 28 89 93 29
8 73 0 40 36 16 11
54 88 62 33 72 78 49
51 54 77 69 13 25 13
92 86 30 30 89 12 65
UPDATE (See also UPDATE #2)
Per @ndim's comment below, you can also use np.intc as shown here. This is likely the best option unless you are targeting a specific storage size for the integer representation.
b = np.random.randint(100, size=(5,7), dtype=np.intc)
I tested this, and it also produces 32-bit integers.
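A quick check that np.intc tracks the C compiler's int (4 bytes on the common platforms where int is 32 bits):

```python
import numpy as np

# np.intc is defined to match the C int type of the compiler numpy was built with
print(np.dtype(np.intc).itemsize)

b = np.random.randint(100, size=(5, 7), dtype=np.intc)
print(b.itemsize)  # matches np.dtype(np.intc).itemsize
```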
UPDATE #2
I totally agree with @ndim that specifying the integer size is best for maximizing compatibility. The principle of "least surprise" applies here.