I am reading binary files in R and need to read 10 bytes, which must be interpreted as 4 bit unsigned integers (2 per byte, so 20 values in range 0..15 I guess).

From my understanding of the docs, this cannot be done with readBin directly, because the smallest size it can read, 1, means 1 byte.

So I think I need to read the data as 1 byte integers and use bit-wise operations to get the 4 bit integers. I found out that the values are stored as 32 bit integers internally by R, and I found this explanation on SO that seems to describe what I want to do. So here is my attempt at an R function that follows the advice:

#' @title Interpret bits start_index to stop_index of input int8 as unsigned integer.
uint8bits <- function(int8, start_index, stop_index) {
  num_bits = stop_index - start_index + 1L;
  bitmask = bitwShiftL((bitwShiftL(1L, num_bits) -1L), stop_index);
  return(bitwShiftR(bitwAnd(int8, bitmask), start_index));
}

However, it does not work as intended. For example, to get the two numbers out of a value read from the file (255 in this example), I call the function once to extract bits 1 to 4 and once more for bits 5 to 8:

value1 = uint8bits(255L, 1, 4); # I would expect 15, but the output is 120.
value2 = uint8bits(255L, 5, 8); # I would expect 15, but the output is 0.

What am I doing wrong?

spirit
  • How are your binary files stored? Are you able to demonstrate what you want with another language? What does `readBin()` return? – Hugh Oct 25 '20 at 09:36
  • They are stored as little endian, if that is what you mean. I think it is demonstrated for C in the linked thread. The [documentation for readBin is here](https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/readBin), and when I call it as `val = readBin(myfilehandle, integer(), n=1, size=1, endian='little', signed = FALSE)`, val is an integer value in range 0..255. But R uses 32-bit integers internally, so it is **not** stored as an 8-bit integer. – spirit Oct 25 '20 at 09:52

1 Answer

We can use the packBits function to achieve your expected behaviour:

uint8.to.uint4 <- function(int8,start_index,stop_index)
{
  bits <- intToBits(int8)
  out <- packBits(c(bits[start_index:stop_index],
             rep(as.raw(0),32-(stop_index-start_index+1))),type="integer")
  return(out)
}

uint8.to.uint4(255L,1,4)
[1] 15

We first convert the integer to a bit vector with intToBits, then extract the bits you want and pad them with zero bits up to 32, the length R uses to store integers internally. packBits then converts the padded bit vector back to an integer.
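If you would rather stay with bit-wise operations, here is a minimal sketch of how the masking could be done. The output of 120 in the question is consistent with the mask being shifted left by stop_index and the result shifted right by start_index; with 1-based bit positions, both shifts would be start_index - 1 instead. The names extract_bits and split_nibbles are made up for this example, and the nibble order (low nibble first within each byte) is an assumption that depends on your file format.

```r
# Extract bits start_index..stop_index of x as an unsigned integer.
# Bit positions are 1-based, with bit 1 the least significant bit.
extract_bits <- function(x, start_index, stop_index) {
  num_bits <- stop_index - start_index + 1L
  shift <- start_index - 1L
  # num_bits ones, moved up to the position of the first requested bit
  mask <- bitwShiftL(bitwShiftL(1L, num_bits) - 1L, shift)
  # keep only the requested bits, then move them down to bit 1
  bitwShiftR(bitwAnd(x, mask), shift)
}

# Split a vector of byte values (0..255) into 4-bit values, two per byte.
# Assumption: the low nibble comes first; swap the rbind() arguments if not.
split_nibbles <- function(bytes) {
  low <- bitwAnd(bytes, 15L)      # bits 1 to 4
  high <- bitwShiftR(bytes, 4L)   # bits 5 to 8
  as.vector(rbind(low, high))     # interleave: low1, high1, low2, high2, ...
}

extract_bits(255L, 1L, 4L)  # 15
extract_bits(255L, 5L, 8L)  # 15

# Stand-in for file data: readBin also accepts a raw vector as input.
raw_bytes <- as.raw(c(0x1f, 0xa0))
bytes <- readBin(raw_bytes, integer(), n = 2, size = 1, signed = FALSE)
split_nibbles(bytes)        # 15 1 0 10
```

For the 10 bytes in the question, reading with n = 10 and passing the result to split_nibbles would give the 20 values in one vectorized step.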

Julian_Hn