I have been trying to parse binary data that I get from an API in R, but the values are not being parsed and converted correctly.

Here is a sample of the binary:

00 00 00 01 00 04 53 42 55 58 00 00 00 25 c8 42 9b cc cd 42 9c 8a 3d 42 9b b8 52 42 9c 23 d7 44 bd 5e 14 00 00 01 43 53 5c 62 40

The results should be:

SBUX    77.9    78.27   77.86   78.07   1153261076  1/2/2014 9:30

Here is the sample code I used, with the correct data types and sizes:

readBin(file2read[1:4],integer(),n=1, size=4) #Symbol Count
readBin(file2read[5:6],integer(),n=1,size=2) #Symbol length
readBin(file2read[7:10],character(),n=4) #Symbol = SBUX
readBin(file2read[11],integer(),n=1,size=1) #Error code
readBin(file2read[12:15],integer(),n=4) #Bar Count
readBin(file2read[16:19],double(),n=4,size=4) #close
readBin(file2read[20:23],double(),n=1,size=4) #high
readBin(file2read[24:27],double(),n=1,size=4) #low
readBin(file2read[28:31],double(),n=1,size=4) #open
readBin(file2read[32:36],double(),n=1,size=4) #volume
readBin(file2read[37:44],character(),n=1,size=8) #timestamp

But it does not produce the target result listed above.

Brian Tompsett - 汤莱恩
Bryan Nice

1 Answer

OK. I think I have everything figured out, including the date/time, which took a little extra work. First, here is your binary data:

rr<-as.raw(c(0x00, 0x00, 0x00, 0x01, 0x00, 0x04, 0x53, 0x42, 0x55, 
0x58, 0x00, 0x00, 0x00, 0x25, 0xc8, 0x42, 0x9b, 0xcc, 0xcd, 0x42, 
0x9c, 0x8a, 0x3d, 0x42, 0x9b, 0xb8, 0x52, 0x42, 0x9c, 0x23, 0xd7, 
0x44, 0xbd, 0x5e, 0x14, 0x00, 0x00, 0x01, 0x43, 0x53, 0x5c, 0x62, 
0x40))

I'm just going to write it to a file to make it easier to read through the data with readBin. (Also, the symbol appears to be variable-length, so the indices of the values after it may differ; the file connection will keep track of which byte is next.) Here I write it to disk, then open it:

writeBin(rr,"test.bin")
zz <- file("test.bin", "rb")
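If you would rather not touch the disk, a rawConnection over the same bytes should behave the same way for sequential reads. This is only a sketch of that alternative, and the name `con` is mine rather than part of the code in this answer:

```r
# Same sample bytes as above
rr <- as.raw(c(0x00, 0x00, 0x00, 0x01, 0x00, 0x04, 0x53, 0x42, 0x55,
               0x58, 0x00, 0x00, 0x00, 0x25, 0xc8, 0x42, 0x9b, 0xcc, 0xcd, 0x42,
               0x9c, 0x8a, 0x3d, 0x42, 0x9b, 0xb8, 0x52, 0x42, 0x9c, 0x23, 0xd7,
               0x44, 0xbd, 0x5e, 0x14, 0x00, 0x00, 0x01, 0x43, 0x53, 0x5c, 0x62,
               0x40))

# A raw connection keeps track of the read position, like a file opened with "rb"
con <- rawConnection(rr)
nrec     <- readBin(con, "integer", size = 4, endian = "big")                  # symbol count
charsize <- readBin(con, "integer", size = 2, signed = FALSE, endian = "big")  # symbol length
symbol   <- readChar(con, charsize)                                            # symbol text
close(con)
```

The rest of this answer continues with the file connection zz.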

Now I read the values

(nrec<-readBin(zz, "integer", size=4, endian="big"))                   # symbol count
(charsize<-readBin(zz, "integer", size=2, signed=FALSE, endian="big")) # symbol length
(symbol<-readChar(zz, charsize))                                       # symbol, exactly charsize characters
(err<-readBin(zz, "integer", size=1, signed=FALSE))                    # error code
(bcount<-readBin(zz, "integer", size=4, endian="big"))                 # bar count
(sclose<-readBin(zz, "double", size=4, endian="big"))                  # close (4-byte float)
(shigh<-readBin(zz, "double", size=4, endian="big"))                   # high
(slow<-readBin(zz, "double", size=4, endian="big"))                    # low
(sopen<-readBin(zz, "double", size=4, endian="big"))                   # open
(svol<-readBin(zz, "integer", size=4, endian="big"))                   # volume
(sdate<-readBin(zz, "integer", size=4, n=2, endian="big"))             # timestamp as two 32-bit halves
#done
close(zz)

The bar count (bcount) wasn't in your expected output, but it appears to have a value of 9672. The date is a bit tricky: it's stored as a 64-bit integer, apparently milliseconds since 1970-01-01, and readBin doesn't seem to be able to read those directly (at least not on my machine), so I read it in as two 32-bit integers. You can convert it to a date with

 as.POSIXct(sdate[1]*2^32/1000 + sdate[2]/1000, origin="1970-01-01")
 # [1] "2014-01-02 09:30:00 EST"
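To spell out the arithmetic: the two 32-bit halves combine as high * 2^32 + low, which gives milliseconds, and dividing by 1000 gives seconds since the epoch. R integers are signed, so a low half of 2^31 or more would come back negative; the sketch below corrects for that case. The helper name `words_to_posix` and the time zone `America/New_York` are my own assumptions, not something the API documents.

```r
# Combine two big-endian 32-bit words (as returned by readBin) into a POSIXct.
# Assumes the 64-bit value is milliseconds since the Unix epoch.
words_to_posix <- function(words, tz = "UTC") {
  hi <- as.numeric(words[1])
  lo <- as.numeric(words[2])
  if (lo < 0) lo <- lo + 2^32   # undo the signed interpretation of the low word
  ms <- hi * 2^32 + lo          # exact in a double for values below 2^53
  as.POSIXct(ms / 1000, origin = "1970-01-01", tz = tz)
}

sdate <- c(323L, 1398563392L)   # 0x00000143 and 0x535c6240 from the sample
stamp <- words_to_posix(sdate, tz = "America/New_York")
format(stamp, "%Y-%m-%d %H:%M:%S %Z")
# [1] "2014-01-02 09:30:00 EST"
```

Passing an explicit tz makes the printed time independent of the machine the code runs on.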

This seems to extract the data properly. The one major gotcha was the symbol: you need readChar for it, because readBin with "character" reads C-style zero-terminated strings, so it also consumes the 0x00 byte that follows (here, the error code) and every later field ends up shifted by one byte. readChar reads exactly the number of characters you ask for. I also had to be careful to specify the endianness of the values, because "big" is not the default on my system (I ran this on a Mac).
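To see the readBin/readChar difference concretely, the sketch below reads the symbol both ways from an in-memory connection and then reads the next two fields. The helper `read_after_symbol` is hypothetical and exists only for this comparison.

```r
rr <- as.raw(c(0x00, 0x00, 0x00, 0x01, 0x00, 0x04, 0x53, 0x42, 0x55,
               0x58, 0x00, 0x00, 0x00, 0x25, 0xc8, 0x42, 0x9b, 0xcc, 0xcd, 0x42,
               0x9c, 0x8a, 0x3d, 0x42, 0x9b, 0xb8, 0x52, 0x42, 0x9c, 0x23, 0xd7,
               0x44, 0xbd, 0x5e, 0x14, 0x00, 0x00, 0x01, 0x43, 0x53, 0x5c, 0x62,
               0x40))

read_after_symbol <- function(use_readChar) {
  con <- rawConnection(rr)
  on.exit(close(con))
  readBin(con, "raw", n = 4)  # skip the symbol count
  len <- readBin(con, "integer", size = 2, signed = FALSE, endian = "big")
  # readChar stops after `len` characters; readBin reads up to and including a 0x00
  symbol <- if (use_readChar) readChar(con, len) else readBin(con, "character")
  err    <- readBin(con, "integer", size = 1, signed = FALSE)
  bcount <- readBin(con, "integer", size = 4, endian = "big")
  list(symbol = symbol, err = err, bcount = bcount)
}

str(read_after_symbol(TRUE))   # fields stay aligned
str(read_after_symbol(FALSE))  # the terminator is swallowed, so later fields shift
```

With readChar the bar count comes out as 9672. With readBin and "character" the symbol still reads as SBUX, but the bar count is assembled from the wrong four bytes.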

MrFlick
  • @thelatemail Right. If you use the array indexes, there's no internal pointer that keeps track of where you are so you can jump around. But the width of that value depends on the previous value. So it should be `readBin(rr[7:(7+strwidth-1)],"character") ` where `strwidth` is the previous integer. – MrFlick Jul 12 '14 at 05:31
  • Everything works for the binary string I pasted; however, I am having another issue with getting R to iterate through the entire binary file. When I put the code in a for loop it stops on row 247, and I am not sure of the cause. Is there some sort of limitation with readBin and file("text.bin","rb")? – Bryan Nice Jul 13 '14 at 00:46
  • @BryanNice What do you mean by "stops"? Does it throw an error? How are you tracking rows in a binary file? What's the condition on your loop to know when it's done? – MrFlick Jul 13 '14 at 00:51
  • @MrFlick I am using the bar count as my iterator limit. However, I figured out the issue. I was downloading the data directly from the API with readBin and saving it to a file. I changed it to use the download.file() method and that fixed the issue. Thanks for your help. – Bryan Nice Jul 13 '14 at 02:02
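For completeness, the index-based approach discussed in the comments can be sketched as below. It keeps a running offset so that the variable-length symbol does not throw off the later fields. The helper `take` and the variable `pos` are hypothetical names, and the sketch stops after the four prices.

```r
rr <- as.raw(c(0x00, 0x00, 0x00, 0x01, 0x00, 0x04, 0x53, 0x42, 0x55,
               0x58, 0x00, 0x00, 0x00, 0x25, 0xc8, 0x42, 0x9b, 0xcc, 0xcd, 0x42,
               0x9c, 0x8a, 0x3d, 0x42, 0x9b, 0xb8, 0x52, 0x42, 0x9c, 0x23, 0xd7,
               0x44, 0xbd, 0x5e, 0x14, 0x00, 0x00, 0x01, 0x43, 0x53, 0x5c, 0x62,
               0x40))

# Read one big-endian value of `size` bytes starting at 1-based offset `pos`
take <- function(pos, what, size, ...) {
  readBin(rr[pos:(pos + size - 1)], what, size = size, endian = "big", ...)
}

pos <- 1
nrec     <- take(pos, "integer", 4);                  pos <- pos + 4
strwidth <- take(pos, "integer", 2, signed = FALSE);  pos <- pos + 2
symbol   <- rawToChar(rr[pos:(pos + strwidth - 1)]);  pos <- pos + strwidth
err      <- take(pos, "integer", 1, signed = FALSE);  pos <- pos + 1
bcount   <- take(pos, "integer", 4);                  pos <- pos + 4

# Four consecutive 4-byte floats: close, high, low, open
prices <- readBin(rr[pos:(pos + 15)], "double", n = 4, size = 4, endian = "big")
names(prices) <- c("close", "high", "low", "open")
pos <- pos + 16
round(prices, 2)
```

Using rawToChar on an exact slice sidesteps the zero-terminator question entirely, since no string reader is involved.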