I am trying to read a few bytes from `byteData` in my C++ code, as shown below. The value in `byteData` is a binary blob (a byte array) in big-endian byte order, so I cannot simply cast the byte array to a string.

The `byteData` byte array is composed of these four parts:

First is `schemaId`, which is two bytes (short datatype in Java)
Second is `lastModifiedDate`, which is eight bytes (long datatype in Java)
Third is the length of the actual `byteArray` within `byteData`, which is four bytes (int datatype in Java)
Fourth is the actual value of that `byteArray` in `byteData`.

Now I am trying to extract these fields from `byteData` in C++. I can extract `schemaId`, but the value I get is wrong, and I am not sure how to extract the other fields.

uint16_t schemaId;
uint64_t lastModifiedDate;
uint16_t attributeLength;
const char* actual_binary_value;

while (result.next()) {
    for (size_t i = 0; i < result.column_count(); ++i) {
        cql::cql_byte_t* byteData = NULL;
        cql::cql_int_t size = 0;
        result.get_data(i, &byteData, size);

        if (!flag) {

            // I cannot just "cast" the byte array into a String
            // value = reinterpret_cast<char*>(byteData);

            // now how to retrieve schemaId, lastModifiedDate and actual_binary_value from byteData?

            schemaId = *reinterpret_cast<uint16_t*>(byteData);

            flag = false;
        }
    }

// this prints out 65407 somehow but it should be printing out 32767
    cout<< schemaId <<endl;
}

In case it helps, this is the Java code that writes the data:

    byte[] avroBinaryValue = text.getBytes();

    long lastModifiedDate = 1289811105109L;
    short schemaId = 32767;

    int size = 2 + 8 + 4 + avroBinaryValue.length; // short is 2 bytes, long 8 and int 4

    ByteBuffer bbuf = ByteBuffer.allocate(size); 
    bbuf.order(ByteOrder.BIG_ENDIAN);

    bbuf.putShort(schemaId);
    bbuf.putLong(lastModifiedDate);
    bbuf.putInt(avroBinaryValue.length);
    bbuf.put(avroBinaryValue);

    // merge everything into one bytearray.
    byte[] bytesToStore = bbuf.array();

    Hex.encodeHexString(bytesToStore);

Can anybody tell me what I am doing wrong in my C++ code, and why I am not able to extract `schemaId` and the other fields properly?

Update:-

After using this -

schemaId = ntohs(*reinterpret_cast<uint16_t*>(byteData));

I started getting the value back properly for schemaId.

But now, how do I extract the other fields: `lastModifiedDate`, the length of the actual `byteArray` within `byteData`, and the actual value of that `byteArray` in `byteData`?

I was using this for `lastModifiedDate`, but it doesn't work:

std::copy(reinterpret_cast<uint8_t*>(byteData + 2), reinterpret_cast<uint8_t*>(byteData + 10), lastModifiedDate);
AKIWEB
  • Am I supposed to use `ntohl` here? – AKIWEB Oct 15 '13 at 22:18
  • `ntohs` is probably the way to go. Added bonus: It will construct the correct value regardless of the endianness of the system the code is running on. – IInspectable Oct 15 '13 at 22:24
  • @IInspectable: Thanks that makes sense now.. Any idea how should I extract other fields from it as mentioned in my question? – AKIWEB Oct 15 '13 at 22:47
  • If all fields are stored in big endian format you need to apply the transformation to all fields, using the respective `ntoh` variant. For an 8-byte field you would have to use `ntohll`. I don't know whether this is a standard C++ function or not. – IInspectable Oct 15 '13 at 22:55
  • @IInspectable: Thanks for the suggestion. I have tried a lot to extract `lastModifiedDate` and the other fields from the byte array, but every time I get wrong results. Can you help me with this? – AKIWEB Oct 15 '13 at 23:45

1 Answer

32767 is `0x7fff`. 65407 is `0xff7f`. Note that the high-order and low-order bytes are swapped. You need to swap those bytes to restore the number to the original value. Fortunately, there is a macro or function called `ntohs` (network to host short) that does exactly what you want. Whether this is a macro or a function, and in which header it is defined, depends on your system. But the name of the macro/function is always `ntohs`, whether one is using Windows, Linux, Sun, or a Mac.

On a little endian machine, this macro or function swaps the two bytes that form a 16 bit integer. On a big endian machine, this macro/function does nothing (which is exactly what is wanted). Note that most home computers nowadays are little endian.
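
As a rough illustration of the swap described above, here is a minimal sketch that assembles the value from individual bytes with shifts instead of calling `ntohs`, so it needs no platform-specific header. The helper names `read_be16` and `read_le16` are made up for this example, and it assumes the buffer can be addressed as `unsigned char` (which may or may not match `cql::cql_byte_t`).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of any library): build a 16-bit value from
// two bytes stored most-significant byte first (big endian).
inline std::uint16_t read_be16(const unsigned char* p) {
    assert(p != nullptr);
    return static_cast<std::uint16_t>((static_cast<unsigned>(p[0]) << 8) | p[1]);
}

// For comparison only: the same two bytes read least-significant byte first,
// which is the value a little-endian host sees through a plain pointer cast.
inline std::uint16_t read_le16(const unsigned char* p) {
    assert(p != nullptr);
    return static_cast<std::uint16_t>((static_cast<unsigned>(p[1]) << 8) | p[0]);
}
```

With the two bytes `0x7f 0xff` from the question, `read_be16` yields 32767 and `read_le16` yields 65407, the wrong value observed on a little-endian host. Because the shifts operate on values rather than on memory layout, `read_be16` gives the same result on either kind of machine.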

David Hammen
  • Why do I need to swap the bytes? I am storing it in BIG-ENDIAN byte order on the Java side, right? So shouldn't it work with `ntohl`? Correct me if my understanding is wrong. – AKIWEB Oct 15 '13 at 22:26
  • You need to swap bytes in your C++ code because your machine is little endian and because the number is stored in big endian order by your Java code. It will not work with `ntohl`. That final "l" means "long" - 32 bits. You are storing a short -- 16 bits. You use `ntohs` for that (the "s" stands for "short"). – David Hammen Oct 15 '13 at 22:32
  • Thanks, now it makes sense to me slightly.. Apart from that how to figure out whether my machine is BIG-ENDIAN or LITTLE-ENDIAN? – AKIWEB Oct 15 '13 at 22:34
  • You don't need to know. On a little endian machine, `ntohs` and its kin (`htons`, `ntohl`, `htonl`) perform the needed byte swapping to convert between host order and network order. On a big endian machine, those functions simply return the input value as the output. Those functions are portable; they do the right thing regardless of the host machine architecture. – David Hammen Oct 15 '13 at 22:38
  • Sure.. Thanks a lot for the help... Now coming back to my second question - How can I extract `lastModifiedDate` and the other fields as well? I tried doing it like this - `std::copy(reinterpret_cast<uint8_t*>(byteData + 2), reinterpret_cast<uint8_t*>(byteData + 10), lastModifiedDate)` but it doesn't work for me. Can you see what I am doing wrong? – AKIWEB Oct 15 '13 at 22:40
  • It's the same issue as before. You are storing it in big endian order, so if your machine isn't big endian (which it isn't), you have to convert that stored value to host order before you can use it. This one is trickier. There is no "standard" function for byte swapping a 64 bit integer. Google "byte swap 64 bit integer <your OS>" (substitute <your OS> with windows, Linux, or OSX, etc.) and you'll find something specific to your machine. And now you may well need to know how to find out what type of machine you have. – David Hammen Oct 15 '13 at 22:54
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/39312/discussion-between-trekkietechie-t-t-and-david-hammen) – AKIWEB Oct 15 '13 at 23:45
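
One way to sidestep the missing standard 64-bit swap mentioned in the comments is to decode every field byte by byte with shifts, which does not depend on the host's endianness. The following is only a sketch under assumptions: the layout is taken from the Java code in the question (2-byte short, 8-byte long, 4-byte int length, then the payload), the buffer is treated as `unsigned char`, and `Record`, `read_be`, and `parse_record` are hypothetical names, not part of the CQL driver.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical container for the decoded fields; names mirror the question.
struct Record {
    std::uint16_t schemaId;
    std::uint64_t lastModifiedDate;
    std::string   value;  // raw payload bytes; may contain '\0'
};

// Read a sizeof(T)-byte big-endian unsigned integer starting at p.
template <typename T>
T read_be(const unsigned char* p) {
    assert(p != nullptr);
    T v = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        v = static_cast<T>((v << 8) | p[i]);
    }
    return v;
}

// Decode [2-byte schemaId][8-byte lastModifiedDate][4-byte length][payload].
// Returns false if the buffer is too short for the header or the payload.
bool parse_record(const unsigned char* data, std::size_t size, Record& out) {
    const std::size_t kHeader = 2 + 8 + 4;
    if (data == nullptr || size < kHeader) return false;

    out.schemaId         = read_be<std::uint16_t>(data);
    out.lastModifiedDate = read_be<std::uint64_t>(data + 2);
    const std::uint32_t length = read_be<std::uint32_t>(data + 10);

    // Reject a length field that claims more bytes than the buffer holds.
    if (length > size - kHeader) return false;
    out.value.assign(reinterpret_cast<const char*>(data + kHeader), length);
    return true;
}
```

In the loop from the question, this could be called as `parse_record(byteData, size, record)`, assuming `byteData` converts to `const unsigned char*` and `size` is non-negative. Note that the length field is read as 32 bits to match `putInt` on the Java side, and the payload is copied with an explicit length rather than treated as a C string, since binary data may contain zero bytes.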