
To receive raw Ethernet frames that carry a custom protocol with its own headers, I am reading the bytes into a streambuf buffer. The payload gets copied successfully for the most part, but I need to check a specific byte of the frame header in the buffer so I can handle certain corner cases. I can't figure out how to get at that byte or how to convert it to an integer. Here is the code:

boost::asio::streambuf read_buffer;

boost::asio::streambuf::mutable_buffers_type buf = read_buffer.prepare(bytesToGet);
bytesRead = d_socket10->receive(boost::asio::buffer(buf, bytesToGet));
read_buffer.commit(bytesRead);

const char *readData = boost::asio::buffer_cast<const char*>( read_buffer.data() + 32 ); 

I need to get the length byte at offset 20. I've tried stringstream, memcpy and casting, but I don't have a handle on any of them: I either get compile errors or the code doesn't do what I expected.

How can I get the byte from the offset I need and cast it to a byte or short? The size is actually 2 bytes, but in this specific case, one of those bytes should be zero, so either getting 1 byte or 2 bytes would be ideal.

Thanks!

J. Doe

1 Answer


Welcome to parsing.

Welcome to binary data.

Welcome to portable network protocols.

Each of these three subjects is its own thing to get a handle on.

The simplest thing would be to read into a buffer and use that. Use Boost Endian to remove portability concerns.
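As a rough idea of what taking endianness out of the picture amounts to, here is a hand-rolled sketch (not Boost Endian itself) that assembles a 16-bit value from two bytes at a given offset. The names `load_le16` and `load_be16` are made up for this example, and whether the length field is little- or big-endian on the wire is an assumption you have to check against your protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers, not part of Boost: decode a 16-bit field at `offset`
// byte by byte, so the result does not depend on host endianness or alignment.
inline std::uint16_t load_le16(const unsigned char* data, std::size_t size, std::size_t offset) {
    assert(offset + 2 <= size); // caller must have received at least offset + 2 bytes
    return static_cast<std::uint16_t>(data[offset] | (data[offset + 1] << 8));
}

inline std::uint16_t load_be16(const unsigned char* data, std::size_t size, std::size_t offset) {
    assert(offset + 2 <= size);
    return static_cast<std::uint16_t>((data[offset] << 8) | data[offset + 1]);
}
```

With the bytes 0x0b 0x00 at offsets 20 and 21, `load_le16` yields 11 while `load_be16` yields 2816, which is why the byte order has to be pinned down. If your buffer holds plain `char`, cast the pointer to `const unsigned char*` first so that bytes above 0x7f do not sign-extend.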

Here's the simplest thing I can think of using just standard library things (ignoring endianness):


#include <boost/asio.hpp>
#include <cstdint>
#include <cstring>
#include <iostream>
#include <istream>
#include <string>

namespace ba = boost::asio;

void fill_testdata(ba::streambuf&);

int main() {
    ba::streambuf sb;
    fill_testdata(sb);

    // parsing starts here
    char buf[1024];
    std::istream is(&sb);
    // read first including bytes 20..21:
    is.read(buf, 22);
    size_t actual = is.gcount();

    std::cout << "stream ok? " << std::boolalpha << is.good() << "\n";
    std::cout << "actual: " << actual << "\n";
    if (is && actual >= 22) { // stream ok, and not a short read
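        // copy instead of casting: avoids unaligned access and strict-aliasing problems
        uint16_t length;
        std::memcpy(&length, buf + 20, sizeof(length)); // host byte order, endianness still ignored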
        std::cout << "length: " << length << "\n";

        std::string payload(length, '\0');
        is.read(&payload[0], length);
        actual = is.gcount();

        std::cout << "actual payload bytes: " << actual << "\n";
        std::cout << "stream ok? " << std::boolalpha << is.good() << "\n";
        payload.resize(actual);

        std::cout << "payload: '" << payload << "'\n";
    }
}

// some testdata
void fill_testdata(ba::streambuf& sb) 
{
    char data[] = { 
        '\x00', '\x00', '\x00', '\x00', '\x00', // 0..4
        '\x00', '\x00', '\x00', '\x00', '\x00', // 5..9
        '\x00', '\x00', '\x00', '\x00', '\x00', // 10..14
        '\x00', '\x00', '\x00', '\x00', '\x00', // 15..19
        '\x0b', '\x00', 'H'   , 'e'   , 'l'   , // 20..24
        'l'   , 'o'   , ' '   , 'w'   , 'o'   , // 25..29
        'r'   , 'l'   , 'd'   , '!'   ,         // 30..33
    };
    std::ostream(&sb).write(data, sizeof(data));
}

Prints

stream ok? true
actual: 22
length: 11
actual payload bytes: 11
stream ok? true
payload: 'Hello world'

Increase \x0b to \x0c to get:

stream ok? true
actual: 22
length: 12
actual payload bytes: 12
stream ok? true
payload: 'Hello world!'

Increasing it beyond what is in the buffer, e.g. to '\x0d', gives a failed (partial) read:

stream ok? true
actual: 22
length: 13
actual payload bytes: 12
stream ok? false
payload: 'Hello world!'
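To tell these cases apart before touching the payload, one option is to compare the declared length with the number of bytes actually received. A minimal sketch, assuming the layout of the test data above (20 header bytes, a little-endian 16-bit length at offsets 20..21, then the payload); `FrameStatus` and `check_frame` are names invented here:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout (taken from the test data above): length field at 20..21,
// little-endian, payload starting at offset 22.
enum class FrameStatus { Ok, ShortHeader, ShortPayload };

const std::size_t kLengthOffset  = 20;
const std::size_t kPayloadOffset = 22;

// Hypothetical helper: reports whether `size` received bytes hold a complete
// frame, and stores the declared payload length in `length` when it is readable.
inline FrameStatus check_frame(const unsigned char* data, std::size_t size, std::uint16_t& length) {
    assert(data != nullptr || size == 0);
    if (size < kPayloadOffset)
        return FrameStatus::ShortHeader;   // length field not fully received yet
    length = static_cast<std::uint16_t>(data[kLengthOffset] | (data[kLengthOffset + 1] << 8));
    if (size - kPayloadOffset < length)
        return FrameStatus::ShortPayload;  // header claims more than we have
    return FrameStatus::Ok;
}
```

For the 34-byte test buffer this reports `Ok` for '\x0b' and '\x0c' and `ShortPayload` for '\x0d', so the caller can decide whether to wait for more data or reject the frame.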

Let's Go Pro

To go pro, I'd use a library such as Boost Spirit. It handles endianness, does validation, and really shines when you get branches in your parser, like

 record = compressed_record | uncompressed_record;

Or

 exif_tags = .... >> custom_attrs;

 custom_attr  = attr_key >> attr_value;
 custom_attrs = repeat(_ca_count) [ custom_attr ];

 attr_key = bson_string(64);     // max 64, for security
 attr_value = bson_string(1024); // max 1024, for security

 bson_string %= omit[little_dword[_a=_1]] 
             >> eps(_a<=_r1) // not exceeding maximum
             >> repeat(_a) [byte_];
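
The point of the bson_string rule is that the length prefix is checked against a maximum before any bytes are consumed. Without Spirit, the same idea could be sketched with plain `std::istream` calls. This is an illustration only: `read_bounded_string` and `demo_read` are made-up names, and the 32-bit little-endian prefix mirrors `little_dword` in the grammar above.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads a 32-bit little-endian length, rejects it if it
// exceeds `max_len`, then reads exactly that many bytes.
inline std::string read_bounded_string(std::istream& is, std::uint32_t max_len) {
    unsigned char prefix[4];
    if (!is.read(reinterpret_cast<char*>(prefix), sizeof prefix))
        throw std::runtime_error("truncated length prefix");

    const std::uint32_t len = static_cast<std::uint32_t>(prefix[0])
                            | (static_cast<std::uint32_t>(prefix[1]) << 8)
                            | (static_cast<std::uint32_t>(prefix[2]) << 16)
                            | (static_cast<std::uint32_t>(prefix[3]) << 24);
    if (len > max_len)
        throw std::runtime_error("length exceeds maximum"); // refuse before allocating

    std::string value(len, '\0');
    if (len > 0) {
        if (!is.read(&value[0], static_cast<std::streamsize>(len)))
            throw std::runtime_error("truncated string body");
        assert(static_cast<std::uint32_t>(is.gcount()) == len); // a successful read() delivers all bytes
    }
    return value;
}

// Usage sketch: the 3-byte string "abc" preceded by its length.
inline std::string demo_read() {
    std::istringstream in(std::string("\x03\x00\x00\x00" "abc", 7));
    return read_bounded_string(in, 64); // yields "abc"
}
```

Checking the length before allocating is what keeps a hostile prefix such as 0xffffffff from triggering a huge allocation.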

But that's noodling far ahead. Let's do a much simpler demo:

Live On Coliru ¹

#include <boost/asio.hpp>

#include <cstdint>
#include <iostream>
#include <istream>
#include <stdexcept>
#include <string>

namespace ba = boost::asio;

void fill_testdata(ba::streambuf&);

struct FormatData {
    std::string signature, header; // e.g. 4 + 16 = 20 bytes - could be different, of course
    std::string payload;           // 16bit length prefixed
};

FormatData parse(std::istream& is);

int main() {
    ba::streambuf sb;
    fill_testdata(sb);

    try {
        std::istream is(&sb);
        FormatData data = parse(is);

        std::cout << "actual payload bytes: " << data.payload.length() << "\n";
        std::cout << "payload: '" << data.payload << "'\n";
    } catch(std::runtime_error const& e) {
        std::cout << "Error: " << e.what() << "\n";
    }
}

// some testdata
void fill_testdata(ba::streambuf& sb) 
{
    char data[] = { 
        'S'   , 'I'   , 'G'   , 'N'   , '\x00'   , // 0..4
        '\x00', '\x00', '\x00', '\x00', '\x00'   , // 5..9
        '\x00', '\x00', '\x00', '\x00', '\x00'   , // 10..14
        '\x00', '\x00', '\x00', '\x00', '\x00'   , // 15..19
        '\x0b', '\x00', 'H'   , 'e'   , 'l'      , // 20..24
        'l'   , 'o'   , ' '   , 'w'   , 'o'      , // 25..29
        'r'   , 'l'   , 'd'   , '!'   , // 30..33
    };
    std::ostream(&sb).write(data, sizeof(data));
}

//#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi = boost::spirit::qi;

BOOST_FUSION_ADAPT_STRUCT(FormatData, signature, header, payload)

template <typename It>
struct FileFormat : qi::grammar<It, FormatData()> {
    FileFormat() : FileFormat::base_type(start) {
        using namespace qi;

        signature  = string("SIGN");     // 4 byte signature, just for example
        header     = repeat(16) [byte_]; // 16 byte header, same

        payload   %= omit[little_word[_len=_1]] >> repeat(_len) [byte_];
        start      = signature >> header >> payload;

        //BOOST_SPIRIT_DEBUG_NODES((start)(signature)(header)(payload))
    }
  private:
    qi::rule<It, FormatData()> start;
    qi::rule<It, std::string()> signature, header;

    qi::_a_type _len;
    qi::rule<It, std::string(), qi::locals<uint16_t> > payload;
};

FormatData parse(std::istream& is) {
    using it = boost::spirit::istream_iterator;

    FormatData data;
    it f(is >> std::noskipws), l;
    bool ok = qi::parse(f, l, FileFormat<it>{}, data); // qualified: this function is itself named parse

    if (!ok)
        throw std::runtime_error("parse failure");

    return data;
}

Prints:

actual payload bytes: 11
payload: 'Hello world'
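
For comparison, here is roughly what the same format looks like when parsed by hand with `std::istream`, without Spirit. It is a sketch under the same assumptions as the grammar (4-byte SIGN signature, 16-byte header, little-endian 16-bit length prefix); `read_exact` and `parse_by_hand` are names made up for this example and are not part of the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this out
#include <stdexcept>
#include <string>

struct FormatData {
    std::string signature, header; // 4 + 16 bytes, as in the grammar above
    std::string payload;           // 16-bit length prefixed
};

// Reads exactly n bytes or throws; a short read counts as a parse failure.
inline std::string read_exact(std::istream& is, std::size_t n) {
    std::string s(n, '\0');
    if (n > 0 && !is.read(&s[0], static_cast<std::streamsize>(n)))
        throw std::runtime_error("parse failure: unexpected end of input");
    return s;
}

// Hypothetical hand-written counterpart of the FileFormat grammar.
inline FormatData parse_by_hand(std::istream& is) {
    FormatData data;
    data.signature = read_exact(is, 4);
    if (data.signature != "SIGN")
        throw std::runtime_error("parse failure: bad signature");
    data.header = read_exact(is, 16);

    const std::string len = read_exact(is, 2); // little-endian, like little_word
    const std::uint16_t n = static_cast<std::uint16_t>(
        static_cast<unsigned char>(len[0]) | (static_cast<unsigned char>(len[1]) << 8));
    data.payload = read_exact(is, n);
    assert(data.payload.size() == n);
    return data;
}
```

Feeding it the test data through a `std::istringstream` gives the same 11-byte payload. The hand-written version is fine for a fixed layout like this one; the grammar starts to pay off once the format grows alternatives and nested records.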

¹ What a time to be alive! Coliru swamped and Wandbox down, simultaneously! I had to remove Boost Asio for the online demo because IdeOne doesn't link Boost System.

sehe
  • Thanks very much! I used your first example and have it tested and working. Much appreciated. – J. Doe Oct 27 '17 at 04:06