This is a long question about a very simple, rookie problem, but I still need some advice.

Background

I have a binary file that I need to parse. The file starts with a magic string that contains the null character (\0). Let's define it as ab\0cd.

I'm writing a method that returns true if some file starts with the magic string.

Attempt 1

#define MAGIC_STRING "ab\0cd"

bool IsMagicFile(const wpath& pathFile)
{
    string strData;
    if (!ReadFile(pathFile, strData))
        return false;

    if (strData.size() < 5)
        return false;

    string strPrefix = strData.substr(0, 5);

    if (strcmp(strPrefix.c_str(), MAGIC_STRING) != 0)
        return false;

    return true;
}

Problem 1

What bothers me about the above code is that it hard-codes the assumption that the magic string is 5 characters long.

What if the magic string changes tomorrow? Say:

#define MAGIC_STRING "abe\0fcd"

The macro has changed, and the code no longer works properly.

Attempt 2

#define MAGIC_STRING "ab\0cd"

bool IsMagicFile(const wpath& pathFile)
{
    string strMagic = MAGIC_STRING;

    string strData;
    if (!ReadFile(pathFile, strData))
        return false;

    if (strData.size() < strMagic.size())
        return false;

    string strPrefix = strData.substr(0, strMagic.size());

    if (strcmp(strPrefix.c_str(), MAGIC_STRING) != 0)
        return false;

    return true;
}

Problem 2

I supposedly got rid of the hard-coded size, but the size of strMagic is not 5 but 2: constructing a string from the literal stops at the first \0.
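To make the size mismatch concrete, here is a small sketch of three ways to measure the same literal. The function names are made up for illustration and are not part of the original code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length seen when the literal is converted through a
// const char*. The std::string constructor scans up to the first '\0'.
inline std::size_t SizeViaCString()
{
    std::string s = "ab\0cd";
    return s.size();
}

// Hypothetical helper: length seen when the byte count is passed explicitly,
// so the embedded '\0' is copied like any other character.
inline std::size_t SizeViaExplicitLength()
{
    std::string s("ab\0cd", 5);
    return s.size();
}

// Hypothetical helper: length known to the compiler. The literal has type
// const char[6]; subtracting 1 drops the terminator only.
inline std::size_t SizeViaSizeof()
{
    return sizeof("ab\0cd") - 1;
}
```

The first function returns 2, the other two return 5. The same truncation affects strcmp, which also stops at the first \0 on either side.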

Attempt 3

#define MAGIC_STRING        "ab\0cd"    // CAUTION - MAGIC_STRING & MAGIC_STRING_SIZE must be changed together
#define MAGIC_STRING_SIZE   5           // CAUTION - MAGIC_STRING & MAGIC_STRING_SIZE must be changed together

bool IsMagicFile(const wpath& pathFile)
{
    string strData;
    if (!ReadFile(pathFile, strData))
        return false;

    if (strData.size() < MAGIC_STRING_SIZE)
        return false;

    string strPrefix = strData.substr(0, MAGIC_STRING_SIZE);

    if (strcmp(strPrefix.c_str(), MAGIC_STRING) != 0)
        return false;

    return true;
}

Problem 3

This solves the first problem, but I still don't get the seamless magic string change I wanted: the string and its size have to be kept in sync by hand.

Question

Is attempt 3 good enough? Do you have a better approach?

idanshmu
    You have a tag `C++` in your question => std::string only, not strcmp => Solved. (And `const string` instead of `#define`) – deviantfan Jul 03 '16 at 12:10

3 Answers

Instead of using the macro definition you could define a constant character array. For example

const char MAGIC_STRING[] = "abe\0fcd";

In this case the number of characters excluding the terminating zero is equal to

sizeof( MAGIC_STRING ) - 1

To compare raw bytes you can use the standard C function memcmp, passing the expression above as the number of bytes to compare.

Here is a demonstrative program

#include <iostream>
#include <string>
#include <cstring>
#include <iterator>

const char MAGIC_STRING[] = "abe\0fcd";

int main() 
{
    std::string s( std::begin( MAGIC_STRING ), std::prev( std::end( MAGIC_STRING ) )  );

    if ( memcmp( s.c_str(), MAGIC_STRING, sizeof( MAGIC_STRING ) - 1 ) == 0 )
    {
        std::cout << "The string starts with the MAGIC_STRING" << std::endl;
    }

    return 0;
}

Its output is

The string starts with the MAGIC_STRING
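Applied to data read from a file, the same idea might look like the sketch below. HasMagicPrefix and MAGIC_SIZE are hypothetical names, and the length check is an addition so that memcmp never reads past the end of a short buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

const char MAGIC_STRING[] = "abe\0fcd";

// Number of magic bytes, excluding the terminating zero added by the literal.
const std::size_t MAGIC_SIZE = sizeof(MAGIC_STRING) - 1;

// Hypothetical helper: true if data begins with the magic bytes.
inline bool HasMagicPrefix(const std::string& data)
{
    // Reject buffers that are too short before comparing raw bytes.
    if (data.size() < MAGIC_SIZE)
        return false;

    return std::memcmp(data.data(), MAGIC_STRING, MAGIC_SIZE) == 0;
}
```

Changing the literal is then the only edit needed, since MAGIC_SIZE follows from it at compile time.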
Vlad from Moscow

If you know for a fact that your magic string will contain \0, then you could write your own size(string str) function that returns the correct length by continuing to count after the first \0.

If it is not known how many \0s there are in the magic string, I would suggest you go with attempt 3.

If you need some code to guide you in the right direction for the size method, let me know.
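One possible reading of such a size function, offered as an assumption rather than what was necessarily meant here, is a template that takes the literal as an array, so the compiler supplies the length. LiteralSize and kMagic are hypothetical names. Note that this only works on the array or literal itself: once the data has been converted to a const char* or to a std::string built without a length, the bytes after the first \0 can no longer be recovered.

```cpp
#include <cstddef>

// Hypothetical helper: N is deduced from the array type, so embedded '\0'
// bytes are counted. Only the terminator added by the literal is subtracted.
template <std::size_t N>
constexpr std::size_t LiteralSize(const char (&)[N])
{
    return N - 1;
}

const char kMagic[] = "ab\0cd";

// Checked at compile time: 7 characters, including the embedded '\0'.
static_assert(LiteralSize("abe\0fcd") == 7, "embedded null must be counted");
```

With this, LiteralSize(kMagic) evaluates to 5 regardless of where the \0 characters sit in the literal.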

Octoshape

Personally, I would avoid using macros. I would also not use functions designed for null-terminated strings, like std::strcmp. You can check whether the beginning of a string contains a particular character sequence using std::equal from the standard <algorithm> header:

// create a character array to preserve compile time size
// but remember string literals add a null-terminator extra character
const char magic_string[] = "ab\0cd";

bool IsMagicFile(const wpath& pathFile)
{
    string strData;
    if (!ReadFile(pathFile, strData))
        return false;

    // -1 to exclude the null terminator of the magic_string character array
    const size_t magicSize = sizeof(magic_string) - 1;

    // std::equal reads magicSize characters from strData, so reject shorter data first
    if (strData.size() < magicSize)
        return false;

    return std::equal(magic_string, magic_string + magicSize, strData.begin());
}
Galik
  • If the searched string does not begin with the magic string, `std::string::find()` will search the entire string before failing. I would suggest using [`std::string::compare()`](http://en.cppreference.com/w/cpp/string/basic_string/compare) instead: `return strData.compare(0, magic_string.length(), magic_string) == 0;`. Alternatively: `return strData.compare(0, 5, "ab\0cd", 5) == 0;`. Also, `magic_string` will be `"ab"` because the `\0` will be interpreted as a null terminator. You need to specify the correct length when constructing it: `const std::string magic_string("ab\0cd", 5);` – Remy Lebeau Jul 03 '16 at 17:35
  • @RemyLebeau You are totally right. I decided to go with `std::equal` to avoid constructing a `std::string` as that needed to be constructed from a character array to get the size correct. – Galik Jul 03 '16 at 18:14
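For comparison, the std::string::compare variant suggested in the first comment could be sketched as follows. kMagicString and StartsWithMagic are hypothetical names, and the length 5 is passed explicitly to the constructor, as the comment recommends, so the embedded \0 is kept.

```cpp
#include <string>

// Built with an explicit length so the embedded '\0' is part of the string.
const std::string kMagicString("ab\0cd", 5);

// Hypothetical helper: true if data begins with kMagicString.
inline bool StartsWithMagic(const std::string& data)
{
    // compare(pos, len, str) clamps len to the characters available in data,
    // so data shorter than the magic string simply compares unequal.
    return data.compare(0, kMagicString.size(), kMagicString) == 0;
}
```

The trade-off against std::equal is that the length 5 is written by hand again, unless the string is built from a character array and its sizeof.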