
Suppose that I have an unsigned char*, let's call it some_data:

unsigned char* some_data;

And some_data has URL-like data in it, for example:

"aasdASDASsdfasdfasdf&Foo=cow&asdfasasdfadsfdsafasd"

I have a function that can grab the value of 'foo' as follows:

// looks for the value of 'foo'
bool grabFooValue(const std::string& p_string, std::string& p_foo_value)
{
    size_t start = p_string.find("Foo="), end;
    if(start == std::string::npos)
        return false;

    start += 4;
    end = p_string.find_first_of("& ", start);

    p_foo_value = p_string.substr(start, end - start);
    return true;
}

The trouble is that I need a string to pass to this function, or at least a char* (which can be converted to a string no problem).

I can solve this problem by casting:

reinterpret_cast<char *>(some_data)

And then pass it to the function all okie-dokie

...

Until I ran valgrind and found out that this can lead to a subtle memory error:

Conditional jump or move depends on uninitialised value(s)  __GI_strlen

From what I gathered, it has to do with the reinterpret cast messing up the null that marks the end of the string. So when C++ tries to figure out the length of the string, things get screwy.

Given that I can't change the fact that some_data is represented by an unsigned char*, is there a way to go about using my grabFooValue function without having these subtle problems?

I'd prefer to keep the value-finding function that I already have, unless there is clearly a better way to rip the foo-value out of this (sometimes large) unsigned char*.

And although some_data varies in size and is sometimes large, I can assume that the value of 'foo' will be somewhere early on, so my thought was to get a char* holding only the first X characters of the unsigned char*. That could get rid of the string-length issue, because I would set where the char* ends.

I tried using a combination of strncpy and casting but so far no dice. Any thoughts?

jCuga
  • 1
    `reinterpret_cast<>()` does nothing with respect to string management. All it does is change the type without changing the underlying bits. This is in contrast with `static_cast<>()`, which changes the type and may modify the bits so the conversion will be meaningful (e.g. `int` to `double`). – In silico Feb 18 '11 at 23:55
  • Looks like a false positive from valgrind. – Martin York Feb 19 '11 at 00:11

1 Answer


You need to know the length of the data your unsigned char * points to, since it isn't 0-terminated.

Then use, e.g.:

std::string s((char *) some_data, (char *) some_data + len);
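As a minimal sketch of how that could be combined with the existing function: the helper below builds a std::string from an explicit length, so strlen is never run on the buffer, and it also caps the copy at a prefix, as the question suggests. The name grabFooValueFromBytes and the len and max_prefix parameters are hypothetical; it assumes the caller knows how many bytes some_data points to.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// The function from the question, with the same behaviour.
bool grabFooValue(const std::string& p_string, std::string& p_foo_value)
{
    std::size_t start = p_string.find("Foo="), end;
    if (start == std::string::npos)
        return false;

    start += 4;
    end = p_string.find_first_of("& ", start);

    // If no delimiter is found, end is npos and substr takes the rest.
    p_foo_value = p_string.substr(start, end - start);
    return true;
}

// Hypothetical helper (not from the original post): copies at most
// max_prefix bytes of the buffer into a std::string and searches the copy.
// The length is passed in explicitly, so no 0 terminator is required.
bool grabFooValueFromBytes(const unsigned char* data, std::size_t len,
                           std::size_t max_prefix, std::string& p_foo_value)
{
    assert(data != nullptr || len == 0);

    const std::size_t n = std::min(len, max_prefix);
    const char* first = reinterpret_cast<const char*>(data);
    const std::string text(first, first + n);  // iterator-range constructor
    return grabFooValue(text, p_foo_value);
}
```

A call might look like `grabFooValueFromBytes(some_data, len, 256, value)`, where 256 is an arbitrary cap. If the cap cuts the buffer in the middle of the value, the result is truncated, so the cap should comfortably exceed the position where 'Foo' is expected.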
Erik
  • Who said it wasn't null-terminated? – Oliver Charlesworth Feb 18 '11 at 23:57
  • 1
    His error message, which indicates that a strlen call (likely from std::string implementation) is depending on uninitialized data. If this error comes and goes based on whether he casts to char * and converts to std::string, it seems pretty clear to me that the data can't be 0-terminated – Erik Feb 19 '11 at 00:13
  • 1
    @Oli: It's the most likely explanation for valgrind detecting a problem. – Ben Voigt Feb 19 '11 at 00:14
  • If you take an unsigned char* with some text in it and cout it, and then try reinterpret_cast-ing it to a char* and cout-ing it again, the cast string will have a bunch of garbage printed at the end, so to me that seems to indicate that the null character gets messed up by the cast. – jCuga Feb 21 '11 at 14:22
  • @jCuga There must be another problem causing that, because reinterpret_casting from unsigned char * to char* shouldn't have any effect on the null terminating byte. Technically the behavior is unspecified, but it would be very surprising. Look for an underlying cause and fix that. – bames53 Apr 04 '12 at 18:21
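
One way to check the point made in the comments, that the cast leaves the bytes alone, is to measure a buffer that is known to be 0-terminated through a char* view. This is only a sketch; lengthViaCast is a hypothetical name, and it assumes the buffer really does end in a 0 byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: the length of a 0-terminated buffer as seen
// through a char* view of the same bytes.
std::size_t lengthViaCast(const unsigned char* data)
{
    assert(data != nullptr);

    // reinterpret_cast changes only the pointer's type; the bytes it
    // points to, including the terminating 0, are not modified.
    return std::strlen(reinterpret_cast<const char*>(data));
}
```

If this returns the expected length for a buffer that is properly terminated, then garbage at the end of the real data suggests the terminator was never there, not that the cast removed it.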