I have a problem with Boost's tokenizer. Here is my code:

#include <iostream>
#include <vector>
#include <boost/tokenizer.hpp>

using namespace std;

static vector<std::string> tokenize(const std::string& input, const char delim) {
    std::cout << "Tokenize: " << input << std::endl;
    vector<std::string> vector;
    typedef boost::char_separator<char> TokenizerSeparator;
    typedef boost::tokenizer<TokenizerSeparator> Tokenizer;
    TokenizerSeparator separator(&delim);
    Tokenizer tokenizer(input, separator);
    Tokenizer::iterator iterator;

    for(iterator=tokenizer.begin(); iterator!=tokenizer.end();++iterator){
        std::cout << "elem found: " + *iterator << std::endl;
        vector.push_back(*iterator);
    }
    return vector;
}

int main(int argc, const char * argv[])
{
    string input = "somedata,somedata,somedata-somedata;more data;more data";
    vector<string> list = tokenize(input, ';');

    return 0;
}

This code does not behave consistently. When I run it multiple times, it sometimes works and sometimes doesn't. Here is one output I get when it fails:

Tokenize: somedata,somedata,somedata-somedata;more data;more data
elem found: some
elem found: ata,some
elem found: ata,some
elem found: ata-some
elem found: ata
elem found: more 
elem found: ata
elem found: more 
elem found: ata

What am I doing wrong?

Thanks.

Alexandre

2 Answers

TokenizerSeparator separator(&delim);

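You are passing the address of a single `char`. `char_separator` treats that pointer as a null-terminated string of delimiter characters, so it reads past `delim` into whatever bytes happen to follow it in memory. That is undefined behavior, which is why the result varies from run to run; in the output above, `d` ended up being treated as a delimiter alongside `;`.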
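As a rough illustration of why the pointer matters, here is a minimal sketch that needs no Boost. The helpers `count_delims` and `make_delims` are hypothetical names introduced for this example; `count_delims` only mimics how any consumer of a C string scans until it finds `'\0'`, and is not Boost's implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts delimiter characters the way a C-string
// consumer does, by scanning until the terminating '\0'.
std::size_t count_delims(const char* delims) {
    std::size_t n = 0;
    while (delims[n] != '\0') {
        ++n;
    }
    return n;
}

// Hypothetical helper: a well-defined way to turn one character into a
// delimiter string. std::string::c_str() is always null-terminated.
std::string make_delims(char delim) {
    return std::string(1, delim);
}
```

Calling `count_delims(make_delims(';').c_str())` sees exactly one delimiter. Calling `count_delims(&delim)` on a lone `char` is undefined behavior, because nothing guarantees that a `'\0'` follows it in memory.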

David Schwartz
  • The constructor of `char_separator` wants a `const char*` as argument. It doesn't compile if I write `TokenizerSeparator separator(delim);` – Alexandre Jun 10 '12 at 23:36
  • @Alexandre, most likely it wants a null-terminated string, not a single char. – chris Jun 10 '12 at 23:41
  • As the [example](http://www.boost.org/doc/libs/1_39_0/libs/tokenizer/char_separator.htm) shows, `boost::char_separator<char> sep("-;|");` – David Schwartz Jun 10 '12 at 23:44
  • @DavidSchwartz, Thanks, I haven't used boost yet, so I was going off of presumption. – chris Jun 10 '12 at 23:44
  • Here is the template definition of char_separator `template <typename Char, typename Traits = typename std::basic_string<Char>::traits_type > class char_separator {...}` So it wants a single char. – Alexandre Jun 10 '12 at 23:45

Thanks to @DavidSchwartz for the answer (see comments above).

`char_separator` needs a valid, null-terminated C string in its constructor. `&delim` points to a single `char` with no terminator, so it does not qualify.

Alexandre