int integers;
std::list<int> integersList = {};
string token;

while(iss >> token)
{
    if(stringstream(token) >> integers)
    {
        integersList.push_back(integers);
    }
}

One of the tokens I need to parse is

U<sub>54778</sub><br

The istringstream doesn't tokenise the integer inside it; it only splits the input on whitespace.

All the other integer tokens in the string are separated by spaces but this one is not.

templatetypedef
  • If you need to tokenize strings, `istringstream` is not the best choice. You are better off using regular expressions or boost.Tokenizer – Slava Aug 23 '22 at 03:50
  • See [std::basic_string::find_first_of](https://en.cppreference.com/w/cpp/string/basic_string/find_first_of) (with a `str` (characters to search for) of `"0123456789"`) and then [std::basic_string::find_first_not_of](https://en.cppreference.com/w/cpp/string/basic_string/find_first_not_of) to find the next non-digit. Or, simply `#include <cctype>` and loop over the characters in the string until you find the first that satisfies the condition `str[i] && isdigit(str[i])` – David C. Rankin Aug 23 '22 at 05:35
  • As mentioned [std::basic_regex](https://en.cppreference.com/w/cpp/regex/basic_regex) is also a good approach. – David C. Rankin Aug 23 '22 at 05:40
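
The `find_first_of` / `find_first_not_of` suggestion from the comments could be sketched roughly as follows. The helper name `extractDigitRuns` is hypothetical, and the sketch assumes that only unsigned runs of decimal digits matter (a leading `+` or `-` is ignored) and that every run fits in an `int`; `std::stoi` throws `std::out_of_range` otherwise.

```cpp
#include <cstddef>
#include <list>
#include <string>

// Hypothetical helper: collects every run of decimal digits found in `text`.
// Signs are ignored, so "-55" contributes 55.
std::list<int> extractDigitRuns(const std::string& text) {
    const std::string digits{ "0123456789" };
    std::list<int> result{};

    // Position of the first digit, or npos if there is none
    std::size_t start = text.find_first_of(digits);
    while (start != std::string::npos) {
        // Position of the first non-digit after the run, or npos if the run reaches the end
        const std::size_t end = text.find_first_not_of(digits, start);
        const std::size_t count = (end == std::string::npos) ? std::string::npos : end - start;

        result.push_back(std::stoi(text.substr(start, count)));

        if (end == std::string::npos) break;
        // Continue searching for the next digit after this run
        start = text.find_first_of(digits, end);
    }
    return result;
}
```

Calling `extractDigitRuns("U<sub>54778</sub><br 4 66")` would yield a list holding 54778, 4 and 66, without needing the token to be separated by spaces.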

1 Answer


This will most probably not work in an easy way with a stringstream alone.

As written in the comments, it is better to use a regex for this. Fortunately, C++ has a built-in regex library, available through the `<regex>` header.

Even better, there is an iterator with which you can iterate over all matches of a pattern in a `std::string`: the `std::sregex_token_iterator`.

This gives you very powerful possibilities, because a regex lets you match many different patterns.

With that, you can come up with a very simple program like the one below:

#include <iostream>
#include <string>
#include <list>
#include <regex>
#include <algorithm>
#include <iterator>  // std::back_inserter

// Simple Regex for optional sign and digits
const std::regex re{ "[+-]?\\d+" };

int main() {
    // The test string
    std::string test{ "U<sub>54778</sub><br+123ddd 4 -55 66" };

    // Here we will store our integers
    std::list<int> integersList{};

    // Get all integers
    std::transform(std::sregex_token_iterator(test.begin(), test.end(), re), {},
                   std::back_inserter(integersList),
                   [](const std::string& s) { return std::stoi(s); });

    // Show debug output
    for (const int i : integersList) std::cout << i << ' ';
}

Please note: If you need to validate the correct format or range for an integer, then the regex will get more complicated.
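
As one possible way to handle the range part without complicating the regex, the conversion itself can reject values that do not fit. The following is only a sketch: the helper name `extractIntsChecked` is hypothetical, and silently skipping out-of-range matches is an assumption about the desired behaviour (you might prefer to report them instead).

```cpp
#include <list>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical helper: same pattern as the program above, but matches that
// do not fit in an int are skipped instead of terminating the program.
std::list<int> extractIntsChecked(const std::string& text) {
    static const std::regex re{ "[+-]?\\d+" };
    std::list<int> result{};

    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        try {
            // std::stoi throws std::out_of_range if the value does not fit in an int
            result.push_back(std::stoi(it->str()));
        }
        catch (const std::out_of_range&) {
            // Assumption: an oversized number is simply ignored
        }
    }
    return result;
}
```

With this, an input such as `"4 -55 99999999999999999999"` would produce only 4 and -55.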

A M