
I would like to search for a sequence of 0s inside my string, starting and ending with 1. For example,

for 100001 the function should print: 100001
for 1000101 the function should print: 10001 and 101

I tried to accomplish it using regular expressions, but my code fails to do so.

#include <iostream>
#include <regex>
#include <string>

int main(int argc, char * argv[]){

     std::string number(argv[1]);
     std::regex searchedPattern("1?[0]+1");

     std::smatch sMatch;

     std::regex_search(number,sMatch,searchedPattern);

     for(auto& x : sMatch){
         std::cout << x << std::endl;
     }

     return 0;
}

The commands that I'm using to compile and run the code on Linux (Ubuntu 18.04):

g++ Cpp_Version.cpp -std=c++14 -o exec
./exec 1000101

g++ version:

g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The output is:

10001

I guess that my pattern is wrong. Any ideas on how to improve it?

K. Rallen
  • Why is the first `1` optional? Also, you can simplify the current regex to just `10+1`. – MonkeyZeus Oct 28 '20 at 19:12
  • Something like https://ideone.com/sUYIAm? – Wiktor Stribiżew Oct 28 '20 at 19:14
  • Sorry, but I don't understand. Substring should start with 1 and end with 1. Number of zeros between does not matter. But 1s at the start and at the end of the substring are compulsory. – K. Rallen Oct 28 '20 at 19:16
  • Wiktor Stribizew perfect! That is what I was hoping for. But could you explain to me your pattern? searchedPattern("(?=(10+1))"); – K. Rallen Oct 28 '20 at 19:18
  • @WiktorStribiżew Please add that as an answer. It's the same amount of effort as the comment, and will be more useful to others. – cigien Oct 28 '20 at 19:19
  • @WiktorStribiżew The dupe only answers half the question. The OP's regex is still wrong, and now no one else can add an answer. – cigien Oct 28 '20 at 19:22
  • The regex must use a lookahead expression in order to work, as the linked answer says. The needed regex is: `"10+(?=1)"` – prapin Oct 28 '20 at 20:37
  • **Duplicate of https://stackoverflow.com/questions/41099513/c-regex-for-overlapping-matches.** – Wiktor Stribiżew Nov 03 '20 at 20:34

1 Answer


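std::regex_search stops at the first match; it does not return all of them. To visit every match, use std::sregex_iterator instead. Because the substrings you want overlap (the trailing 1 of one match is the leading 1 of the next), the pattern also has to be wrapped in a lookahead with a capture group, so that matching consumes no characters. The documentation of std::sregex_iterator states: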

On construction, and on every increment, it calls std::regex_search

#include <iostream> // std::cout, std::cerr
#include <iterator> // std::distance
#include <regex> // std::regex, std::smatch, std::regex_search, std::sregex_iterator
#include <string> // std::string
#include <cstdlib> // EXIT_FAILURE, EXIT_SUCCESS

int main(int argc, char **argv) {
    if (argc < 2) {
        std::cerr << "./a.out 1000101" << std::endl;
        return EXIT_FAILURE;
    }
    std::string n{argv[1]};
    std::regex p{"(?=(1[0]+1))"};
    std::smatch m;
    if (false == std::regex_search(n, m, p)) {
        std::cerr << "regex_search has no match!" << std::endl;
        return EXIT_FAILURE;
    }
    std::cout << "regex_search found " << m.size() << " matches! But this is misleading...\n";
    for (const auto & field : m) {
        const auto begin = std::distance(n.cbegin(), field.first);
        const auto end = begin + std::distance(field.first, field.second);
        std::cout
            << "[" << begin << "," << end << "]\t"
            << field << "\n";
    }
    std::cout << "Unfortunately `sregex_iterator` can't tell you how many matches.\n";
    for (std::sregex_iterator it{n.cbegin(), n.cend(), p}, end{}; it != end; ++it) {
        m = *it;
        // m[0] is the overall match. The pattern is only a lookahead, so m[0] is always empty;
        // that zero-width match is what allows overlapping results.
        // m[1] is the capture group inside the lookahead: the substring you are looking for.
        for (const auto & field : m) {
            const auto begin = std::distance(n.cbegin(), field.first);
            const auto end = begin + std::distance(field.first, field.second);
            std::cout
                << "[" << begin << "," << end << "]\t"
                << field << "\n";
        }
    }
    return EXIT_SUCCESS;
}

Here's the output:

$ g++ --version
g++ (GCC) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ g++ -std=c++20 -O2 -Wall -pedantic example.cpp && ./a.out 1000100101
regex_search found 2 matches! But this is misleading...
[0,0]
[0,5]   10001
Unfortunately `sregex_iterator` can't tell you how many matches.
[0,0]
[0,5]   10001
[4,4]
[4,8]   1001
[7,7]
[7,10]  101
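
If you only need the matched substrings, the loop above can be condensed into a small helper. This is a minimal sketch, not part of the program above: the name collect_group1 is hypothetical, and it assumes the pattern you pass in has at least one capture group.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (the name is an assumption made for this sketch):
// walks over every match that std::sregex_iterator finds in `text` and
// collects the contents of capture group 1 of each match.
std::vector<std::string> collect_group1(const std::string& text,
                                        const std::regex& pattern) {
    std::vector<std::string> found;
    for (std::sregex_iterator it{text.cbegin(), text.cend(), pattern}, end{};
         it != end; ++it) {
        // (*it)[0] is the overall match, (*it)[1] is the first capture group.
        found.push_back((*it)[1].str());
    }
    return found;
}
```

For the input 1000101, calling the helper with the lookahead pattern "(?=(10+1))" gives 10001 and 101. With the consuming pattern "(10+1)" it gives only 10001, because the 1 shared by both substrings has already been consumed by the first match.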
inetknght