
I was wondering if anyone can help me. I've been looking around for regex examples, but I still can't get my head around it.

The strings look like this:

"User JaneDoe, IP: 12.34.56.78"

"User JohnDoe, IP: 34.56.78.90"

How would I go about writing an expression that matches the above strings?

andand
Floyd

2 Answers

4

The question is how exactly do you want to match these, and what else do you want to exclude?

It's trivial (but rarely useful) to match any incoming string with a simple .*.

To match these more exactly (and add the possibility of extracting things like the user name and/or IP), you could use something like: "User ([^,]*), IP: (\\d{1,3}(\\.\\d{1,3}){3})". Depending on your input, this might still run into a problem with a name that includes a comma (e.g., "John James, Jr."). If you have to allow for that, it gets quite a bit uglier in a hurry.

Edit: Here's a bit of code to test/demonstrate the regex above. At the moment, this is using the C++0x regex class(es) -- to use Boost, you'll need to change the namespaces a bit (but I believe that should be about all).

#include <regex>
#include <string>
#include <iostream>

void show_match(std::string const &s, std::regex const &r) { 
    std::smatch match;
    if (std::regex_search(s, match, r))
        std::cout << "User Name: \"" << match[1] 
                  << "\", IP Address: \"" << match[2] << "\"\n";
    else
        std::cerr << s << " did not match\n";
}

int main(){ 

    std::string inputs[] = {
        std::string("User JaneDoe, IP: 12.34.56.78"),
        std::string("User JohnDoe, IP: 34.56.78.90")
    };

    std::regex pattern("User ([^,]*), IP: (\\d{1,3}(\\.\\d{1,3}){3})");

    for (int i=0; i<2; i++)
        show_match(inputs[i], pattern);
    return 0;
}

This prints out the user name and IP address, but in a (barely) different enough format to make it clear that it's matching and printing out individual pieces, not just passing entire strings through.

Jerry Coffin
  • Wow thank you for the great example! Exactly what I was looking for. I'm still a bit confused with the meaning of this part of the expression though ([^,]*). I hope you don't mind explaining – Floyd Feb 22 '11 at 05:35
    @Floyd: No problem. A `[]` normally means "match whatever I put between the brackets." For example, `[a-z]` means "match any lower case letter." If, however, the first character is a `^`, it inverts the match -- i.e., match anything *but* whatever else is there. In this case, the only other character is a comma, so it means "match any character except a comma". Followed by a `*`, it means "match characters up to, but not including, a comma." – Jerry Coffin Feb 22 '11 at 05:43
  • Can you explain this part of your expression: (\\d{1,3}(\\.\\d{1,3}){3})? – karthik Feb 22 '11 at 06:32
3
#include <string> 
#include <iostream>
#include <boost/regex.hpp>

int main() {

    std::string text = "User JohnDoe, IP: 121.1.55.86";
    boost::regex expr ("User ([^,]*), IP: (\\d{1,3}(\\.\\d{1,3}){3})");

    boost::smatch matches;

    try
    {
        if (boost::regex_match(text, matches, expr)) {

            std::cout << matches.size() << std::endl;

            for (std::size_t i = 1; i < matches.size(); i++) {
                std::string match (matches[i].first, matches[i].second);
                std::cout << "matches[" << i << "] = " << match << std::endl;
            }

        }
        else {
            std::cout << "\"" << expr << "\" does not match \"" << text << "\". matches size(" << matches.size() << ")" << std::endl;
        }
    } 
    catch (boost::regex_error& e)
    {
        std::cout << "Error: " << e.what() << std::endl;
    }

    return 0;
}

Edited: Fixed missing comma in string, pointed out by Davka, and changed cmatch to smatch

Floyd
  • You are missing the comma in the string after the user. BTW, if you use `smatch` instead of `cmatch`, you can use the `string` as-is, without extracting the `c_str()` – davka Feb 22 '11 at 07:11
  • @davka Thank you, edited my post to fix the error and added your recommendation to use smatch instead – Floyd Feb 22 '11 at 07:22