-2

I have a pattern that searches a string for certain words and expressions. Here is my code:

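    # A list keeps the alternatives in a fixed order; a raw string avoids
    # invalid escape sequences such as "\(".
    pattern = [
        r"eval\(.*\)",
        "hello",
        "my word"
    ]

    patterns = "|".join(pattern)
    patterns = "(^.*?(" + patterns + ").*?$)"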

    code = code.strip()

    m = re.findall( patterns, code, re.IGNORECASE|re.MULTILINE|re.UNICODE )

    if m:
        return m

How can I see which of these words (eval(), hello, ...) was found? In PHP I can use the function preg_match_all to get the matched word.

user3507915
  • No. If it were that easy, I wouldn't have to ask. I need two pieces of information, something like this: "hello" (the word that matched) and "hello my friend" (the whole line containing the matching word). – user3507915 Jan 01 '15 at 20:19
  • You should really update your question to specifically ask about the second piece of info. – zehnpaard Jan 01 '15 at 20:33

2 Answers

0

I don't know whether it's what you intended, but your regexp has two levels of capturing groups:

    (^.*?(hello|my word|eval\(.*\)).*?$)

The outer group will capture the whole line, whereas the inner group will only capture the specified words.

When a pattern contains more than one capturing group, re.findall returns a list of tuples, one tuple per match, holding the text captured by each group. In your particular case, this will be:

    [(outer_group, inner_group), (outer_group, inner_group), ...]

To iterate over this, you could do:

    for line, words in m:
        print('line:', line)
        print('words:', words)

or to just access the items directly, do this:

    line = m[0][0]
    words = m[0][1] 
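
To make the tuple structure concrete, here is a minimal sketch that runs the two-group pattern over a small multi-line string. The sample text in `code` is made up for illustration; the word list is the one from the question, written as a list so the order of the alternatives is fixed.

```python
import re

# Hypothetical sample input; only the word list comes from the question.
code = "i say hello to u\nnothing here\nthis is My Word indeed"

words = [r"eval\(.*\)", "hello", "my word"]

# Outer group: the whole line. Inner group: the word that matched.
pattern = "(^.*?(" + "|".join(words) + ").*?$)"

matches = re.findall(pattern, code, re.IGNORECASE | re.MULTILINE)

for line, word in matches:
    print("line:", line)
    print("word:", word)
```

Each tuple pairs a full line with the first listed word found on it; lines without any of the words produce no tuple at all.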

NB:

If the outer group is removed, or made non-capturing, like this:

    ^.*?(hello|my word|eval\(.*\)).*?$

or this:

    (?:^.*?(hello|my word|eval\(.*\)).*?$)

there would only be one capturing group. In that case, re.findall returns a flat list of strings (not tuples), each holding only the text captured by that group, i.e. the matched word rather than the whole line.
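
As a quick check of the single-group behaviour, the sketch below runs both variants against a made-up two-line input (the `code` string and the variable names are assumptions for illustration only):

```python
import re

# Hypothetical sample input.
code = "i say hello to u\neval(x) is risky"

words = [r"eval\(.*\)", "hello", "my word"]
alternation = "|".join(words)
flags = re.IGNORECASE | re.MULTILINE

# Variant 1: outer group removed entirely.
one_group = "^.*?(" + alternation + ").*?$"
# Variant 2: outer group kept, but made non-capturing.
non_capturing_outer = "(?:^.*?(" + alternation + ").*?$)"

found_one = re.findall(one_group, code, flags)
found_nc = re.findall(non_capturing_outer, code, flags)

print(found_one)
print(found_nc)
```

Both calls return the same flat list of matched words, so the line itself is no longer available from re.findall with these patterns.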

ekhumoro
0

    import re

    pattern = [
        r"eval\(.*\)",
        "hello",
        "my word"
    ]
    patterns = "|".join(pattern)
    patterns = "^.*?(" + patterns + ").*?$"

    code = "i say hello to u"

    # re.match only tries the pattern at the very start of the string
    m = re.match(patterns, code, re.IGNORECASE | re.MULTILINE | re.UNICODE)

    if m:
        print(m.group())   # the line that matched
        print(m.group(1))  # the word that matched

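You need match instead of findall.

match.group() gives you the whole matched line, and match.group(1) gives you the word. (With the two nested groups from your original pattern, the word would be match.group(2).) Note that re.match only tries the pattern at the start of the string, so it reports at most one match; to get every matching line, iterate over re.finditer with the same pattern.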
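
Because re.match only tries the pattern at the beginning of the string, it cannot report matches on later lines. One possible way to get both the word and its line for every matching line is re.finditer, which yields a match object per match. This is a sketch, not part of the original answer; the helper name `find_words` and the sample text are made up for illustration.

```python
import re

def find_words(code, words):
    """Return (word, line) pairs for every line containing one of the words.

    Hypothetical helper: only the first listed word found on each line
    is reported, because the pattern consumes the whole line per match.
    """
    pattern = "^.*?(" + "|".join(words) + ").*?$"
    flags = re.IGNORECASE | re.MULTILINE
    # group(1) is the matched word, group(0) the whole matched line.
    return [(m.group(1), m.group(0)) for m in re.finditer(pattern, code, flags)]

# Hypothetical sample input: the first line contains none of the words.
code = "first line\ni say hello to u\nmy word, it works"
words = [r"eval\(.*\)", "hello", "my word"]

results = find_words(code, words)
for word, line in results:
    print("word:", word, "| line:", line)

# For comparison: re.match finds nothing here, since the first line has no match.
first_only = re.match("^.*?(" + "|".join(words) + ").*?$", code,
                      re.IGNORECASE | re.MULTILINE)
```

The match objects also expose positions through m.start() and m.span(), which can help if the location of each hit matters.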

vks