match a regular expression with optional lookahead

Question

I have the following strings:

NAME John Nash FROM California

NAME John Nash

I want a regular expression capable of extracting 'John Nash' for both strings.

Here is what I tried

"NAME(.*)(?:FROM)"
"NAME(.*)(?:FROM)?"
"NAME(.*?)(?:FROM)?"

but none of these works for both strings.

Are those both a full line? – Morgan Thrapp, Oct 12 '15 at 22:08

Mazdak · Accepted Answer · 2015-10-12T22:23:46.290

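You can use a logical OR (alternation) between FROM and the end-of-string anchor $, with a lazy quantifier on the capture group:

NAME(.*?)(?:FROM|$)

See demo https://regex101.com/r/rR3gA0/1

Here, after the name, the pattern must match either FROM or the end of the string, and the lazy .*? stops at whichever comes first. The captured group keeps the spaces around the name, so trim it afterwards. In your regexes the FROM group is optional, so nothing forces the capture to stop there: in the first string the greedy .* matches the rest of the string after NAME, and the lazy .*? matches nothing at all.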
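
As a quick check in Python, the sketch below applies the FROM-or-end-of-string idea to both sample strings. It assumes a lazy quantifier (.*?), because a greedy .* runs to the end of the string and satisfies $ directly. The helper name extract_name is made up for illustration.

```python
import re

# Lazy capture, then either the literal FROM or the end of the string.
PATTERN = re.compile(r"NAME(.*?)(?:FROM|$)")

def extract_name(text):
    """Return the text between NAME and FROM (or the end), or None."""
    match = PATTERN.search(text)
    # The group includes the surrounding spaces, so strip them.
    return match.group(1).strip() if match else None

print(extract_name("NAME John Nash FROM California"))  # John Nash
print(extract_name("NAME John Nash"))                  # John Nash

# For comparison, the greedy form swallows the FROM clause as well.
greedy = re.search(r"NAME(.*)(?:FROM|$)", "NAME John Nash FROM California")
print(greedy.group(1).strip())  # John Nash FROM California
```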

If you want a more general regex, it is better to build it around the shapes your names can take. For example, if you are sure that every name consists of exactly two words, you can use the following regex:

NAME\s(\w+\s\w+)

Demo https://regex101.com/r/kV2eB9/2
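
A short Python sketch of the two-word pattern, to show what the two-word assumption implies. The function name two_word_name and the third test string are illustrative additions, not part of the demo.

```python
import re

# Assumes every name is exactly two words separated by one whitespace character.
TWO_WORDS = re.compile(r"NAME\s(\w+\s\w+)")

def two_word_name(text):
    """Return the two words that follow NAME, or None if there is no match."""
    match = TWO_WORDS.search(text)
    return match.group(1) if match else None

print(two_word_name("NAME John Nash FROM California"))  # John Nash
print(two_word_name("NAME John Nash"))                  # John Nash
# A three-word name is cut short, because only two words are captured.
print(two_word_name("NAME John Doe Nash"))              # John Doe
```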

Pedro Lobito · Answer 2 · 2015-10-12T23:04:50.000

Make the second part of the string optional (?: FROM.*?)?, i.e.:

NAME (.*?)(?: FROM.*?)?$

MATCH 1
1.  [5-14]  `John Nash`
MATCH 2
1.  [37-46] `John Nash`
MATCH 3
1.  [53-66] `John Doe Nash`

Regex Demo
https://regex101.com/r/bL7kI2/2
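
The same pattern can be tried in Python. This is a minimal sketch that assumes the three test lines sit in one multi-line string, so re.MULTILINE is used to let $ match at the end of every line. The sample text is reconstructed from the matches listed above and is not copied from the demo.

```python
import re

# MULTILINE makes $ match at the end of each line, not only at the end of the text.
PATTERN = re.compile(r"NAME (.*?)(?: FROM.*?)?$", re.MULTILINE)

sample = "NAME John Nash FROM California\nNAME John Nash\nNAME John Doe Nash"

# With a single capture group, findall returns the captured names directly.
names = PATTERN.findall(sample)
print(names)  # ['John Nash', 'John Nash', 'John Doe Nash']
```

Because the space before FROM is part of the optional group, the captured name needs no trimming.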

edited Oct 12 '15 at 23:04

answered Oct 12 '15 at 22:58

score 0 · Answer 3 · answered Oct 12 '15 at 22:14

The pattern r'^\w+\s+(\w+\s+\w+)' matches a word at the start of the string, followed by one or more spaces, and then captures two words with at least one space between them.

import re

# Skip the first word on each line, then capture the two words that follow it.
with open('data', 'r') as f:
    for line in f:
        mo = re.search(r'^\w+\s+(\w+\s+\w+)', line)
        if mo:
            print(mo.group(1))

John Nash
John Nash

score 0 · Answer 4 · answered Oct 13 '15 at 08:11

You can do this without regex:

>>> myStr = "NAME John Nash FROM California"
>>> myStr.split("FROM")[0].replace("NAME","").strip()
'John Nash'
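
Wrapped in a function, the same expression can be checked against both strings from the question. This is only a sketch, and the function name name_without_regex is made up for illustration.

```python
def name_without_regex(text):
    """Drop everything from FROM onwards, remove the NAME marker, and trim."""
    # split() returns the whole string as the only element when FROM is absent.
    return text.split("FROM")[0].replace("NAME", "").strip()

print(name_without_regex("NAME John Nash FROM California"))  # John Nash
print(name_without_regex("NAME John Nash"))                  # John Nash
```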

answered Oct 13 '15 at 08:11

Mayur Koshti
