5

I have the following strings:

NAME John Nash FROM California

NAME John Nash

I want a regular expression capable of extracting 'John Nash' from both strings.

Here is what I tried

"NAME(.*)(?:FROM)"
"NAME(.*)(?:FROM)?"
"NAME(.*?)(?:FROM)?"

but none of these works for both strings.

Morgan Thrapp
Dayvid Oliveira

4 Answers

5

You can use alternation (logical OR) between FROM and the end-of-string anchor $:

NAME\s*(.*?)\s*(?:FROM|$)

See demo https://regex101.com/r/rR3gA0/1

Here the pattern must match either FROM or the end of the string after the name. In your regexes, FROM is optional, so nothing forces the capture group to stop at FROM: a greedy .* runs to the end of the string, and a lazy .*? stops immediately and captures nothing.
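To make the difference concrete, here is a small Python sketch (assuming Python's re module, which the question does not state explicitly) that prints what each attempt from the question captures. Note that with a greedy .* the $ alternative wins on the first string and the group runs to the end, so the alternation pattern in this sketch uses a lazy .*? and \s* around the group; the helper name capture is made up for illustration.

```python
import re

samples = ["NAME John Nash FROM California", "NAME John Nash"]

# The three attempts from the question.
attempts = [r"NAME(.*)(?:FROM)", r"NAME(.*)(?:FROM)?", r"NAME(.*?)(?:FROM)?"]


def capture(pattern, text):
    """Return group 1 of the first match, or None if there is no match."""
    m = re.search(pattern, text)
    return m.group(1) if m else None


for pattern in attempts:
    print(pattern, [capture(pattern, s) for s in samples])

# Alternation between FROM and the end of the string. The lazy quantifier
# and the \s* around the group are choices made for this sketch.
fixed = r"NAME\s*(.*?)\s*(?:FROM|$)"
print(fixed, [capture(fixed, s) for s in samples])
```

The first attempt fails outright on the string without FROM, the second captures everything after NAME, and the third captures an empty string; only the alternation pattern yields 'John Nash' for both inputs.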

If you want a more general regex, base it on the shapes your names can take. For example, if you are sure that every name consists of two words, you can use the following regex:

NAME\s(\w+\s\w+)

Demo https://regex101.com/r/kV2eB9/2
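A quick way to check the two-word assumption in Python is sketched below. The last two inputs are hypothetical, added only to show where the assumption breaks: a three-word name is cut short, and a one-word name followed by FROM pulls FROM into the capture.

```python
import re

# Two-word pattern from the answer above.
two_words = re.compile(r"NAME\s(\w+\s\w+)")

for text in ["NAME John Nash FROM California", "NAME John Nash"]:
    m = two_words.search(text)
    print(m.group(1) if m else None)

# Hypothetical inputs that violate the two-word assumption.
print(two_words.search("NAME John Doe Nash").group(1))          # first two words only
print(two_words.search("NAME Madonna FROM Michigan").group(1))  # FROM is captured
```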

Mazdak
2

Make the second part of the string optional (?: FROM.*?)?, i.e.:

NAME (.*?)(?: FROM.*?)?$

MATCH 1
1.  [5-14]  `John Nash`
MATCH 2
1.  [37-46] `John Nash`
MATCH 3
1.  [53-66] `John Doe Nash`

Regex Demo
https://regex101.com/r/bL7kI2/2
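In Python, the same pattern could be applied to a multi-line text roughly as follows. This is a sketch: the test text is an assumption (the exact strings used in the demo are not shown here), and re.MULTILINE is needed so that $ matches at the end of every line rather than only at the end of the text.

```python
import re

# Hypothetical test text: the two strings from the question plus a
# three-word name.
text = (
    "NAME John Nash FROM California\n"
    "NAME John Nash\n"
    "NAME John Doe Nash FROM Ohio"
)

# With MULTILINE, $ matches before each newline as well as at the very end.
pattern = re.compile(r"NAME (.*?)(?: FROM.*?)?$", re.MULTILINE)

# findall returns the contents of the single capture group for each match.
names = pattern.findall(text)
print(names)
```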

Pedro Lobito
0
r'^\w+\s+(\w+\s+\w+)' matches a word at the start of the string,
followed by one or more whitespace characters, and then captures
two words separated by at least one whitespace character.

import re

with open('data', 'r') as f:
    for line in f:
        mo = re.search(r'^\w+\s+(\w+\s+\w+)', line)
        if mo:
            print(mo.group(1))

John Nash
John Nash
LetzerWille
0

You can do without regex:

>>> myStr = "NAME John Nash FROM California"
>>> myStr.split("FROM")[0].replace("NAME","").strip()
'John Nash'
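The same idea can be wrapped in a small function. This is only a sketch: extract_name is a made-up name, and it uses str.partition plus a prefix check instead of replace so that only the leading NAME marker is removed, not any later occurrence of those letters.

```python
def extract_name(text):
    """Return the text between a leading 'NAME' and the first 'FROM', if any."""
    # partition returns the whole string as head when "FROM" is absent.
    head, _, _ = text.partition("FROM")
    head = head.strip()
    # Remove only the leading marker.
    if head.startswith("NAME"):
        head = head[len("NAME"):]
    return head.strip()


print(extract_name("NAME John Nash FROM California"))
print(extract_name("NAME John Nash"))
```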
Mayur Koshti