
Problem: Find every run of vowels (two or more) that is sandwiched between two consonants. Such a run may also sit at the beginning or end of the line. Example:

Input:

abaabaabaabaae

Expected output:

['aa','aa','aa','aae']

Solution tried:

import re
pattern=re.compile(r'(?:[^aeiouAEIOU])([AEIOUaeiou]{2,})(?=[^AEIOUaeiou])')
pattern.findall("abaabaabaabaae")

This gives ['aa','aa','aa']. It misses 'aae' because the lookahead requires a consonant after the vowels, and the end of the line is not a character. How can I include the end-of-line anchor ($) in the search as an OR condition, so that the vowels may be followed by either a consonant or the end of the line, without making the end of line mandatory?

sakeesh

2 Answers

You can extract matches of the regular expression

r'(?<=[b-df-hj-np-tv-z])[aeiou]{2,}(?=[b-df-hj-np-tv-z]|$)'

For the following string the matches are indicated.

_abaab_aabaabaaeraaa_babaa%abaa
   ^^     ^^ ^^^             ^^

I found it easiest to explicitly match consonants with the character class

[b-df-hj-np-tv-z]
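
As a minimal sketch of how this expression might be used from Python: the names CONSONANT, question_matches and demo_matches are illustrative only, and the re.IGNORECASE flag is an added assumption so that uppercase letters are handled as in the question's attempt.

```python
import re

# Consonants are spelled out explicitly, so digits, underscores and
# punctuation do not count as consonants.
CONSONANT = "[b-df-hj-np-tv-z]"

# Lookbehind: a consonant must come right before the vowels.
# Lookahead: a consonant OR the end of the string must follow them.
pattern = re.compile(
    rf"(?<={CONSONANT})[aeiou]{{2,}}(?={CONSONANT}|$)",
    flags=re.IGNORECASE,  # assumption: not part of the expression above
)

question_matches = pattern.findall("abaabaabaabaae")
demo_matches = pattern.findall("_abaab_aabaabaaeraaa_babaa%abaa")

print(question_matches)  # ['aa', 'aa', 'aa', 'aae']
print(demo_matches)      # ['aa', 'aa', 'aae', 'aa']
```

The |$ alternative inside the lookahead is what makes the end of the string an optional terminator rather than a required one. For multi-line input, adding re.MULTILINE should let $ match at the end of each line as well.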

Cary Swoveland

I would use re.findall with the pattern (?<=[^\Waeiou])[aeiou]+(?![aeiou]):

import re

inp = "abaabaabaabaae"
matches = re.findall(r'(?<=[^\Waeiou])[aeiou]+(?![aeiou])', inp, flags=re.IGNORECASE)
print(matches)

This prints:

['aa', 'aa', 'aa', 'aae']

Here is an explanation of the regex pattern:

(?<=[^\Waeiou])  assert that what precedes is any word character, excluding a vowel
                 this also excludes the start of the input
[aeiou]+         match one or more vowel characters
(?![aeiou])      assert that what follows is not a vowel (includes end of string)
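
One thing worth checking against the original requirement: as written, this pattern accepts a single vowel, treats digits and underscores as valid preceding characters (they are word characters), and does not require a consonant after the run. The sketch below compares it with a hypothetical stricter variant. The name strict, its pattern, and the sample string are assumptions made for illustration, not part of the answer above.

```python
import re

# Pattern from the answer above: one or more vowels after a non-vowel word character.
loose = re.compile(r"(?<=[^\Waeiou])[aeiou]+(?![aeiou])", flags=re.IGNORECASE)

# Hypothetical stricter variant: at least two vowels, followed by a
# non-vowel word character or the end of the string.
strict = re.compile(r"(?<=[^\Waeiou])[aeiou]{2,}(?=[^\Waeiou]|$)", flags=re.IGNORECASE)

sample = "babaa%1ee_"
loose_matches = loose.findall(sample)
strict_matches = strict.findall(sample)

print(loose_matches)   # ['a', 'aa', 'ee']
print(strict_matches)  # ['ee']

# On the input from the question both patterns agree.
print(strict.findall("abaabaabaabaae"))  # ['aa', 'aa', 'aa', 'aae']
```

In the sample, the loose pattern picks up the lone 'a' and the 'aa' that is followed by '%', while the stricter variant keeps only 'ee', which sits between '1' and '_'. Whether digits and underscores should count as consonants depends on the data, so replacing [^\Waeiou] with an explicit consonant class may be preferable.
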
Tim Biegeleisen