I would like to split a string based on the following rules:

  • if it contains 0 bullets (•), return the whole string
  • if it contains 1 or more bullets, return the text before the first bullet, followed by one group per bullet, each starting with its bullet.

Example:

"Python is: • Great language • Better than Java • From 1991"

Should return 4 groups:

["Python is: ", "• Great language ", "• Better than Java ", "• From 1991"]

I tried using this regex:

re.split('[^•](.+?)[•$]', text)

But the bullet is consumed as part of each match, so once a match ends at a bullet, the next group no longer begins with one.

How can I solve this?

Josh Friedlander
  • `re.findall(r'(?:•|^).*?(?=•|$)', text)`. Or use `re.split(r'(?=•)', text)` which is more "meta". – Wiktor Stribiżew May 22 '23 at 08:35
  • Like this? `•[^•\n]+|[^•\n]+` https://regex101.com/r/cLy0xg/1 – The fourth bird May 22 '23 at 08:35
  • Thanks Wiktor! Yes, it's that positive lookahead which I wasn't aware of. Your second answer is so short and works perfectly. Question has been dupe-hammered (correctly I guess) otherwise would have been happy to award you the points... – Josh Friedlander May 22 '23 at 08:40
  • I am sure [the thread](https://stackoverflow.com/q/61464503/3832970) does provide the necessary details. It is a very common technique, so I do not think an answer on this thread is needed. – Wiktor Stribiżew May 22 '23 at 08:47
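
For reference, here is a minimal sketch of the lookahead split suggested in the comments. It assumes Python 3.7 or later, where `re.split` accepts a pattern that can match an empty string. The helper name `split_on_bullets` is made up for illustration, and dropping empty pieces is an added assumption to handle a string that starts with a bullet.

```python
import re

def split_on_bullets(text):
    """Split text just before each bullet, keeping the bullet with its group."""
    # (?=•) is a zero-width lookahead: it matches the position right before a
    # bullet without consuming it, so the bullet stays in the group that follows.
    parts = re.split(r"(?=•)", text)
    # A string that starts with a bullet yields a leading empty piece; drop it.
    # Fall back to [text] so a bullet-free (or empty) string comes back whole.
    return [p for p in parts if p] or [text]

text = "Python is: • Great language • Better than Java • From 1991"
print(split_on_bullets(text))
# ['Python is: ', '• Great language ', '• Better than Java ', '• From 1991']
```

The original attempt fails because `[•$]` consumes the bullet (and inside a character class `$` is a literal dollar sign, not an end-of-string anchor). The lookahead only checks that a bullet comes next, which leaves it available to start the following group.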
