Splitting on a character which should always be at the beginning of a group

Question

I would like to split a string based on the following rules:

If it contains no bullets (•), return the whole string.
If it contains one or more bullets, return the text before the first bullet, followed by one group per bullet, each starting with its bullet.

Example:

"Python is: • Great language • Better than Java • From 1991"

Should return 4 groups:

["Python is: ", "• Great language ", "• Better than Java ", "• From 1991"]

I tried using this regex:

re.split(r'[^•](.+?)[•$]', text)

But because each match consumes the bullet that ends it, the text that follows is no longer seen as starting with a bullet.

How can I solve this?

`re.findall(r'(?:•|^).*?(?=•|$)', text)`. Or use `re.split(r'(?=•)', text)` which is more "meta". — Wiktor Stribiżew, May 22 '23 at 08:35
Like this? `•[^•\n]+|[^•\n]+` https://regex101.com/r/cLy0xg/1 — The fourth bird, May 22 '23 at 08:35
Thanks Wiktor! Yes, it's that positive lookahead which I wasn't aware of. Your second answer is so short and works perfectly. Question has been dupe-hammered (correctly I guess) otherwise would have been happy to award you the points... — Josh Friedlander, May 22 '23 at 08:40
I am sure [the thread](https://stackoverflow.com/q/61464503/3832970) does provide the necessary details. It is a very common technique, so I do not think an answer on this thread is needed. — Wiktor Stribiżew, May 22 '23 at 08:47
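
For readers landing here, the two suggestions from the first comment can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function names `split_on_bullets` and `find_bullet_groups` are hypothetical, and dropping a leading empty string when the text itself starts with a bullet is an assumption about the desired behavior, since the question does not cover that case.

```python
import re


def split_on_bullets(text):
    """Split text so that every group after the first starts with a bullet."""
    # (?=•) is a zero-width lookahead: it matches the position just before
    # each bullet without consuming it, so the bullet stays in the next group.
    # Splitting on empty matches requires Python 3.7 or later.
    parts = re.split(r"(?=•)", text)
    # Assumption: if the text starts with a bullet, discard the empty
    # leading group that the split would otherwise produce.
    if len(parts) > 1 and parts[0] == "":
        parts = parts[1:]
    return parts


def find_bullet_groups(text):
    """Same idea with findall: each match starts at a bullet or at the start."""
    # The lazy .*? stops right before the next bullet or at the end of the
    # string. Note that '.' does not match newlines unless re.DOTALL is set.
    return re.findall(r"(?:•|^).*?(?=•|$)", text)


text = "Python is: • Great language • Better than Java • From 1991"
print(split_on_bullets(text))
# ['Python is: ', '• Great language ', '• Better than Java ', '• From 1991']
print(split_on_bullets("No bullets here"))
# ['No bullets here']
```

The lookahead is what addresses the problem in the question: because it asserts that a bullet follows without consuming it, each bullet remains available as the first character of the next group.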
