I have a string that can look something like this:
1. "foo bar"
2. "foo bar foo:bar"
3. "foo bar "
4. "foo bar   " (several trailing spaces)
5. "foo bar foo:bar:baz"
I want to split this string so that it would end up with the following results:
1. ['foo', 'bar']
2. ['foo', 'bar', 'foo', ':', 'bar']
3. / 4. ['foo', 'bar', '']
5. ['foo', 'bar', 'foo', ':', 'bar', ':', 'baz']
In other words, following these rules:
1. Split the string on every occurrence of a space.
   a. If there are one or more spaces at the end of the string, add exactly one empty string to the split list.
   b. Any spaces before the last non-space character in the string should be consumed and not added to the split list.
2. Split the string on every occurrence of a colon, and do not consume the colon.
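To make the rules concrete, here is a minimal sketch of one way they could be expressed in Python. It is not a full emulation of Bash's word splitting (it ignores quoting, escapes, and any break characters other than space and colon), and the function name `split_words` is made up for illustration. Instead of splitting on separators, it collects the tokens directly with `re.findall` and then handles trailing spaces as a special case:

```python
import re


def split_words(line):
    """Split `line` per the rules above (hypothetical helper, not Bash's real algorithm)."""
    # A token is either a single colon, or a run of characters
    # that are neither spaces nor colons. Spaces never match,
    # so runs of spaces between words are consumed silently.
    words = re.findall(r":|[^ :]+", line)
    # Rule 1a: any number of trailing spaces adds exactly one empty string.
    if line.endswith(" "):
        words.append("")
    return words


print(split_words("foo bar"))              # ['foo', 'bar']
print(split_words("foo bar foo:bar"))      # ['foo', 'bar', 'foo', ':', 'bar']
print(split_words("foo bar   "))           # ['foo', 'bar', '']
print(split_words("foo bar foo:bar:baz"))  # ['foo', 'bar', 'foo', ':', 'bar', ':', 'baz']
```

Matching tokens rather than separators sidesteps the question of what a split should produce between two adjacent spaces.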
The XY problem is this, in case it's relevant:
I want to mimic Bash tab-completion behaviour. When you type a command into a Bash interpreter, it splits the command into an array, COMP_WORDS, following the rules above: words are split on spaces and colons, each colon is placed in its own array element, and spaces are ignored unless they are at the end of the string. I want to recreate this behaviour in Python, given a string that looks like a command a user would type.
I've seen this question about splitting a string and keeping the separators using re.split, and this question about splitting using multiple delimiters. But my use case is more complicated, and neither question seems to cover it. I tried the following to at least split on spaces and colons:
print(re.split('(:)|(?: )', splitstr))
But even that doesn't work. When splitstr is "foo bar foo:bar", it returns this:
['foo', None, 'bar', None, 'foo', ':', 'bar']
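For context on where the None entries come from: when the pattern passed to re.split contains a capturing group, the group's text is included in the result for every match, and a group that did not take part in a match (here, whenever the separator was a space rather than a colon) shows up as None. A minimal sketch that reproduces the output above and filters those entries out:

```python
import re

splitstr = "foo bar foo:bar"

# Group 1 captures a colon; the space alternative is non-capturing,
# so a match on a space reports group 1 as None.
parts = re.split(r"(:)|(?: )", splitstr)
print(parts)    # ['foo', None, 'bar', None, 'foo', ':', 'bar']

# Dropping the None placeholders keeps the words and the colons.
cleaned = [p for p in parts if p is not None]
print(cleaned)  # ['foo', 'bar', 'foo', ':', 'bar']
```

Filtering only removes the None entries. It does not collapse repeated spaces, so a string with several trailing spaces would still produce more than one empty string.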
Any idea how this could be done in Python?
EDIT: My requirements weren't clear. I would want "foo bar " (with any number of spaces at the end) to return the list ["foo", "bar", ""] (with just one empty string at the end of the list).