18

I'm trying to find the most pythonic way to split a string like

"some words in a string"

into single words. string.split(' ') works, but it returns a bunch of empty-string entries in the list. Of course I could iterate over the list and remove them, but I was wondering whether there is a better way?

jonathan topf

6 Answers

41

Just call my_str.split() with no argument instead of passing ' '.


You can also limit how many splits are performed by passing the second parameter (maxsplit):

>>> ' 1 2 3 4  '.split(None, 2)
['1', '2', '3 4  ']
>>> ' 1 2 3 4  '.split(None, 1)
['1', '2 3 4  ']
K Z
15

How about:

re.split(r'\s+',string)

\s matches any whitespace character, so \s+ matches a run of one or more consecutive whitespace characters.
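One caveat worth checking with the regex approach: when the input has leading or trailing whitespace, re.split keeps an empty string at each end, whereas split() with no argument drops them. A minimal sketch, where the padded sample string is made up for illustration:

```python
import re

padded = "  some   words in a string  "  # hypothetical input with padding

# re.split keeps an empty string at each end when the input is padded
regex_parts = re.split(r"\s+", padded)

# stripping the string first avoids those empty entries
stripped_parts = re.split(r"\s+", padded.strip())

# str.split() with no argument discards them on its own
plain_parts = padded.split()

print(regex_parts)
print(stripped_parts)
print(plain_parts)
```

So if the string may be padded, either strip it first or prefer the plain split().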

codaddict
8

Use string.split() without an argument or re.split(r'\s+', string) instead:

>>> s = 'some words in a string   with  spaces'
>>> s.split()
['some', 'words', 'in', 'a', 'string', 'with', 'spaces']
>>> import re; re.split(r'\s+', s)
['some', 'words', 'in', 'a', 'string', 'with', 'spaces']

From the docs:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns [].
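The edge cases in that passage can be checked directly. This is only a small sketch; the variable names are illustrative and not from the docs:

```python
# No separator: empty or whitespace-only input gives an empty list
empty_default = "".split()
blank_default = "   \t\n ".split()

# Explicit separator: the same inputs still produce empty strings
empty_explicit = "".split(" ")
blank_explicit = "  ".split(" ")

print(empty_default, blank_default)
print(empty_explicit, blank_explicit)
```

This difference is why split(' ') leaves empty entries in the list while split() does not.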

ThiefMaster
3
>>> a = "some words in a string"
>>> a.split(" ")
['some', 'words', 'in', 'a', 'string']

The separator is not included in the result, so I guess there is something more going on with your string; otherwise, this should work.

If the words can be separated by more than one space, just use split() without parameters:

>>> a = "some words in a string     "
>>> a.split()
['some', 'words', 'in', 'a', 'string']
>>> a.split(" ")
['some', 'words', 'in', 'a', 'string', '', '', '', '', '']

With an explicit " " separator, a is split at every single space, so consecutive spaces produce empty strings.

Samuele Mattiuzzo
2

The most Pythonic and correct way is to just not specify any delimiter:

"some words in a string".split()

# => ['some', 'words', 'in', 'a', 'string']

Also read: How can I split by 1 or more occurrences of a delimiter in Python?

flppv
0
text = "".join([w and w+" " for w in text.split(" ")])

This collapses runs of spaces into single spaces. Note that the result ends with a trailing space, because every word gets " " appended.
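To see what the one-liner does, and how it compares with splitting and rejoining, here is a rough sketch. The sample text is made up, and the " ".join(text.split()) variant is an alternative rather than part of this answer; unlike the one-liner, it also collapses tabs and newlines:

```python
text = "some   words  in a string"  # hypothetical input

# The one-liner: empty pieces are falsy and contribute "",
# while every real word gets a trailing space appended
collapsed = "".join([w and w + " " for w in text.split(" ")])

# Alternative: split on any whitespace, then rejoin with single spaces
rejoined = " ".join(text.split())

print(repr(collapsed))
print(repr(rejoined))
```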

yet