The situation is as follows. Given this piece of code:
import re
content = ''
count = len(re.split(r'\W+', content, flags=re.UNICODE))
print(count)
# Output is expected to be 0, as it has no words
# Instead output is 1
What is going wrong? All other word counts are correct.
EDIT: It also happens when we use a string content = '..' or content = '.!', so this is NOT a problem related in any sense to Python's split() function, but to the regular expressions from re.
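To make the behaviour described above easier to inspect, here is a minimal sketch that prints what re.split actually returns for these inputs. re.split always returns at least one element, and it yields empty strings wherever a separator touches the start or end of the input, so len() of its result counts those empty pieces too. The helper count_words is a hypothetical name introduced only for this illustration; counting matches of \w+ with re.findall is one possible alternative, not necessarily the fix referred to elsewhere in this post.

```python
import re


def count_words(content):
    # Hypothetical helper: count runs of word characters directly
    # instead of counting the pieces left over after splitting.
    return len(re.findall(r'\w+', content, flags=re.UNICODE))


for content in ['', '..', '.!', 'hello, world']:
    # Pieces produced by splitting on runs of non-word characters;
    # empty strings appear when the input is empty or starts/ends
    # with a separator.
    parts = re.split(r'\W+', content, flags=re.UNICODE)
    print(repr(content), parts, len(parts), count_words(content))
```

Filtering the empty strings out of the re.split result, for example with [p for p in parts if p], should give the same counts as count_words for these inputs.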
IMPORTANT NOTE: Although the solution I gave works in my particular case, the correct solution has not yet been found, because this is a regex issue which isn't yet 100% SOLVED!