You can use Python's re module to split your string and pick out the substrings made up of one or more digits.
import re
pattern = re.compile("([0-9]+)")
s = "foo bar Jan-01 03-56, blah"
toks = pattern.split(s)
# toks is ['foo bar Jan-', '01', ' ', '03', '-', '56', ', blah']
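If you only need the digit runs and not the text between them, re.findall may be simpler than split. This is a minimal sketch on the same sample string; the names digit_runs and numbers are just illustrative.

```python
import re

s = "foo bar Jan-01 03-56, blah"

# findall returns only the matched digit runs, dropping the text in between
digit_runs = re.findall(r"[0-9]+", s)       # ['01', '03', '56']

# convert each run with an explicit base so leading zeros are read as decimal
numbers = [int(d, 10) for d in digit_runs]  # [1, 3, 56]
```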
If your format is exactly "MMM-DD,YYYY", then you can use something like this (adapted from the question). I assume you are trying to extract the day number from it.
def get_day_number(line):
    month_day, year = line.split(",", 1)  # '1' splits at most once
    month, day = month_day.split("-", 1)
    return int(day, 10)
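If the input might not always match that format, a regular expression can validate it and pull out all three fields at once. The sketch below is an assumption about what you want: parse_date, DATE_RE, and the sample line "Jan-08,2010" are hypothetical and not taken from the question.

```python
import re

# Hypothetical pattern: three letters, a dash, two digits, a comma, four digits
DATE_RE = re.compile(r"^([A-Za-z]{3})-([0-9]{2}),([0-9]{4})$")

def parse_date(line):
    """Return (month, day, year) for a line in MMM-DD,YYYY format."""
    match = DATE_RE.match(line.strip())
    if match is None:
        raise ValueError("not in MMM-DD,YYYY format: %r" % line)
    month, day, year = match.groups()
    # explicit base 10 so that "08" or "09" is read as a decimal number
    return month, int(day, 10), int(year, 10)

result = parse_date("Jan-08,2010")  # ('Jan', 8, 2010)
```

Raising ValueError on a mismatch keeps malformed lines from being silently misparsed, which the plain split version would not catch.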
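The octal number problem you mention does not come from the splitting itself. It only appears when a string such as "010" is interpreted as a Python literal (in Python 2, eval("010") and int("010", 0) both give 8); a plain int(s) call already uses base 10. Specifying the base explicitly is still good practice in Python, since it makes the intent obvious.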
s = "010"
i = int(s, 10)
print(i)  # 10
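To see how the base argument changes the result for a string with a leading zero, here is a small sketch. The variable names are illustrative only.

```python
s = "010"

as_default = int(s)      # base defaults to 10, so this is 10
as_decimal = int(s, 10)  # same result, with the intent spelled out
as_octal = int(s, 8)     # only an explicit base 8 reads it as octal: 8

# "08" is not a valid octal number, so base 8 rejects it
try:
    int("08", 8)
    octal_error = False
except ValueError:
    octal_error = True
```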