And how do I make it do that?
Right now it stops at line breaks (for example, right after "Chicago,"). Alternatively, if I use DOTALL, it matches "Abbott A (1988)" and then the rest of the string to the very end. I would like it to stop at the next occurrence of (([\w\s]+)\(([1|2]\d{3})\)), that is, just before "Albu OB and Flyverbom M (2016)", and likewise for each following entry.
Any pointers welcome.
pattern = r"(([\w\s]+)\(([1|2]\d{3})\))(.*)"
Sample string:
"Abbott A (1988) The System of Professions: An Essay on the Division of Expert Labor. Chicago,
IL: University of Chicago Press.
Albu OB and Flyverbom M (2016) Organizational transparency: conceptualizations, con-
ditions, and consequences. Business & Society. Epub ahead of print 13 July. DOI:
10.1177/0007650316659851.
Ananny M (2016) Toward an ethics of algorithms: convening, observation, probability, and timeli-
ness. Science, Technology & Human Values 41(1): 93–117. DOI: 10.1177/0162243915606523."
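One possible direction, offered only as a sketch: combine DOTALL with a lazy `.*?` and a lookahead, so the body of each entry stops just before the next "Authors (year)" header or at the end of the string. The names `HEADER`, `ENTRY`, and `entries` are made up for illustration. The sketch assumes every entry starts at the beginning of a line and that author names contain only word characters and spaces (no hyphens, apostrophes, or periods). It also uses `[12]` instead of `[1|2]`, because inside a character class `|` is a literal pipe, not alternation.

```python
import re

text = """Abbott A (1988) The System of Professions: An Essay on the Division of Expert Labor. Chicago,
IL: University of Chicago Press.
Albu OB and Flyverbom M (2016) Organizational transparency: conceptualizations, con-
ditions, and consequences. Business & Society. Epub ahead of print 13 July. DOI:
10.1177/0007650316659851.
Ananny M (2016) Toward an ethics of algorithms: convening, observation, probability, and timeli-
ness. Science, Technology & Human Values 41(1): 93–117. DOI: 10.1177/0162243915606523."""

# Hypothetical helper: what an "Authors (year)" header looks like.
HEADER = r"[\w ]+\([12]\d{3}\)"

# MULTILINE lets ^ match at the start of every line.
# DOTALL lets . cross line breaks.
# The lazy .*? grows only until the lookahead sees the next header
# at a line start, or the end of the string (\Z).
ENTRY = re.compile(
    r"^(?P<authors>[\w ]+)\((?P<year>[12]\d{3})\)(?P<rest>.*?)(?=^" + HEADER + r"|\Z)",
    re.MULTILINE | re.DOTALL,
)

entries = [
    # Collapse line breaks and repeated spaces inside the body.
    (m["authors"].strip(), m["year"], " ".join(m["rest"].split()))
    for m in ENTRY.finditer(text)
]

for authors, year, rest in entries:
    print(authors, year, rest, sep=" | ")
```

The lookahead does not consume anything, so the next header is still available as the start of the following match. Continuation lines such as "IL: University of Chicago Press." are not mistaken for headers because they do not have a parenthesised four-digit year directly after the leading run of word characters and spaces.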