Let's say I have a string that looks like so:
text = '''
{"question":"In 2017, what was the approximate number of clinics in the US that provided abortion services?","category":"RFB","answers":["80","800","8000","80000"],"sources":["https://www.guttmacher.org/fact-sheet/induced-abortion-united-states"]}
{"question":"Compared to actively religious US adults, how many unaffiliated US adults were active in non-religious voluntary organizations, such as charities?","category":"DFB","answers":["Slightly fewer (10% difference)","Slightly more (10% difference)","Many fewer (35% difference)","Many more (35% difference)"],"sources":["https://www.pewforum.org/2019/01/31/religions-relationship-to-happiness-civic-engagement-and-health-around-the-world/"]}
{"question":"In the US in 2015, there were ___ abortions per 1000 live births.","category":"DFB","answers":["12","80","124","188"],"sources":["https://www.cdc.gov/mmwr/volumes/67/ss/ss6713a1.htm?s_cid=ss6713a1_w"]}'''
I would like to convert this string into a Python dictionary with the keys "question", "category", "answers", and "sources". The "question" and "category" values will always be plain text, whereas "answers" and "sources" will be in a list-like format with brackets.
I assume this will require a regex, as in this answer, with something of the form dictionary = dict(re.findall(r"\{(\S+)\s+\{*(.*?)\}+", text)), but I can't quite get it to match all the keys I need.
Any thoughts?
The identified "duplicate" link doesn't solve my problem. I get an "invalid syntax" error when using dictionary = ast.literal_eval(text), because I haven't successfully demarcated all the separate dictionaries from the string.
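One way to handle the demarcation, as a minimal sketch: it assumes every non-empty line of the string is a self-contained JSON object, which is how the sample above looks. Instead of a regex, it splits on line breaks and parses each line with the standard library's json.loads. The shortened sample string, the example.com URLs, and the names sample and records are placeholders for illustration only.

```python
import json

# Shortened stand-in for the string above.
# Assumption: one complete JSON object per non-empty line.
sample = '''
{"question":"Q1?","category":"RFB","answers":["80","800"],"sources":["https://example.com/a"]}
{"question":"Q2?","category":"DFB","answers":["12","80"],"sources":["https://example.com/b"]}'''

# Split on line breaks, skip blank lines, and parse each line on its own.
records = [json.loads(line) for line in sample.splitlines() if line.strip()]

print(len(records))            # number of dictionaries parsed
print(records[0]["category"])  # plain-text value
print(records[1]["answers"])   # list value
```

Under that assumption, each element of records is an ordinary dict with the keys "question", "category", "answers", and "sources", so the result is a list of dictionaries rather than a single one.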