The original text file consists of multiple lines in the following format:

Q1:
Number of responses: 100
Answers:
A. Python
B. Java
C. JavaScript

Q2:
...

What I did:

import re

file = 'file.txt'

with open(file, "r", encoding='utf-8-sig') as f:
    text = f.read()

# Split the text into lines and drop the empty ones
textList = [i for i in text.splitlines() if i != ""]
length = len(textList)
flag = length * [0]

# 1 marks a question header such as "Q1:", 2 marks the "Answers:" line
pattern = re.compile(r'Q\d+:')
for i in range(length):
    if pattern.fullmatch(textList[i]):
        flag[i] = 1
    if textList[i] == 'Answers:':
        flag[i] = 2

How can I convert this into a JSON format like the following?

{
    "Q1": {
        "Number of responses": 100,
        "Answer": ["A. Python", "B. Java", "C. JavaScript"]
    },
    "Q2": {
        ...
    }
}

2 Answers

Assuming that your individual questions are always separated by a blank line (two consecutive newlines), you can split on that:

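# read the whole file into one string
with open('file.txt', encoding='utf-8-sig') as f:
    data = f.read()

# make a dictionary
answers = dict()

# split at double newlines to get individual questions
# (strip first so a trailing blank line does not yield an empty block)
for block in data.strip().split('\n\n'):
    # split each question into lines,
    # take the 1st line, 2nd line, 3rd line, and all the rest
    question, responses, _, *ans = block.splitlines()

    # and add it to the dict
    answers[question] = ans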

Result:

{'Question1: What do you do for fun?': ['A. Watching movies',
  'B. Doing sports',
  'C. Chat with friends'],
 'Question2: Why?': ['A. Foo', 'B. Bar', 'C. Foobar']}
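
The result above keeps only the answer lines. If the response count and the nested layout from the question are also wanted, the same split could be extended along these lines. This is only a sketch: the sample text, the `parse_blocks` name, and the `"Answer"` key are assumptions taken from the layout in the question, not part of the original answer.

```python
import json

# Hypothetical sample text in the layout shown in the question.
data = """Q1:
Number of responses: 100
Answers:
A. Python
B. Java
C. JavaScript

Q2:
Number of responses: 50
Answers:
A. Foo
B. Bar
"""


def parse_blocks(text):
    """Parse blank-line-separated question blocks into a nested dict."""
    result = {}
    for block in text.strip().split("\n\n"):
        # 1st line: label, 2nd: response count, 3rd: 'Answers:', rest: answers
        label, responses, _, *answers = block.splitlines()
        # "Number of responses: 100" -> ("Number of responses", ":", " 100")
        key, _, value = responses.partition(":")
        result[label.rstrip(":")] = {
            key.strip(): int(value),
            "Answer": answers,
        }
    return result


parsed = parse_blocks(data)
print(json.dumps(parsed, indent=4))
```

`json.dumps` is what turns the Python dict into actual JSON text (double quotes, no trailing commas); `json.dump(parsed, f)` would write it straight to an open file instead.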
fsimonjetz

You can try a regex:

# Assume the file content is as follows:

# Question1: What do you do for fun?
# Number of responses: 100
# Answers:
# A. Watching movies
# B. Doing sports
# C. Chat with friends

# Question2: What do you do for fun?
# Number of responses: 100
# Answers:
# A. Watching movies
# B. Doing sports
# C. Chat with friends

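import re

with open('file.txt') as f:
    data = f.read()

output = {}

# Each match yields (question label, block of answer lines).
# The trailing \n? lets the last answer match even if the file
# does not end with a newline.
for label, answers in re.findall(r'^(Question\d+).*\n.*\nAnswers:\n((?:^\w[\w. ]+\n?)+)', data, re.MULTILINE):
    output[label] = answers.strip().split('\n')

print(output)

Result:

{'Question1': ['A. Watching movies', 'B. Doing sports', 'C. Chat with friends'], 'Question2': ['A. Watching movies', 'B. Doing sports', 'C. Chat with friends']}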
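
The regex approach can also capture the response count by using named groups, which gets closer to the structure asked for in the question. The following is a rough sketch, not the original answer's code: the sample string, the group names (`label`, `count`, `answers`), and the `"Answer"` key are all assumptions for illustration.

```python
import re

# Hypothetical sample; the last line deliberately has no trailing newline.
sample = (
    "Question1: What do you do for fun?\n"
    "Number of responses: 100\n"
    "Answers:\n"
    "A. Watching movies\n"
    "B. Doing sports\n"
    "C. Chat with friends\n"
    "\n"
    "Question2: Why?\n"
    "Number of responses: 42\n"
    "Answers:\n"
    "A. Foo\n"
    "B. Bar"
)

pattern = re.compile(
    r"^(?P<label>Question\d+).*\n"                     # question header line
    r"Number of responses:\s*(?P<count>\d+)\n"         # response count
    r"Answers:\n"
    r"(?P<answers>(?:^.+$\n?)+)",                      # consecutive non-empty lines
    re.MULTILINE,
)

output = {
    m["label"]: {
        "Number of responses": int(m["count"]),
        "Answer": m["answers"].splitlines(),
    }
    for m in pattern.finditer(sample)
}

print(output)
```

Because `.` does not match a newline, the `answers` group stops at the first blank line, so each question only picks up its own answer lines.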
Epsi95