
I have a problem: I can't find a fast algorithm to turn a set of integer values into an array of 0s and 1s. For example, this is a portion of the text file from the exercise:

magazzino n. 0 venditori 5, 9, 12, 19, 16, 18, 27 costo: 447
magazzino n. 1 venditori 21, 25 costo: 722
magazzino n. 2 venditori 6, 12 costo: 570
magazzino n. 3 venditori 5, 28, 7, 17, 29, 25, 10 costo: 936
magazzino n. 4 venditori 5, 28, 0, 17, 20, 22, 27, 4 costo: 635

I need to consider only the values after the word "venditori" and before the word "costo". From these I want to generate a matrix with a 1 at each listed position, so for the first row a 1 at positions 5, 9, 12, 19, 16, 18, 27, like [0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, and so on.

Can someone suggest a fast method to do this?

Andrej Kesely
Alba

1 Answer


Try:

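import re
from ast import literal_eval

# Capture everything between "venditori" and "costo" (digits, commas, whitespace).
pat = re.compile(r"venditori([\s\d,]+)costo")

out = []
with open("your_file.txt", "r") as f_in:
    for line in f_in:
        m = pat.search(line)
        if m:
            # "{5, 9, 12}" is evaluated as a set literal, so membership tests are fast.
            numbers = literal_eval("{" + m.group(1) + "}")
            # Each row is only as long as its own largest index + 1.
            out.append([int(i in numbers) for i in range(max(numbers) + 1)])

print(out)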

Prints:

[
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1], 
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1], 
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1], 
[0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1], 
[1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1]
]
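
Note that the rows printed above have different lengths, because each one stops at its own largest index. If a rectangular matrix is needed, one possible variant is sketched below. It is only a sketch under a few assumptions: the helper names `parse_sellers` and `to_matrix` are made up for illustration, the sample lines are copied from the question instead of being read from a file, and the width defaults to the largest index seen plus one, since the question does not state the total number of sellers.

```python
import re

# Sample lines copied from the question; a real run would read them from the file.
SAMPLE = """\
magazzino n. 0 venditori 5, 9, 12, 19, 16, 18, 27 costo: 447
magazzino n. 1 venditori 21, 25 costo: 722
magazzino n. 2 venditori 6, 12 costo: 570
"""

# Same pattern as in the answer: the text between "venditori" and "costo".
PATTERN = re.compile(r"venditori([\s\d,]+)costo")


def parse_sellers(lines):
    """Return one set of seller indices per line that matches PATTERN."""
    rows = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            # Pull out the digit runs directly instead of evaluating a literal.
            rows.append({int(tok) for tok in re.findall(r"\d+", m.group(1))})
    return rows


def to_matrix(rows, width=None):
    """Build equal-length 0/1 rows; width defaults to the largest index + 1."""
    if width is None:
        width = max((max(r) for r in rows if r), default=-1) + 1
    matrix = []
    for r in rows:
        row = [0] * width
        for i in r:
            row[i] = 1  # raises IndexError if an index is >= width
        matrix.append(row)
    return matrix


matrix = to_matrix(parse_sellers(SAMPLE.splitlines()))
for row in matrix:
    print(row)
```

Filling a preallocated row of zeros touches only the listed indices, which may be a little faster than testing every position for membership. If the real number of sellers is known, passing it as `width` keeps the matrix shape stable even when the highest-numbered seller does not appear in the file.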
Andrej Kesely