
I have a file in the DIMACS CNF format that I need to convert into the input format expected by a SAT solver.

Specifically, I need to convert:

['c horn? no', 'c forced? no', 'c mixed sat? no', 'c clause length = 3', 'c', 'p cnf 20  91', '4 -18 19 0', '3 18 -5 0', '-5 -8 -15 0', '-20 7 -16 0']

to

[[4,-18,19,0], [3,18,-5,0],[-5,-8,-15,0],[-20,7,-16,0]]

Thanks for the help!

user3357979
  • Have you looked at using the http://pydoc.net/Python/sympy/0.7.1/sympy.logic.utilities.dimacs/ module to load the file? – erik-e Mar 06 '15 at 00:29
  • A typical DIMACS parser loops through all lines, ignores the 'c' comment lines, extracts the number of clauses and variables from the 'p' line, and finally splits all remaining clause lines into arrays of literals. It does not make sense to store the final '0', as this is just an end-of-clause marker. – Axel Kemper Mar 06 '15 at 09:43
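
The steps described in the comment above can be sketched roughly as follows. This is only an illustration: the function name `parse_dimacs_lines` is hypothetical, and it assumes exactly one clause per line, which is the simplification the comment describes.

```python
def parse_dimacs_lines(lines):
    """Hypothetical helper: parse DIMACS CNF lines, assuming one clause per line."""
    num_vars = num_clauses = None
    clauses = []
    for line in lines:
        tokens = line.split()
        if not tokens or tokens[0] == 'c':
            continue  # skip blank lines and comment lines
        if tokens[0] == 'p':
            # problem line: "p cnf <variables> <clauses>"
            num_vars, num_clauses = int(tokens[2]), int(tokens[3])
            continue
        # drop the trailing 0, which only marks the end of the clause
        clauses.append([int(t) for t in tokens if t != '0'])
    return num_vars, num_clauses, clauses


in_data = ['c horn? no', 'c forced? no', 'c mixed sat? no', 'c clause length = 3', 'c', 'p cnf 20  91', '4 -18 19 0', '3 18 -5 0', '-5 -8 -15 0', '-20 7 -16 0']
num_vars, num_clauses, clauses = parse_dimacs_lines(in_data)
print(num_vars, num_clauses)
print(clauses)
```

Returning the header counts alongside the clauses makes it possible to compare them against what was actually read, although this sketch does not perform that check.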

1 Answer


As a quick hack, you can simply use:

in_data = ['c horn? no', 'c forced? no', 'c mixed sat? no', 'c clause length = 3', 'c', 'p cnf 20  91', '4 -18 19 0', '3 18 -5 0', '-5 -8 -15 0', '-20 7 -16 0']
# skip blank lines, comment lines ('c') and the problem line ('p')
out_data = [[int(n) for n in line.split()] for line in in_data if line.strip() and not line.startswith(('c', 'p'))]
print(out_data)

which will output:

[[4, -18, 19, 0], [3, 18, -5, 0], [-5, -8, -15, 0], [-20, 7, -16, 0]]

However, you might want to use something like

out_data = [[int(n) for n in line.split() if n != '0'] for line in in_data if line.strip() and not line.startswith(('c', 'p'))]

instead to remove the terminating zeros from the clauses:

[[4, -18, 19], [3, 18, -5], [-5, -8, -15], [-20, 7, -16]]

But a real DIMACS parser should actually use the terminating zero to delimit clauses, instead of assuming one clause per line. So here is a proper DIMACS parser:

in_data = ['c horn? no', 'c forced? no', 'c mixed sat? no', 'c clause length = 3', 'c', 'p cnf 20  91', '4 -18 19 0', '3 18 -5 0', '-5 -8 -15 0', '-20 7 -16 0']

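cnf = [[]]   # list of clauses; the last entry is the clause currently being built
maxvar = 0   # largest variable index seen so far

for line in in_data:
    tokens = line.split()
    # skip blank lines, the problem line ("p") and comment lines ("c")
    if len(tokens) != 0 and tokens[0] not in ("p", "c"):
        for tok in tokens:
            lit = int(tok)
            maxvar = max(maxvar, abs(lit))
            if lit == 0:
                # a zero terminates the current clause, so start a new one
                cnf.append(list())
            else:
                cnf[-1].append(lit)

# the last clause must have been closed by a terminating zero
assert len(cnf[-1]) == 0
cnf.pop()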

print(cnf)
print(maxvar)
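
To illustrate why the terminating zero matters, here is a small sketch that wraps the same token-based loop in a function and feeds it input in which one clause spans two lines and two clauses share a line. The function name `parse_dimacs_tokens` and the example input are made up for this illustration; raising `ValueError` instead of using `assert` is also just one possible choice.

```python
def parse_dimacs_tokens(lines):
    """Hypothetical wrapper around the token-based loop shown above."""
    cnf = [[]]
    maxvar = 0
    for line in lines:
        tokens = line.split()
        if tokens and tokens[0] not in ("p", "c"):
            for tok in tokens:
                lit = int(tok)
                maxvar = max(maxvar, abs(lit))
                if lit == 0:
                    cnf.append([])  # a 0 closes the current clause
                else:
                    cnf[-1].append(lit)
    if cnf[-1]:
        raise ValueError("last clause is missing its terminating 0")
    cnf.pop()  # drop the empty placeholder left after the final 0
    return cnf, maxvar


# Made-up input: the first clause spans two lines, the next two share a line.
example = ['p cnf 5 3', '1 -2', '3 0', '-4 5 0 2 -5 0']
cnf, maxvar = parse_dimacs_tokens(example)
print(cnf)
print(maxvar)
```

A line-based approach would split this input into the wrong clauses, whereas the token-based loop recovers three clauses regardless of where the line breaks fall.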
CliffordVienna