I'm trying to find the longest common sequence in a text file that contains lines of strings. The output should also be a text file, with the lines aligned as in this example:

found sequence: efghijk

output file:

abcdefghijklmno     
  dfefghijkrumlp    
 swrefghijkawsfce   
wsveefghijksxl  

I'm thinking about using difflib: save the lines to a list, compare list[0] and list[1] to find the longest common sequence of those two strings, then call difflib.SequenceMatcher(None, sequence, list[2]), and so on for the remaining lines.

But I'm having trouble coding this, and I have no idea how to produce the output file.

Thanks for advice, Jan

1 Answer

Printing the output is fairly easy. Suppose you already have the index at which the longest common substring begins in each string; for your example that is [4, 2, 3, 4]. Then indent string i by max(positions) - positions[i] spaces, which lines up the common substring in every row.

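strings = ("abcdefghijklmno", "dfefghijkrumlp", "swrefghijkawsfce", "wsveefghijksxl")
positions = (4, 2, 3, 4)  # index where "efghijk" starts in each string

# The largest start index determines the column the match is aligned to.
maxpos = max(positions)

for s, pos in zip(strings, positions):
    # Pad each string so that its match starts at column maxpos.
    print(" " * (maxpos - pos) + s)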
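The snippet above assumes the positions are already known. If you also need to compute them, here is a minimal sketch of one way to do it. Note that it does not use difflib: it brute-forces the substrings of the shortest line, which is simple and should be fine for short lines but slow for long ones. The function names and the file paths in the usage note are my own, not anything from your code.

```python
def longest_common_substring(strings):
    """Return the longest substring present in every string ('' if none)."""
    if not strings:
        return ""
    # A common substring cannot be longer than the shortest string.
    shortest = min(strings, key=len)
    # Try candidate lengths from longest to shortest so the first hit wins.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in strings):
                return candidate
    return ""


def align_on_common(strings):
    """Return (common substring, lines indented so the match lines up)."""
    common = longest_common_substring(strings)
    # str.find gives the first occurrence of the match in each line.
    positions = [s.find(common) for s in strings]
    maxpos = max(positions, default=0)
    aligned = [" " * (maxpos - p) + s for p, s in zip(positions, strings)]
    return common, aligned


def align_file(in_path, out_path):
    """Read lines from in_path and write the aligned lines to out_path."""
    with open(in_path) as src:
        lines = [line.strip() for line in src if line.strip()]
    common, aligned = align_on_common(lines)
    with open(out_path, "w") as dst:
        dst.write("\n".join(aligned) + "\n")
    return common


strings = ["abcdefghijklmno", "dfefghijkrumlp", "swrefghijkawsfce", "wsveefghijksxl"]
common, aligned = align_on_common(strings)
print(common)
print("\n".join(aligned))
```

For files, something like align_file("input.txt", "output.txt") would do the same thing end to end (both file names are placeholders). If several common substrings share the maximum length, this sketch just returns the first one found in the shortest line.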
Danstahr