
I am reading the Python difflib documentation. According to it, each line of difflib.Differ output starts with one of these two-character codes:

Code    Meaning
'- '    line unique to sequence 1
'+ '    line unique to sequence 2
'  '    line common to both sequences
'? '    line not present in either input sequence
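To see these codes in practice, here is a minimal sketch that keeps only the two-character prefix of every line Differ produces. The sample lists and the helper name `prefixes` are made up for illustration.

```python
import difflib

def prefixes(a, b):
    """Return the two-character code of every line Differ emits for a vs. b."""
    return [line[:2] for line in difflib.Differ().compare(a, b)]

old = ["apple\n", "banana\n", "cherry\n"]
new = ["apple\n", "bananas\n", "date\n"]

# Every entry is one of '  ', '- ', '+ ' or '? '
print(prefixes(old, new))
```

Comparing a list with itself should give only the '  ' code, since every line is common to both sequences.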

I have also read the Stack Overflow question "Python Difflib - How to Get SDiff Sequences with "Change" Op", but I am not able to add a comment on Sнаđошƒаӽ's answer there.

I don't know much about Perl's sdiff, but I need to adjust this function:

import difflib

def sdiffer(s1, s2):
    differ = difflib.Differ()
    diffs = list(differ.compare(s1, s2))

    i = 0
    sdiffs = []
    length = len(diffs)
    while i < length:
        line = diffs[i][2:]
        if diffs[i].startswith('  '):
            sdiffs.append(('u', line))

        elif diffs[i].startswith('+ '):
            sdiffs.append(('+', line))

        elif diffs[i].startswith('- '):
            if i+1 < length and diffs[i+1].startswith('? '): # then diffs[i+2] starts with ('+ '), obviously
                sdiffs.append(('c', line))
                i += 3 if i + 3 < length and diffs[i + 3].startswith('? ') else 2

            elif i+2 < length and diffs[i+1].startswith('+ ') and diffs[i+2].startswith('? '):
                sdiffs.append(('c', line))
                i += 2
            else:
                sdiffs.append(('-', line))
        i += 1
    return sdiffs

so that it behaves like the PHP Diff class.

I have tried the function above, and it returns UNCHANGED, ADDED and DELETED values. DELETED is the more complex one, with four different cases:

Case 1: The line is modified by inserting some characters

- The good bad
+ The good the bad
?          ++++

Case 2: The line is modified by deleting some characters

- The good the bad
?          ----
+ The good bad

Case 3: The line is modified by deleting and inserting and/or replacing some characters:

- The good the bad and ugly
?      ^^ ----
+ The g00d bad and the ugly
?      ^^          ++++

Case 4: The line is deleted

- The good the bad and the ugly
+ Our ratio is less than 0.75!
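As far as I understand, whether Differ pairs two lines as one modified line (cases 1 to 3) or reports a separate delete and insert (case 4) depends on a similarity score of about 0.75. Below is a rough sketch of that check. It assumes that `SequenceMatcher.ratio()` approximates the score Differ uses internally. The helper name `looks_modified` and the `CUTOFF` constant are my own, not part of difflib.

```python
import difflib

CUTOFF = 0.75  # assumed threshold, taken from the "less than 0.75" case above

def looks_modified(old_line, new_line, cutoff=CUTOFF):
    """Return True when the two lines are similar enough to count as one edited line."""
    ratio = difflib.SequenceMatcher(None, old_line, new_line).ratio()
    return ratio > cutoff

# Case 1 style pair: most characters are shared
print(looks_modified("The good bad", "The good the bad"))
# Case 4 style pair: very few characters are shared
print(looks_modified("The good the bad and the ugly", "Our ratio is less than 0.75!"))
```

The first pair should come out as modified and the second as unrelated, which matches cases 1 and 4.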

I don't know how to tweak this part of the code

elif diffs[i].startswith('- '):
        if i+1 < length and diffs[i+1].startswith('? '): # then diffs[i+2] starts with ('+ '), obviously
            sdiffs.append(('c', line))
            i += 3 if i + 3 < length and diffs[i + 3].startswith('? ') else 2

        elif i+2 < length and diffs[i+1].startswith('+ ') and diffs[i+2].startswith('? '):
            sdiffs.append(('c', line))
            i += 2
        else:
            sdiffs.append(('-', line))

so that it skips the '? ' lines. I just want to append ('-') alone when no new line was inserted, and to append ('+') as well when a new line was inserted.
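One possible way to get that behaviour, as a minimal sketch rather than a drop-in replacement: ignore every '? ' hint line and pass the other codes through unchanged. The function name `sdiffer_simple` is hypothetical.

```python
import difflib

def sdiffer_simple(s1, s2):
    """Return (code, line) pairs with code 'u', '-' or '+', dropping '? ' hint lines."""
    result = []
    for entry in difflib.Differ().compare(s1, s2):
        code, text = entry[:2], entry[2:]
        if code == '? ':
            continue  # intraline hints are not part of either input
        result.append(('u' if code == '  ' else code[0], text))
    return result

print(sdiffer_simple(["The good bad\n"], ["The good the bad\n"]))
```

With this approach a modified line always shows up as a '-' entry followed by a '+' entry, so the four cases no longer need separate handling.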

1 Answer


I think I have managed to get what I wanted: output like the PHP Diff class.

import difflib

def sdiffer(s1, s2):
    differ = difflib.Differ()
    diffs = list(differ.compare(s1, s2))

    i = 0
    sdiffs = []
    length = len(diffs)
    sequence = 0
    while i < length:
        line = diffs[i][2:]
        if diffs[i].startswith('  '):
            sequence +=1
            sdiffs.append((sequence,'u', line))

        elif diffs[i].startswith('+ '):
            sequence +=1
            sdiffs.append((sequence,'+', line))

        elif diffs[i].startswith('- '):
            sequence +=1
            sdiffs.append((sequence, '-', line))
            if i+1 < length and diffs[i+1].startswith('? '):
                # cases 2 and 3: '- ', '? ', '+ ' and, for case 3, a trailing '? '
                if i+2 < length and diffs[i+2].startswith('+ '):
                    sequence += 1
                    sdiffs.append((sequence, '+', diffs[i+2][2:]))
                    # skip the trailing '? ' line too when it is present (case 3)
                    i += 3 if i+3 < length and diffs[i+3].startswith('? ') else 2
            elif i+2 < length and diffs[i+1].startswith('+ ') and diffs[i+2].startswith('? '): # case 1
                sequence += 1
                sdiffs.append((sequence, '+', diffs[i+1][2:]))
                i += 2
            elif i+1 < length and diffs[i+1].startswith('+ '): # case 4: deleted line followed by an inserted line
                sequence += 1
                sdiffs.append((sequence, '+', diffs[i+1][2:]))
                i += 1
            # otherwise the line was only deleted and there is nothing more to emit
        i += 1
    return sdiffs
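A quick way to sanity-check output in this shape is to rebuild both inputs from it: the 'u' and '-' entries should give back the first sequence, and the 'u' and '+' entries the second. The helper below is a hypothetical sketch that assumes (sequence, code, line) triples like the ones returned above. The sample data is written by hand.

```python
def rebuild(sdiffs):
    """Rebuild both input sequences from (sequence, code, line) triples."""
    first = [line for _, code, line in sdiffs if code in ('u', '-')]
    second = [line for _, code, line in sdiffs if code in ('u', '+')]
    return first, second

# Hand-written sample in the same shape as the function's return value
sample = [(1, 'u', 'same'), (2, '-', 'The good bad'), (3, '+', 'The good the bad')]
print(rebuild(sample))
```

If `rebuild(sdiffer(s1, s2))` does not equal `(s1, s2)`, some line was dropped or reported under the wrong code.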