You might have better luck with a better example. If you do this:
Diff::LCS.diff('ab cd', 'a- c_')
Then the output looks like this (with the noise removed):
[
[
<@action="-", @position=1, @element="b">,
<@action="+", @position=1, @element="-">
], [
<@action="-", @position=4, @element="d">,
<@action="+", @position=4, @element="_">
]
]
If we look at Diff::LCS.diff('ab cd ef', 'a- c_ e+'), then we'd get three inner arrays instead of two.
What possible reason could there be for this? There are three operations in a diff:
- Add a string.
- Remove a string.
- Change a string.
A change is really just a combination of removes and adds, so we're left with just remove and add as the fundamental operations; these line up with the @action values quite nicely. However, when humans look at diffs, we want to see a change as a distinct operation: we want to see that b has become -. The "remove b, add -" version is an implementation detail.
If all we had was this:
[
<@action="-", @position=1, @element="b">,
<@action="+", @position=1, @element="-">,
<@action="-", @position=4, @element="d">,
<@action="+", @position=4, @element="_">
]
then you'd have to figure out which +/- pairs were really changes and which were separate additions and removals.
So the inner arrays map the two fundamental operations (add, remove) to the three operations (add, remove, change) that humans want to see.
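To make that mapping concrete, here is a rough sketch that labels each inner array as an add, a remove, or a change. It does not call the gem: Change below is a plain Struct standing in for Diff::LCS::Change, with only the three fields visible in the output above, and classify_hunk is a made-up helper name.

```ruby
# Stand-in for Diff::LCS::Change (an assumption for illustration): it only
# carries the three fields shown in the output above.
Change = Struct.new(:action, :position, :element)

# Label one inner array: only '+' entries means an add, only '-' entries
# means a remove, and a mix of both is treated as a change.
def classify_hunk(hunk)
  actions = hunk.map(&:action).uniq
  if actions == ['+']
    :add
  elsif actions == ['-']
    :remove
  else
    :change
  end
end

# The same structure as the 'ab cd' vs 'a- c_' output, rebuilt by hand.
hunks = [
  [Change.new('-', 1, 'b'), Change.new('+', 1, '-')],
  [Change.new('-', 4, 'd'), Change.new('+', 4, '_')]
]

labels = hunks.map { |hunk| classify_hunk(hunk) }
p labels
```

With the flat list you couldn't write classify_hunk at all without first guessing where one edit ends and the next begins.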
You might want to examine the structure of the outputs from these as well:
Diff::LCS.diff('ab cd', 'a- x c_')
Diff::LCS.diff('ab', 'abx')
Diff::LCS.diff('ab', 'xbx')
I think an explicit change @action for Diff::LCS::Change would be better, but at least the inner arrays let you group the individual additions and removals into higher-level edits.
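As a sketch of that grouping, something like the following could collapse one inner array into a single "from/to" record. Again this is only an illustration: ChangeStub is a hypothetical stand-in for Diff::LCS::Change, summarize_hunk is an invented name, and it assumes the elements are single-character strings as in the examples here.

```ruby
# Hypothetical stand-in for Diff::LCS::Change with the same three fields.
ChangeStub = Struct.new(:action, :position, :element)

# Collapse one inner array into a single higher-level edit.
# An empty :from means a pure addition; an empty :to means a pure removal.
def summarize_hunk(hunk)
  removed = hunk.select { |c| c.action == '-' }.map(&:element).join
  added   = hunk.select { |c| c.action == '+' }.map(&:element).join
  { position: hunk.first.position, from: removed, to: added }
end

# First inner array from the 'ab cd' vs 'a- c_' example.
hunk = [ChangeStub.new('-', 1, 'b'), ChangeStub.new('+', 1, '-')]
summary = summarize_hunk(hunk)
p summary
```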