19

I need to find the difference between two strings.

const string1 = 'lebronjames';
const string2 = 'lebronnjames';

The expected output is to find the extra n and log it to the console.

Is there any way to do this in JavaScript?

Penny Liu
Elvis S.
  • 3
    there is also b ? – Aziz.G Jul 18 '19 at 20:55
  • 1
    Can you clarify the output you expect? Are you just trying to find the first different character? Or, do you need to find all different characters? And if it's all, what sort of threshold are you using for characters after the first? – Brad Jul 18 '19 at 20:56
  • Should it work in both directions and show missing chars? – Philipp Jul 18 '19 at 20:56
  • is m = b? or is it a typo error? – Vrian7 Jul 18 '19 at 20:57
  • 1
    You should consider computing the edit distance: https://en.wikipedia.org/wiki/Edit_distance – slider Jul 18 '19 at 20:59
  • 2
    This is a far more complex operation than you may realize. How would your algorithm know that any substrings after the first n should be compared? From a strictly letter:letter standpoint, the entire second half of the string is different. – isherwood Jul 18 '19 at 20:59
  • Sorry, fixed the typo in string2. Brad, the output is simple: find all the different characters when comparing string1 with string2. – Elvis S. Jul 18 '19 at 21:00
  • If you search in your browser for "string difference algorithm", you'll find references that can explain this much better than we can manage here. – Prune Jul 18 '19 at 21:21
  • Thank you for the reference you shared with me, I'll take a look. – Elvis S. Jul 18 '19 at 21:24
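
As a rough illustration of the edit distance suggested in the comments above, here is a minimal sketch of the Levenshtein distance. The function name editDistance is my own choice, not something from the thread. Note that it only counts how many single-character insertions, deletions, or substitutions separate the two strings; it does not say which characters differ.

```javascript
// Levenshtein distance using two rolling rows of the dynamic-programming table.
function editDistance(a, b) {
  // prev[j] = distance between the first (i - 1) chars of a and the first j chars of b
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // turning i chars of a into an empty string takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // delete a[i - 1]
        curr[j - 1] + 1,   // insert b[j - 1]
        prev[j - 1] + cost // keep or substitute
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(editDistance('lebronjames', 'lebronnjames')); // 1
```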

7 Answers

13

Another option, for more sophisticated difference checking, is the PatienceDiff algorithm. I ported this algorithm to JavaScript at...

https://github.com/jonTrent/PatienceDiff

...and although the algorithm is typically used for line-by-line comparison of text (such as computer programs), it can also be used to compare character by character. E.g., to compare two strings, you can do the following...

let a = "thelebronnjamist";
let b = "the lebron james";

let difference = patienceDiff( a.split(""), b.split("") );

...with difference.lines being set to an array with the results of the comparison...

difference.lines: Array(19)

0: {line: "t", aIndex: 0, bIndex: 0}
1: {line: "h", aIndex: 1, bIndex: 1}
2: {line: "e", aIndex: 2, bIndex: 2}
3: {line: " ", aIndex: -1, bIndex: 3}
4: {line: "l", aIndex: 3, bIndex: 4}
5: {line: "e", aIndex: 4, bIndex: 5}
6: {line: "b", aIndex: 5, bIndex: 6}
7: {line: "r", aIndex: 6, bIndex: 7}
8: {line: "o", aIndex: 7, bIndex: 8}
9: {line: "n", aIndex: 8, bIndex: 9}
10: {line: "n", aIndex: 9, bIndex: -1}
11: {line: " ", aIndex: -1, bIndex: 10}
12: {line: "j", aIndex: 10, bIndex: 11}
13: {line: "a", aIndex: 11, bIndex: 12}
14: {line: "m", aIndex: 12, bIndex: 13}
15: {line: "i", aIndex: 13, bIndex: -1}
16: {line: "e", aIndex: -1, bIndex: 14}
17: {line: "s", aIndex: 14, bIndex: 15}
18: {line: "t", aIndex: 15, bIndex: -1}

Any entry where aIndex === -1 or bIndex === -1 indicates a difference between the two strings. Specifically...

  • Element 3 indicates that character " " was found only in b, at position 3.
  • Element 10 indicates that character "n" was found only in a, at position 9.
  • Element 11 indicates that character " " was found only in b, at position 10.
  • Element 15 indicates that character "i" was found only in a, at position 13.
  • Element 16 indicates that character "e" was found only in b, at position 14.
  • Element 18 indicates that character "t" was found only in a, at position 15.
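
To pull those entries out programmatically, something like the following sketch may help. It assumes only the { line, aIndex, bIndex } entry shape shown in the output above; summarizeDiff is a hypothetical helper name, not part of the PatienceDiff repo, and the sample array is a hand-copied subset of that output so the snippet runs on its own.

```javascript
// A few entries copied by hand from the difference.lines output above.
const sampleLines = [
  { line: "e", aIndex: 2, bIndex: 2 },
  { line: " ", aIndex: -1, bIndex: 3 },
  { line: "n", aIndex: 9, bIndex: -1 },
  { line: "i", aIndex: 13, bIndex: -1 },
  { line: "e", aIndex: -1, bIndex: 14 },
];

// Hypothetical helper: separate characters found only in a from those found only in b.
function summarizeDiff(lines) {
  return {
    onlyInA: lines
      .filter((entry) => entry.bIndex === -1)
      .map((entry) => ({ char: entry.line, index: entry.aIndex })),
    onlyInB: lines
      .filter((entry) => entry.aIndex === -1)
      .map((entry) => ({ char: entry.line, index: entry.bIndex })),
  };
}

const summary = summarizeDiff(sampleLines);
console.log(summary.onlyInA); // the "n" at 9 and the "i" at 13
console.log(summary.onlyInB); // the " " at 3 and the "e" at 14
```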

Note that the PatienceDiff algorithm is useful for comparing two similar blocks of text or strings. It will not tell you whether higher-level edits, such as moved text, have occurred. E.g., the following...

let a = "james lebron";
let b = "lebron james";

let difference = patienceDiff( a.split(""), b.split("") );

...returns difference.lines containing...

difference.lines: Array(18)

0: {line: "j", aIndex: 0, bIndex: -1}
1: {line: "a", aIndex: 1, bIndex: -1}
2: {line: "m", aIndex: 2, bIndex: -1}
3: {line: "e", aIndex: 3, bIndex: -1}
4: {line: "s", aIndex: 4, bIndex: -1}
5: {line: " ", aIndex: 5, bIndex: -1}
6: {line: "l", aIndex: 6, bIndex: 0}
7: {line: "e", aIndex: 7, bIndex: 1}
8: {line: "b", aIndex: 8, bIndex: 2}
9: {line: "r", aIndex: 9, bIndex: 3}
10: {line: "o", aIndex: 10, bIndex: 4}
11: {line: "n", aIndex: 11, bIndex: 5}
12: {line: " ", aIndex: -1, bIndex: 6}
13: {line: "j", aIndex: -1, bIndex: 7}
14: {line: "a", aIndex: -1, bIndex: 8}
15: {line: "m", aIndex: -1, bIndex: 9}
16: {line: "e", aIndex: -1, bIndex: 10}
17: {line: "s", aIndex: -1, bIndex: 11}

Notice that PatienceDiff does not report the swap of the first and last names. Instead, it shows which characters were removed from a and which characters were added to b to end up with b.

EDIT: Added new algorithm dubbed patienceDiffPlus.

After mulling over the last example provided above that showed a limitation of the PatienceDiff in identifying lines that likely moved, it dawned on me that there was an elegant way of using the PatienceDiff algorithm to determine if any lines had indeed likely moved rather than just showing deletions and additions.

In short, I added the patienceDiffPlus algorithm (to the GitHub repo identified above) to the bottom of the PatienceDiff.js file. The patienceDiffPlus algorithm takes the deleted aLines[] and added bLines[] from the initial patienceDiff algorithm, and runs them through the patienceDiff algorithm again. I.e., patienceDiffPlus seeks the Longest Common Subsequence of lines that likely moved, and records this in the original patienceDiff results. The patienceDiffPlus algorithm repeats this until no more moved lines are found.

Now, using patienceDiffPlus, the following comparison...

let a = "james lebron";
let b = "lebron james";

let difference = patienceDiffPlus( a.split(""), b.split("") );

...returns difference.lines containing...

difference.lines: Array(18)

0: {line: "j", aIndex: 0, bIndex: -1, moved: true}
1: {line: "a", aIndex: 1, bIndex: -1, moved: true}
2: {line: "m", aIndex: 2, bIndex: -1, moved: true}
3: {line: "e", aIndex: 3, bIndex: -1, moved: true}
4: {line: "s", aIndex: 4, bIndex: -1, moved: true}
5: {line: " ", aIndex: 5, bIndex: -1, moved: true}
6: {line: "l", aIndex: 6, bIndex: 0}
7: {line: "e", aIndex: 7, bIndex: 1}
8: {line: "b", aIndex: 8, bIndex: 2}
9: {line: "r", aIndex: 9, bIndex: 3}
10: {line: "o", aIndex: 10, bIndex: 4}
11: {line: "n", aIndex: 11, bIndex: 5}
12: {line: " ", aIndex: 5, bIndex: 6, moved: true}
13: {line: "j", aIndex: 0, bIndex: 7, moved: true}
14: {line: "a", aIndex: 1, bIndex: 8, moved: true}
15: {line: "m", aIndex: 2, bIndex: 9, moved: true}
16: {line: "e", aIndex: 3, bIndex: 10, moved: true}
17: {line: "s", aIndex: 4, bIndex: 11, moved: true}

Notice the addition of the moved attribute, which identifies whether a line (or character in this case) was likely moved. Again, patienceDiffPlus simply matches the deleted aLines[] and added bLines[], so there is no guarantee that the lines were actually moved, but there is a strong likelihood that they were indeed moved.
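
As a small sketch of how the moved attribute might be consumed, the snippet below lists the characters that were matched to a new position. It relies only on the entry shape shown in the output above; listMoves is a hypothetical helper name, and the sample array is a hand-copied subset of that output.

```javascript
// A few entries copied by hand from the patienceDiffPlus output above.
const movedSample = [
  { line: "j", aIndex: 0, bIndex: -1, moved: true },
  { line: "l", aIndex: 6, bIndex: 0 },
  { line: " ", aIndex: 5, bIndex: 6, moved: true },
  { line: "j", aIndex: 0, bIndex: 7, moved: true },
];

// Hypothetical helper: keep the entries flagged as moved that carry both positions,
// i.e. the entries recording where a character from a likely ended up in b.
function listMoves(lines) {
  return lines
    .filter((entry) => entry.moved && entry.aIndex !== -1 && entry.bIndex !== -1)
    .map((entry) => ({ char: entry.line, from: entry.aIndex, to: entry.bIndex }));
}

const moves = listMoves(movedSample);
console.log(moves);
```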

Trentium
  • Awesome, thank you for sharing this with us, this really solves an issue! – Elvis S. Jul 19 '19 at 11:07
  • 1
    It dawned on me that there was an elegant way of employing the `patienceDiff` algorithm to identify the *likely* lines/characters that moved, rather than simply identifying deletions and additions. I have appended an edit to my original answer with the new `patienceDiffPlus` algorithm, and updated my GitHub repo. – Trentium Jul 23 '19 at 01:55
  • Use `.split(' ')` with a space for marking words rather than letters. We found this useful when highlighting the differences between long multi-paragraph blobs of text. Thanks for this solution! – Jan Werkhoven Mar 13 '23 at 09:22
10

This will return the first difference between two strings.

For example, for lebronjames and lebronnjames the result is n.

const string1 = 'lebronjames';
const string2 = 'lebronnjabes';


const findFirstDiff = (str1, str2) =>
  str2[[...str1].findIndex((el, index) => el !== str2[index])];


// equivalent of 

const findFirstDiff2 = function(str1, str2) {
  return str2[[...str1].findIndex(function(el, index) {
    return el !== str2[index]
  })];
}



console.log(findFirstDiff2(string1, string2));
console.log(findFirstDiff(string1, string2));
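
One caveat with the findIndex approach: if every character of the first string matches and the second string simply has extra characters at the end, findIndex returns -1 and the lookup yields undefined. A minimal sketch that also covers that case, assuming you want the position as well as the characters (findFirstDiffAt is a hypothetical name):

```javascript
// Walk both strings up to the longer length and report the first mismatch.
const findFirstDiffAt = (str1, str2) => {
  const longest = Math.max(str1.length, str2.length);
  for (let i = 0; i < longest; i++) {
    // Past the end of a string, indexing gives undefined, which counts as a mismatch.
    if (str1[i] !== str2[i]) {
      return { index: i, char1: str1[i], char2: str2[i] };
    }
  }
  return null; // the strings are identical
};

console.log(findFirstDiffAt('lebronjames', 'lebronnjames')); // index 6: 'j' vs 'n'
console.log(findFirstDiffAt('abc', 'abcd'));                 // index 3: undefined vs 'd'
```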
Ryan Shillington
Aziz.G
  • 2
    While it does not find all the differences it finds the first one, I was able to get to that part with much more code :D Can you explain your code to us (for the newbies)? – Elvis S. Jul 18 '19 at 21:07
  • no you are not, I just used `findIndex` to find the first difference between two words then i took the index string[index], with `esc6` syntax it looks shorter – Aziz.G Jul 18 '19 at 21:11
  • 1
    Thank you for your time! I was looping through each character in string1 and string2 and comparing them. That was working and I was hoping I could find the whole sequence of different chars, but I found a better way now & learned a little bit about the spread operator. Thank you! – Elvis S. Jul 18 '19 at 21:20
  • This doesn't address the edge case where the change is a new character at the end of the first string. – hash Jan 13 '23 at 12:22
5

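    // Collects the characters of b that do not line up, in order, with a.
    function getDifference(a, b)
    {
        var i = 0; // current position in a
        var j = 0; // current position in b
        var result = "";

        while (j < b.length)
        {
            // a is exhausted or the characters differ: treat b[j] as extra.
            if (i == a.length || a[i] != b[j])
                result += b[j];
            else
                i++;
            j++;
        }
        return result;
    }
    console.log(getDifference("lebronjames", "lebronnjames"));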
Sebastian Waldbauer
Nero
4

Those who want to return the first difference between two strings can adapt it like this. Both versions assume the second string is the first string plus exactly one extra character:

Sort & Find

const getDifference = (s, t) => {
  s = [...s].sort();
  t = [...t].sort();
  return t.find((char, i) => char !== s[i]);
};

console.log(getDifference('lebronjames', 'lebronnjames'));
console.log(getDifference('abc', 'abcd'));

Add CharCodes

const getDifference = (s, t) => {
  let sum = t.charCodeAt(t.length - 1);
  for (let j = 0; j < s.length; j++) {
    sum -= s.charCodeAt(j);
    sum += t.charCodeAt(j);
  }
  return String.fromCharCode(sum);
};

console.log(getDifference('lebronjames', 'lebronnjames'));
console.log(getDifference('abc', 'abcd'));
Penny Liu
1
function findDifference(s, t) {

  if(s === '') return t;

  
  
  // this is useless and can be omitted.
  for(let i = 0; i < t.length; i++) {
    if(!s.split('').includes(t[i])) {
      return t[i];
    }
  }
  // this is useless and can be omitted.


  
  // (if the additional letter exists)
  // cache them, count values, different values of the same letter would give the answer.

  const obj_S = {};
  const obj_T = {};

  for(let i = 0; i < s.length; i++) {
    if(!obj_S[s[i]]) {
      obj_S[s[i]] = 1;
    }else {
      obj_S[s[i]]++;
    }
  }
  
  for(let i = 0; i < t.length; i++) {
    if(!obj_T[t[i]]) {
      obj_T[t[i]] = 1;
    }else {
      obj_T[t[i]]++;
    }
  }

  for(const key in obj_T) {
    if(obj_T[key] !== obj_S[key]) {
      return key
    }
  }

}

// if more than 1 letter -> store the values and then return them; the logic stays the same.

console.log(findDifference('john', 'johny')) // --> y
console.log(findDifference('bbcc', 'bbbcc')) //--> b

Actually, the first part (the first for loop) can be omitted. My solution to the edge case solves the whole problem: if a letter does not exist in s, its count there is undefined, so count !== undefined is true and the letter is returned.
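
For the "more than 1 letter" case mentioned in the code comment above, a minimal sketch of the same counting idea that collects every surplus character instead of returning the first one. The name findExtraChars and the use of a Map are my own choices here, not part of the answer's code.

```javascript
// Count the characters of s, then walk t and collect whatever s cannot account for.
function findExtraChars(s, t) {
  const counts = new Map();
  for (const ch of s) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }

  const extras = [];
  for (const ch of t) {
    const remaining = counts.get(ch) || 0;
    if (remaining > 0) {
      counts.set(ch, remaining - 1); // this occurrence is matched by one in s
    } else {
      extras.push(ch); // no unmatched occurrence left in s, so it is extra
    }
  }
  return extras;
}

console.log(findExtraChars('john', 'johnny'));  // [ 'n', 'y' ]
console.log(findExtraChars('bbcc', 'bbbcc'));   // [ 'b' ]
```

Like the original, this ignores position: it treats the strings as bags of characters, so it reports which characters are surplus in t but not where they occur.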

emre-ozgun
1

var findTheDifference = function(s, t) {
  let res = [...s].sort();
  let res1 = [...t].sort();
  let j = 0;
  while (j < res1.length) {
    if (res[j] != res1[j]) {
      return res1[j];
    }
    j++;
  }
};

console.log(findTheDifference("a", "aa"))
Penny Liu
0

Here is my approach to obtaining the differences between two strings. I hope this function helps somebody:

function getDifferences(a, b){
  
  let result = {
    state : true,
    diffs : []
  }

  if(a===b) return result;
  
  result.state = false;

  for (let index = 0; index < Math.max(a.length,b.length); index++) {
    if (a[index] !== b[index]) {
        result.diffs.push({index: index, old: a[index], new: b[index]})
    }
  }
  
  return result;
}
  • 1
    Wouldn't this only compare two strings of the exact same length? If `b` is longer than `a`, wouldn't all the differences after the end of `a` stop being pushed into the `result.diffs` array? – Brandon Tom Apr 22 '23 at 14:04
  • @BrandonTom you're right. My main idea was to find differences in a string individually, ignoring what is left over, but your reflection makes sense and the function can be misleading. Editing to find everything. – Antonio Jurado Apr 23 '23 at 11:29