0

I have an Excel sheet with two columns (A & B) each cell containing text, I would like to be able create a third column (C) with the characters that are different between the two.

There are 50,000 rows to go through.

An Example:

Example table in Excel

Another Example:

Column A2: ABCDEFG
Column B2: AB"CDEFGED
Column C2: "ED (The difference between the 2)

Is this possible? Working on Excel Mac 2019

E_net4
  • 27,810
  • 13
  • 101
  • 139
PaulW.
  • 13
  • 3
  • I removed it since it actually didn't work =(. It was missing a character in `C3` for the exact reason I mentioned in my post. Or does your actual data not reflect your sample data? Also there is some flaw in your sample data. Is it true that you are not interested in the single quote in `A3`? – JvdV May 27 '20 at 18:23
  • It represents the data and seems to work perfectly with what I have checked so far. The two columns contain names and there are disparities between two datasets mostly due bad data entry. 90% the differences are between LTD and Limited, though there are , " ' thrown in everywhere. These are fine, it is when one name in one column is completely different to the one the next, that i have problems. This picks this out perfectly, so again thank you. After this 50,000 another 200,000 lines to go : ( – PaulW. May 27 '20 at 18:34
  • I'll undelete the answer, for you to pick up the mistake at cell `C3`... I'll see meanwhile if I can come up with something else. – JvdV May 27 '20 at 18:35
  • Is VBA an option for you, using an UDF to solve this issue? Much easier. – JvdV May 28 '20 at 07:43

1 Answers1

1

Would be interested to see if anyone can come up with a formula doing this. I, isntead, went a different route using an UDF:

Function Difference(str1 As String, str2 As String) As String

Dim x As Long
Dim tmp As String

'Grab chars from str2
tmp = str1
For x = 1 To Len(str2)
    tmp = Replace(tmp, Mid(str2, x, 1), "", Count:=1)
Next
Difference = tmp

'Grab chars from str1
tmp = str2
For x = 1 To Len(str1)
    tmp = Replace(tmp, Mid(str1, x, 1), "", Count:=1)
Next
Difference = Difference & tmp

End Function

Call from C2 through: =Difference(A2,B2)

enter image description here

JvdV
  • 70,606
  • 8
  • 39
  • 70
  • @ScottCraner, I'm aware, as you can read in above comments I had this deleted, yet OP says it works. Undeleted to show him it actually is faulty – JvdV May 27 '20 at 18:38
  • 1
    @PaulW, please, check the data! This is NOT correct, I merely undeleted for you to pick up the mistake in `C3`. – JvdV May 27 '20 at 18:44
  • Sorry had to eat something quickly been looking at this mess of data all day. I can see the error, if the letter exists within the first cell it does not count the duplicate in the second. Is there a way around this? Thank you all again for your help – PaulW. May 27 '20 at 19:31
  • Maybe, but I can't think of any ATM. I'll delete this now – JvdV May 27 '20 at 19:33
  • @PaulW., I went a different route, you can add this UDF and it should cover your needs. – JvdV May 28 '20 at 07:53
  • @ScottCraner, I believe I hit the limits of my capabilities trying to solve that through a "simple" formula. Would be interesting to see if it's possible. Instead I went the chicken way out using UDF. – JvdV May 28 '20 at 07:54
  • I tried for a while and gave up. I do not believe a formula can do it. – Scott Craner May 28 '20 at 13:19