I have a vCard file with records for thousands of contacts. The file has been corrupted: duplicate copies of the personal phone, work phone, and other entries have been added to each contact.
How can I remove the duplicates?
BEGIN:VCARD
VERSION:3.0
N:Doe;John;Q.,Public
FN;CHARSET=UTF-8:John Doe
TEL;TYPE=WORK,VOICE:(111) 555-1212
TEL;TYPE=WORK,VOICE:(111) 555-1212
TEL;TYPE=WORK,VOICE:(111) 555-1212
TEL;TYPE=WORK,VOICE:(111) 555-1212
TEL;TYPE=HOME,VOICE:(404) 555-1212
TEL;TYPE=HOME,VOICE:(404) 555-1212
TEL;TYPE=HOME,VOICE:(404) 555-1212
TEL;TYPE=HOME,TYPE=VOICE:(404) 555-1213
TEL;TYPE=HOME,TYPE=VOICE:(404) 555-1213
TEL;TYPE=HOME,VOICE:(404) 555-1212
TEL;TYPE=HOME,VOICE:(404) 555-1212
TEL;TYPE=HOME,VOICE:(404) 555-1212
TEL;TYPE=HOME,TYPE=VOICE:(404) 555-1213
TEL;TYPE=HOME,TYPE=VOICE:(404) 555-1213
TEL;TYPE=HOME,TYPE=VOICE:(404) 555-1213
TEL;TYPE=HOME,TYPE=VOICE:(404) 555-1213
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=INTERNET:example@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=INTERNET:example@example.com
EMAIL;TYPE=INTERNET:example@example.com
EMAIL;TYPE=INTERNET:example@example.com
EMAIL;TYPE=INTERNET:example@example.com
EMAIL;TYPE=INTERNET:example@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
ADR;TYPE=HOME:;;42 Plantation St.;Baytown;LA;30314;United States of America
URL:https://www.google.com/
PHOTO;VALUE=URL;TYPE=PNG:http://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Example_svg.svg/200px-Example_svg.svg.png
AGENT:BEGIN:VCARD
VERSION:3.0
N:Doe;John;Q.,Public
FN:John Doe
TEL;TYPE=WORK,VOICE:(111) 555-1212
TEL;TYPE=HOME,VOICE:(404) 555-1212
TEL;TYPE=HOME,TYPE=VOICE:(404) 555-1213
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=INTERNET:example@example.com
PHOTO;VALUE=URL;TYPE=PNG:http://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Example_svg.svg/200px-Example_svg.svg.png
END:VCARD
END:VCARD
I tried the following solution from Stack Overflow, but it does not solve the problem: it only drops a line when it is identical to the line immediately before it, and not all of my duplicates are consecutive.
perl -ne 'print unless (defined($prev) && ($_ eq $prev)); $prev=$_'
resulting in:
...
TEL;TYPE=WORK,VOICE:(111) 555-1212
TEL;TYPE=HOME,TYPE=VOICE:(404) 555-1213
TEL;TYPE=WORK,VOICE:(111) 555-1212
TEL;TYPE=HOME,TYPE=VOICE:(404) 555-1213
TEL;TYPE=WORK,VOICE:(111) 555-1212
TEL;TYPE=HOME,TYPE=VOICE:(404) 555-1213
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=INTERNET:example@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=INTERNET:example@example.com
EMAIL;TYPE=PREF,INTERNET:forrestgump@example.com
EMAIL;TYPE=INTERNET:example@example.com
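To make the desired behaviour concrete, here is a minimal sketch in Python rather than Perl. It remembers every line already seen inside the current card, so a repeat is dropped even when it is not adjacent to the first copy. The function name `dedupe_vcard_lines` and both regular expressions are made up for illustration. The sketch assumes that duplicates are exact line-for-line copies, that no property is folded across several lines, and that a nested card always opens with a line ending in `BEGIN:VCARD` (as in `AGENT:BEGIN:VCARD` above).

```python
import re

# Assumption: a card opens with "BEGIN:VCARD", possibly prefixed by a property
# name as in "AGENT:BEGIN:VCARD", and closes with "END:VCARD" on its own line.
BEGIN = re.compile(r"(^|:)BEGIN:VCARD\s*$", re.IGNORECASE)
END = re.compile(r"^END:VCARD\s*$", re.IGNORECASE)


def dedupe_vcard_lines(lines):
    """Yield lines, dropping exact repeats that occur within the same card."""
    stack = []  # one set of already-seen lines per currently open card
    for line in lines:
        key = line.rstrip("\r\n")  # compare without the line ending
        if BEGIN.search(key):
            stack.append(set())    # new (possibly nested) card: fresh memory
            yield line
        elif END.match(key):
            if stack:
                stack.pop()        # back to the enclosing card, if any
            yield line
        elif stack:
            seen = stack[-1]
            if key not in seen:    # first occurrence in this card: keep it
                seen.add(key)
                yield line
        else:
            yield line             # text outside any card is passed through
```

Applied to the sample card above, this would keep one copy of each TEL and EMAIL line in the outer card and leave the nested AGENT card as it is, because each card has its own set of seen lines and both END:VCARD lines are always kept. Lines that differ only in how the parameters are written (for example `TYPE=HOME,VOICE` versus `TYPE=HOME,TYPE=VOICE`) would still count as different.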