What's the best way to remove duplicates from a string in PHP (or any language)?

Question

I am looking for the best known algorithm for removing duplicates from a string. I can think of numerous ways of doing this, but I am looking for a solution that is known for being particularly efficient.

Let's say you have the following strings:

Lorem Ipsum Lorem Ipsum
Lorem Lorem Lorem
Lorem Ipsum Dolor Lorem Ipsum Dolor Lorem Ipsum Dolor

I would expect this algorithm to output for each (respectively):

Lorem Ipsum
Lorem
Lorem Ipsum Dolor

Note: I am doing this in PHP, in case anybody is aware of any built-in PHP functions that can help with this.

Thanks!

What if you had "Lorem Ipsum Ipsum Dolor Ipsum"? Would you want the output to be "Lorem Ipsum Dolor"? – Anthony Mar 16 '11 at 20:01
Nope, not removing duplicate words, just duplicate patterns of words – chaimp Mar 16 '11 at 20:35
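
For the "duplicate patterns of words" reading in the comment above, one possible approach is to look for the shortest leading run of words that, repeated, reproduces the whole string. This is only a sketch under assumptions: words are separated by single spaces, the whole string must be an exact repetition, and the function name collapseRepeatedPattern is made up for illustration.

```php
<?php
// Hypothetical helper (not from the thread): returns the shortest leading run
// of words that tiles the whole string, or the input unchanged if none does.
function collapseRepeatedPattern(string $text): string
{
    $words = explode(' ', $text);
    $n = count($words);

    for ($len = 1; $len <= intdiv($n, 2); $len++) {
        if ($n % $len !== 0) {
            continue; // the pattern must tile the word list exactly
        }
        $pattern = array_slice($words, 0, $len);
        // Build $n / $len copies of the candidate pattern and compare.
        $tiled = array_merge(...array_fill(0, intdiv($n, $len), $pattern));
        if ($tiled === $words) {
            return implode(' ', $pattern);
        }
    }

    return $text; // no repeating pattern found
}

echo collapseRepeatedPattern("Lorem Ipsum Lorem Ipsum"), "\n";
echo collapseRepeatedPattern("Lorem Ipsum Ipsum Dolor Ipsum"), "\n";
```

With this reading, "Lorem Ipsum Ipsum Dolor Ipsum" is left as it is, because no run of words repeats to form the whole string.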

score 6 · Accepted Answer · answered Mar 16 '11 at 20:04

$arr = explode( " " , $string );
$arr = array_unique( $arr );
$string = implode(" " , $arr);

answered Mar 16 '11 at 20:04

AbiusX

Thanks, the answer is actually very elegant. – chaimp Mar 18 '11 at 04:10
It would be `implode(" ", array_unique(explode(" ", $string)))` in a single line ;) – Shashank Shah Feb 05 '21 at 09:50

score 2 · Answer 2 · answered Mar 16 '11 at 20:03

Dunno about efficiency, but maybe this can do:

$str = implode(" ", array_unique(explode(" ", $str)));

answered Mar 16 '11 at 20:03

Mārtiņš Briedis

score 2 · Answer 3 · answered Mar 16 '11 at 20:04

$words = array_unique(explode(' ',$text));
echo implode(' ',$words);

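If you want to make it more robust, you can use preg_split to split on whitespace and non-word characters (for example with a pattern like /[\s\W]+/) instead of exploding on a single space.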
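
A rough sketch of the preg_split variant mentioned above, assuming the intent is to split on any run of whitespace or punctuation. The function name uniqueWords is hypothetical, and note that punctuation is dropped from the result and that, without the u modifier, non-ASCII letters are treated as separators.

```php
<?php
// Hypothetical helper (not from the answer): split on runs of whitespace and
// non-word characters, then keep only the first occurrence of each word.
function uniqueWords(string $text): string
{
    // \W already includes whitespace; \s is kept to mirror the answer's wording.
    // PREG_SPLIT_NO_EMPTY drops empty pieces from leading/trailing separators.
    $words = preg_split('/[\s\W]+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // array_unique keeps the first occurrence of each value, in original order.
    return implode(' ', array_unique($words));
}

echo uniqueWords("Lorem, Ipsum;  Lorem Ipsum"), "\n";
```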

answered Mar 16 '11 at 20:04

dynamic

score 1 · Answer 4 · answered Mar 16 '11 at 20:01

One good way of doing it:

Sort the words in the string.
Remove duplicates by iterating over the sorted words; equal words are now adjacent.

Another possibility is to use a set data structure, if your language supports one.
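
The set idea can be sketched in PHP by using array keys as the set of words already seen. This is a minimal illustration, assuming words are separated by spaces; the function name dedupeWords is made up. Unlike the sorting approach, it keeps the words in their original order.

```php
<?php
// Hypothetical helper (not from the answer): array keys act as a set.
function dedupeWords(string $text): string
{
    $seen = [];    // words already emitted, stored as keys for fast lookup
    $result = [];  // unique words in order of first appearance

    foreach (explode(' ', $text) as $word) {
        // Skip empty pieces (from repeated spaces) and words already kept.
        if ($word === '' || isset($seen[$word])) {
            continue;
        }
        $seen[$word] = true;
        $result[] = $word;
    }

    return implode(' ', $result);
}

echo dedupeWords("Lorem Ipsum Dolor Lorem Ipsum Dolor Lorem Ipsum Dolor"), "\n";
```

Each word is looked up once, so no sort is needed.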

answered Mar 16 '11 at 20:01

Pablo Santa Cruz

This is a good answer, but requires the extra step of putting the string back into its original order. – chaimp Mar 16 '11 at 20:36

score 0 · Answer 5 · answered Nov 09 '16 at 10:13

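You can try the code below to remove duplicate words from any sentence:

// Run the regex on the whole string: applied per word (after explode), the
// lookahead can never see a later duplicate. Each word's last occurrence is kept.
$string = preg_replace('/\b(\w{2,})\b(?=.*?\b\1\b)\W*/', '', $string);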

edited Nov 09 '16 at 10:18

answered Nov 09 '16 at 10:13

PCMShaper
