How to remove repeated sequence of characters in a string?

Question

Imagine if:

$string = "abcdabcdabcdabcdabcdabcdabcdabcd";

How do I remove the repeated sequence of characters (all characters, not just letters) in the string so that the new string contains only "abcd"? Perhaps by running a function that returns a new string with the repetitions removed.

$new_string = remove_repetitions($string);

The input string is always of the form shown above. I don’t know how else to explain since English is not my first language. Other examples are:

$string = "EqhabEqhabEqhabEqhabEqhab";
$string = "o=98guo=98guo=98gu";

Note that I want it to work with other sequences of characters as well. I tried using regex but couldn't figure out a way to accomplish it. I am still new to PHP and regex.

You need to provide more cases to clarify the context. What should the output be for `bcdaabcdjdfgabcd`? — nice_dev, Jun 02 '19 at 14:10
This is definitely possible, but it takes a multi-stage regex, meaning you need some method of determining whether a string is repetitive. If it doesn't repeat, you do nothing; if it does, you need to capture the substring. Once you have the substring, you can divide the whole string into capture groups and delete all but the first. Another question is how many characters you will consider a substring: 4 or 20? How long a string you count as a repeat will determine how difficult this will be. This is definitely not a beginner task. — Chris Richardson, Jun 02 '19 at 14:40
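
The steps in the comment above (detect whether the string repeats, then keep one copy of the unit) can also be sketched without a regex. This is only an illustration: `remove_repetitions` is the hypothetical name from the question, and the sketch assumes the whole string consists of one unit repeated an exact number of times and that byte-wise comparison is acceptable (no multibyte handling).

```php
<?php
// Hypothetical helper (name taken from the question): returns the shortest
// unit whose repetition makes up the whole string, or the input unchanged.
function remove_repetitions(string $s): string
{
    $len = strlen($s);
    // Try every candidate unit length, shortest first, up to half the string.
    for ($unit = 1; $unit <= intdiv($len, 2); $unit++) {
        // A unit can only tile the string if its length divides the total.
        if ($len % $unit !== 0) {
            continue;
        }
        $candidate = substr($s, 0, $unit);
        // If repeating the candidate rebuilds the input, it is the unit.
        if (str_repeat($candidate, intdiv($len, $unit)) === $s) {
            return $candidate;
        }
    }
    // No repetition found: leave the string as it is.
    return $s;
}

echo remove_repetitions("abcdabcdabcdabcdabcdabcdabcdabcd"), "\n"; // abcd
echo remove_repetitions("EqhabEqhabEqhabEqhabEqhab"), "\n";        // Eqhab
echo remove_repetitions("o=98guo=98guo=98gu"), "\n";               // o=98gu
```

A string such as `bcdaabcdjdfgabcd` from the first comment is not a pure repetition, so this sketch would return it unchanged.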

score 0 · Accepted Answer · answered Jun 02 '19 at 14:43

For details: https://algorithms.tutorialhorizon.com/remove-duplicates-from-the-string/

Different programming languages have different ways to remove duplicate characters from a string. For example, in PHP:

<?php
$str = "Hello World!";
echo count_chars($str,3);
?>

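Output: " !HWdelor" (each character used in the string once, sorted by byte value, so the space comes first)

Reference: https://www.w3schools.com/php/func_string_count_chars.asp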
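
To see how `count_chars` in mode 3 behaves on the strings from the question, here is a small check. It is only a sketch to illustrate the behaviour: because the result is sorted by byte value, it matches the wanted unit only when that unit already happens to be in sorted order.

```php
<?php
// count_chars($s, 3) returns every distinct byte of $s once, sorted by byte value.
echo count_chars("abcdabcdabcdabcd", 3), "\n";          // abcd (already sorted)
echo count_chars("EqhabEqhabEqhabEqhabEqhab", 3), "\n"; // Eabhq (order is lost)
echo count_chars("o=98guo=98guo=98gu", 3), "\n";        // 89=gou (order is lost)
```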

score 0 · Answer 2 · answered Jun 02 '19 at 17:32

Here, if we wish to remove the repeating substrings, I can't think of a way other than knowing what we wish to collect since the patterns seem complicated.

In that case, we could simply use a capturing group, put our desired outputs in it, then remove everything else:

(abcd|Eqhab|guo=98)

I'm guessing there should be a simpler way to do this, though.

Test

$re = '/.+?(abcd|Eqhab|guo=98)\1.+/m';
$str = 'abcdabcdabcdabcdabcdabcdabcdabcd
EqhabEqhabEqhabEqhabEqhab
o98guo=98guo=98guo=98guo=98guo=98guo=98guo98';
$subst = '$1';

$result = preg_replace($re, $subst, $str);

echo $result;
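
If the substrings are not known in advance, one possible variation is to let a backreference find the unit instead of listing the alternatives. This is a sketch rather than part of the answer above: `remove_repetitions_regex` is a hypothetical name, and it assumes the entire string is one unit repeated at least twice.

```php
<?php
// Hypothetical helper: collapses a string that consists only of repeats of one unit.
function remove_repetitions_regex(string $s): string
{
    // ^(.+?)  lazily captures the shortest possible unit from the start,
    // \1+$    requires one or more further copies of it up to the very end.
    // D makes $ match only at the end of the string; s lets . match newlines.
    $result = preg_replace('/^(.+?)\1+$/Ds', '$1', $s);

    // preg_replace returns null on a regex error; fall back to the input.
    return $result ?? $s;
}

echo remove_repetitions_regex("abcdabcdabcdabcdabcdabcdabcdabcd"), "\n"; // abcd
echo remove_repetitions_regex("EqhabEqhabEqhabEqhabEqhab"), "\n";        // Eqhab
echo remove_repetitions_regex("o=98guo=98guo=98gu"), "\n";               // o=98gu
```

Strings that are not a pure repetition do not match the pattern and come back unchanged.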

score 0 · Answer 3 · answered Jun 02 '19 at 17:35

You did not say what exactly to remove. A "sequence of characters" can be as small as just 1 character.

So this simple regex should work:

// PHP's preg_* functions replace every match by default, so no "g" modifier is needed.
preg_replace('/(.)(?=.*?\1)/', '', 'abcdabcdabcdabcdabcdabcd');
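
For illustration, here is that one-character pattern wrapped in a function and run on two inputs. `strip_repeated_chars` is a hypothetical name used only for this sketch. The pattern deletes every character that occurs again later, so it keeps the last occurrence of each character; that gives the wanted unit only when the unit itself contains no duplicate characters.

```php
<?php
// Hypothetical wrapper around the pattern above: removes every character
// that appears again later in the string, keeping only last occurrences.
function strip_repeated_chars(string $s): string
{
    $result = preg_replace('/(.)(?=.*?\1)/', '', $s);

    // preg_replace returns null on a regex error; fall back to the input.
    return $result ?? $s;
}

echo strip_repeated_chars("o=98guo=98guo=98gu"), "\n"; // o=98gu
echo strip_repeated_chars("aabbaabb"), "\n";           // ab, although the repeated unit is "aabb"
```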

Skeeve

