4

I currently have the following regex pattern, which detects any string containing the same character 10 or more times in a row (one captured character followed by at least 9 repeats):

/^.*(\S)\1{9,}.*$/

This works perfectly with a string like `this a tesssssssssst`. However, I also want it to detect a string like `this a tess sss ssssst` (the same number of repeated characters, but with optional whitespace between them).

Any ideas?

Matt Cowley

2 Answers

4

You need to put the backreference into a group and add an optional space into the group:

^.*(\S)(?: ?\1){9,}.*$

See the regex demo. If there can be more than 1 space in between, replace ? with *.
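As a rough illustration of the difference, the following JavaScript sketch runs the question's pattern and the pattern above against two made-up strings that each contain ten `s` characters. The variable names and inputs are assumptions for the demo, not part of the original post.

```javascript
// Pattern from the question: one non-space character followed by
// 9 or more immediate repeats of it.
const strict = /^.*(\S)\1{9,}.*$/;
// Pattern from this answer: each repeat may be preceded by one optional space.
const spaced = /^.*(\S)(?: ?\1){9,}.*$/;

// Hypothetical inputs, each with exactly ten "s" characters.
const run = "this a te" + "s".repeat(10) + "t";
const split = "this a te" + "ss sss sssss" + "t";

console.log(strict.test(run));   // true
console.log(strict.test(split)); // false
console.log(spaced.test(run));   // true
console.log(spaced.test(split)); // true
```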

The .*$ part is only needed if the match has to span the whole line. For methods that allow partial matches, you may use ^.*(\S)(?: ?\1){9,}.

If any whitespace character should be allowed, replace the literal space with \s in the pattern.
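The variants mentioned above can be compared in a short JavaScript sketch. It assumes a combined form, `\s*`, that allows any amount of any whitespace between repeats, and drops the trailing `.*$` because `test()` and `match()` accept partial matches. The input string is made up for the demo.

```javascript
// Single optional space between repeats, no trailing .*$.
const oneSpace = /^.*(\S)(?: ?\1){9,}/;
// Assumed variant: \s* allows any amount of any whitespace between repeats.
const anyWhitespace = /^.*(\S)(?:\s*\1){9,}/;

// Hypothetical input: four "s", then two spaces and a tab, then six "s".
const gappy = "x" + "s".repeat(4) + "  \t" + "s".repeat(6) + "y";

console.log(oneSpace.test(gappy));          // false
console.log(anyWhitespace.test(gappy));     // true
console.log(gappy.match(anyWhitespace)[1]); // s
```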

Wiktor Stribiżew
  • A further question, if you could: how would I detect any character combination repeated 9 or more times with spaces in between? – Matt Cowley Jun 20 '17 at 17:28
  • Please provide some examples of real data and the expected output. – Wiktor Stribiżew Jun 20 '17 at 17:45
  • Indeed, as the answer said @revo – Matt Cowley Jun 20 '17 at 17:48
  • @MattCowley: If you want to make a backreference ignore whitespace, that is not possible. A backreference only refers to the value captured by the respective capturing group, and the only thing you can change is whether it matches case insensitively. There is no "whitespace insensitive" modifier. You could match that with some hardcoded regex, like [this one](https://regex101.com/r/CfQzsd/1), but that is not what you want here. – Wiktor Stribiżew Jun 20 '17 at 18:49
  • I can't find a way to figure out why @Matt's [regex](http://regexr.com/3g6ua) matches in JS. I don't think it should. – revo Jun 20 '17 at 18:59
  • Well, according to https://regex101.com/r/l9UtaN/1, there must be 2 matches, and that is as expected. But `this a tes t es tes tes tes tes t es tes tes tes t` cannot be matched with a pattern of this type. – Wiktor Stribiżew Jun 20 '17 at 20:16
0

You can check more than a single character this way.
It's only limited by the number of capture groups available.

This one checks for repeated sequences of 1 to 3 characters:

(\S)[ ]*(\S)?[ ]*(\S)?(?:[ ]*(?:\1[ ]*\2[ ]*\3)){9,}

http://regexr.com/3g709

 # 1-3 Characters
 ( \S )                        # (1)
 [ ]* 
 ( \S )?                       # (2)
 [ ]* 
 ( \S )?                       # (3)
 # Add more here

 (?:
      [ ]* 
      (?: \1 [ ]* \2 [ ]* \3 )
      # Add more here
 ){9,}
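A minimal JavaScript sketch of how this pattern behaves, using made-up inputs that each contain ten copies of a repeated unit. Note that it relies on JavaScript treating a backreference to a group that did not participate (`\2` or `\3` when the optional group was skipped) as matching the empty string. Engines where such a backreference fails instead, such as Python's `re`, will not give the same results for the shorter units.

```javascript
// Pattern from the answer above, unchanged.
const repeatedUnit = /(\S)[ ]*(\S)?[ ]*(\S)?(?:[ ]*(?:\1[ ]*\2[ ]*\3)){9,}/;

// Hypothetical inputs: ten copies of a unit, with spaces between and inside units.
const tenSingles = "this a te" + "ss sss sssss" + "t";      // unit "s"
const tenPairs = "ab ab a b " + "ab ".repeat(7).trim();     // unit "ab"
const tenTriples = "tes t es " + "tes ".repeat(8).trim();   // unit "tes"
// Only nine copies of "ab": one short of the required 1 + 9.
const ninePairs = "ab ".repeat(9).trim();

console.log(repeatedUnit.test(tenSingles)); // true
console.log(repeatedUnit.test(tenPairs));   // true
console.log(repeatedUnit.test(tenTriples)); // true
console.log(repeatedUnit.test(ninePairs));  // false
```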