Say I have the regex

const string regex = "[A-Za-z0-9]* [0-9]{1,3} [A-Za-z]* ?[A-Za-z]*";

const string address = "ABC 123 Sesame Street"; // this is a valid match

and so far I have typed "ABC 123 Se".

As a human, I can see that the next character needs to be a letter. Is there an algorithm that can do that for a computer?

I have looked at Levenshtein Distance algorithms, but in order for those to provide information I need two strings, and I only have a string and a regex. Spell Checking algorithms don't quite match my situation either.

I would prefer a generic solution, so that if for some reason I need to allow 123 N 4567 W Paris, Idaho all I have to do is modify the regex.

Edit

I should have said, "as a human, I can see that the regex won't allow the next character to be a number or special character, so I can exclude those options." Thanks for catching that!

bwall
  • As a human, can you see that the next character needs to be a letter after `ABC 123 Se` because you know what you are going to type is `s` then `a` then ...? If so, only a human would know what they are going to type (or mis-type) next. IE: `Se` could be the abbreviation `SE` for South East, but the user did not capitalize the letter `e` and maybe the next character could be a `.` or a space or... – Mark Stewart Sep 30 '19 at 17:40
  • Obligatory mention that the next character doesn't need to be a letter - it *could* be a space, or you could just stop there, and it would still match. – Nick Reed Sep 30 '19 at 17:48
  • @MarkStewart Sorry for the confusion. I suppose I should have said "As a human, I can see that the regex requires the next letter to be either a letter or a space" so I could exclude numbers/special characters from my list of options. – bwall Sep 30 '19 at 17:53
  • No problem; just wanted to clarify your expectations. – Mark Stewart Sep 30 '19 at 19:54

1 Answer


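According to this question, it is possible; you just have to be clever about the regexes you use. The idea is to write a second regex that matches every prefix of a valid string, then test the current input plus each candidate character against it. For example, if you are parsing IPs: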

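using System.Collections.Generic;
using System.Text.RegularExpressions;

List<string> validNextOptions = new List<string>();
string currentString = "255.3";   // what the user has typed so far
string newCharacter = "2";        // candidate for the next character
// Up to three "octet." groups followed by an optional octet, so incomplete IPs also match.
string partialIP = "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])[.]){0,3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])?$";
Regex partialIpRegex = new Regex(partialIP);

// Keep the candidate only if the extended string is still on track to become a valid IP.
if (partialIpRegex.IsMatch(currentString + newCharacter))
{
    validNextOptions.Add(newCharacter);
}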

This regex will return a match as long as you are progressing toward a valid IP. If you are unfamiliar with how regexes work, I recommend you paste the partialIP string into something like regex101.com and play with it a bit.
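To get the full list of allowed next characters rather than checking one candidate, the same test can be run over a set of candidates. The following is only a minimal sketch of that idea: the class name `NextCharacterFilter`, the method `ValidNextCharacters`, and the candidate set are hypothetical names I made up for illustration, and it assumes you have already written a "partial" regex like the one above for whatever format you are validating.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class NextCharacterFilter
{
    // Same partial-IP pattern as in the snippet above.
    public const string PartialIp =
        "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])[.]){0,3}" +
        "([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])?$";

    // Returns every candidate c for which current + c still matches the partial regex.
    public static List<char> ValidNextCharacters(
        Regex partialRegex, string current, IEnumerable<char> candidates)
    {
        return candidates
            .Where(c => partialRegex.IsMatch(current + c))
            .ToList();
    }
}
```

For example, calling `NextCharacterFilter.ValidNextCharacters(new Regex(NextCharacterFilter.PartialIp), "255.25", "0123456789.a")` should keep the digits 0 through 5 and the dot, because 256 through 259 are not valid octets and `a` never is. This costs one regex match per candidate on every keystroke, which is usually fine for a small alphabet.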

bwall
  • Depending on your implementation, it might be more efficient to use a `StringBuilder` instead of simply concatenating strings. – bwall Oct 08 '19 at 23:36