
I need a regex that checks that a string contains only letters (a-z) and that the first letter is uppercase. A word can't contain two uppercase letters, like THomas or THomAS, but Thomas Anderson (and Thomas anderson too) would be valid.

For example:

The Magician Of The Elfs would be valid, but ThE MaGiCiAN oF ThE ELFS would not.

if (!preg_match("??", $name)) {
   echo "Invalid name!";
}

Hope you understand!

Tomasz

Invalid:

MaGIciaN Of The ELFz
THomas anderson

Valid:

Magician of the elfs
Magician Of the Elfs
Magician of The elfs
Thomas Anderson
Thomas anderson

Basically, I don't want it to be possible to have more than one capitalized letter in a word (per word, not per sentence).

Bart Kiers
Tomasz
  • Are you really only looking for the range `a-z`? Note that the character class `[a-z]` does *not* have the `é` in it, for example. – Bart Kiers Jan 17 '10 at 12:49
  • Instead of rejecting the invalid names, how about fixing them? You can do `$string = ucwords(strtolower($string));` to make the invalid input fit the style you want. – JAL Jan 17 '10 at 12:57
  • The space character in the valid example is not in the a-z range. – mpez0 Jan 17 '10 at 13:08
  • Out of interest, what are you trying to achieve? I hope you're aware that not all names are spelled with a single leading capital for each word, e.g. Ronald McDonald. You can make a guess, but that's all it will be. – John Carter Jan 17 '10 at 13:32
  • Just a side note: if this is an anti-"pokemon" solution, it will ban Roman numerals (XXVII) and acronyms (UNO) as well. Maybe it's worth allowing ALL CAPITALS, too. – naivists Jan 17 '10 at 13:40
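
JAL's suggestion in the comments, normalizing the input instead of rejecting it, can be sketched roughly as follows. The variable names are only illustrative, and the sample string is the one from the question.

```php
<?php
// Sample input taken from the question.
$input = 'ThE MaGiCiAN oF ThE ELFS';

// Lowercase everything, then uppercase the first letter of each word.
$fixed = ucwords(strtolower($input));

echo $fixed, "\n"; // The Magician Of The Elfs
```

Both functions work on single bytes, so this only suits ASCII names. For UTF-8 input, `mb_convert_case($input, MB_CASE_TITLE, 'UTF-8')` from the mbstring extension is probably the closer fit.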

5 Answers

'/^[A-Z][a-z]+( [A-Z][a-z]+)*$/'

Untested, though.

EDIT: Oh, perhaps I misread your question. The above assumes a minimum word length of two. Is "John A" or "A Horse" valid? In that case: '/^[A-Z][a-z]*( [A-Z][a-z]*)*$/'.


According to updated requirements:

'/^[A-Z][a-z]*( [A-Za-z][a-z]*)*$/'

This matches one uppercase letter followed by any number of lowercase letters. After that comes any number of repetitions of the sequence: a single space, one letter (upper- or lowercase), and any number of lowercase letters. Each space must therefore be followed by at least one letter.
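
As a rough check, the pattern above can be run against the examples from the question. The helper name `isValidName` is only an illustration and is not part of the original answer.

```php
<?php
// Hypothetical helper name; the pattern is the one given in the answer above.
function isValidName(string $name): bool
{
    // First word: one uppercase letter, then lowercase letters.
    // Later words: a space, one letter of either case, then lowercase letters.
    return preg_match('/^[A-Z][a-z]*( [A-Za-z][a-z]*)*$/', $name) === 1;
}

// Examples taken from the question.
$samples = [
    'Thomas Anderson',
    'Thomas anderson',
    'Magician of The elfs',
    'THomas anderson',
    'MaGIciaN Of The ELFz',
];

foreach ($samples as $name) {
    echo $name, ': ', isValidName($name) ? 'valid' : 'invalid', "\n";
}
```

The first three samples should print as valid and the last two as invalid. Note that `$` also matches just before a trailing newline; adding the `D` modifier makes the end anchor strict if that matters for your input.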

jensgram

You can also describe the characters by their Unicode character properties:

/^\p{Lu}\p{Ll}*(?:\s+\p{Lu}\p{Ll}*)*$/

Edit: Since you changed your requirements, try this regular expression:

/^[\p{Lu}\p{Ll}]\p{Ll}*(?:\s+[\p{Lu}\p{Ll}]\p{Ll}*)*$/

Now the first character of each word can be an uppercase letter or a lowercase letter.
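
A minimal sketch of using the second pattern from PHP. It assumes the input is UTF-8, which is why the `u` modifier has been added here; the modifier is not in the pattern as posted. The helper name `isValidUnicodeName` is hypothetical.

```php
<?php
// Hypothetical helper name. The pattern is the second one from the answer,
// with the "u" modifier added so pattern and subject are treated as UTF-8.
function isValidUnicodeName(string $name): bool
{
    $pattern = '/^[\p{Lu}\p{Ll}]\p{Ll}*(?:\s+[\p{Lu}\p{Ll}]\p{Ll}*)*$/u';

    // preg_match returns false on invalid UTF-8, so compare against 1.
    return preg_match($pattern, $name) === 1;
}

// "\u{C9}" is the uppercase letter É (requires PHP 7 or later).
var_dump(isValidUnicodeName("\u{C9}mile Zola")); // "Émile Zola": bool(true)
var_dump(isValidUnicodeName("\u{C9}Mile Zola")); // "ÉMile Zola": bool(false)
```

Without the `u` modifier PCRE reads the subject byte by byte, so multi-byte letters such as `É` would most likely not be recognized as `\p{Lu}`.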

NullUserException
Gumbo
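$str = "ThE MaGiCiAN oF ThE ELFS";
// Check each space-separated word on its own.
$words = explode(" ", $str);
foreach ($words as $word) {
    // Anchor both ends so a capital later in the word (e.g. "ThE") is rejected.
    if (!preg_match('/^[A-Z][a-z]+$/', $word)) {
        print "$str does not match.\n";
        break;
    }
}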
ghostdog74

Wow. \b, guys.

if matches /\B[A-Z]/ then invalid

or, to be Unicode aware,

if matches /\B\p{Lu}/ then invalid

You may want to ensure the whole string matches /^[\p{Lu}\p{Ll}\s]+$/ first, to avoid accepting strings like The (Magic) Elf as valid.
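
The two checks could be combined roughly like this. This is an ASCII-only sketch: the helper name `hasNoInnerCapitals` is hypothetical, and the whole-string check uses a `+` quantifier so that it covers strings of any length.

```php
<?php
// Hypothetical helper name; ASCII-only variant of the two checks described above.
function hasNoInnerCapitals(string $name): bool
{
    // Step 1: only letters and whitespace are allowed at all.
    if (preg_match('/^[A-Za-z\s]+$/', $name) !== 1) {
        return false;
    }

    // Step 2: an uppercase letter that directly follows another word
    // character (so it is not at the start of a word) makes the name invalid.
    return preg_match('/\B[A-Z]/', $name) !== 1;
}

var_dump(hasNoInnerCapitals('Thomas Anderson'));  // bool(true)
var_dump(hasNoInnerCapitals('THomas anderson'));  // bool(false)
var_dump(hasNoInnerCapitals('The (Magic) Elf'));  // bool(false)
```

Unlike the anchored patterns in the other answers, this does not require the string to start with a capital, so `thomas` would pass as well.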

kennytm
  • So you're proposing to do it in two steps? In that case, I prefer Gumbo and jensgram's solutions (just one fairly intuitive step). If you don't perform the second validation, not just `The (Magic) Elf` would pass, but also strings like `????????` or `11111111122222222` etc. – Bart Kiers Jan 17 '10 at 13:54
  • That depends on whether the input is already filtered, because the asker's example doesn't include these cases. (And updated to further simplify the test.) – kennytm Jan 17 '10 at 14:00
  • In fact, `????????` and `1111111122222222` satisfy the basic requirement "(not) to have more than 1 capitalized letter in a word". There are no words or capitalized letters in the string, so it passes. – kennytm Jan 17 '10 at 14:04
  • Nothing in the "valid examples" suggests these are valid, but sure: they could be. I wouldn't put money on it though! :) – Bart Kiers Jan 17 '10 at 14:10
  • Haha, that's why a precise specification is costly :p – kennytm Jan 17 '10 at 14:13
// An uppercase letter right after another word character means invalid.
if (preg_match('/\w[A-Z]/', $name)) {
    echo "invalid name!";
}
david