I would like to validate some JSF fields using validators. I am looking for a pattern that matches a string containing only these characters:

AÁÂÄBCÇDEÉÈÊËFGHIÎÏJKLMNOÔÖPQRSTUÛÜÙVWXYZaàâäbcçdeéèêëfghiîïjklmnoôöpqrstuûüùvwxyÿz

I am using this pattern [A-Za-zÀ-ÿ]*\s\'\- but it doesn't work:

My code:

public Boolean isValid(String str) throws PatternSyntaxException {

    Pattern pattern = Pattern.compile("[A-Za-zÀ-ÿ]*\s\'\-");
    Matcher matcher = pattern.matcher(str);
    Boolean result = matcher.matches();

    return result;
}
Abraham
Abderrahim

3 Answers

There is already a class for Latin script: \p{IsLatin}

Pattern p = Pattern.compile("[-' \\p{IsLatin}]*");
Matcher m = p.matcher("AÁÂÄBCÇDEÉÈÊËFGHIÎÏJKLMNOÔÖPQRSTUÛÜÙVWXYZaàâäbcçdeéèêëfghiîïjklmnoôöpqrstuûüùvwxyÿz '-");
if(m.matches()) {
    System.out.println("Success!");
}

I would not advise using the range À-ÿ unless you want to restrict yourself to the Latin-1 Supplement block (often loosely called "extended ASCII"). Another problem with that range is that it includes non-letter characters such as the multiplication sign (×) and the obelus (÷).
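As a quick sketch of the difference, the check below compares the two classes on a division sign (inside À-ÿ but not a letter) and on ł (a Latin letter outside À-ÿ). The class and method names are made up for illustration, and the characters are written as Unicode escapes only to avoid source-encoding issues.

```java
import java.util.regex.Pattern;

class LatinRangeCheck {
    // The range from the question, written with escapes for U+00C0 to U+00FF.
    static final Pattern RANGE = Pattern.compile("[A-Za-z\u00C0-\u00FF]*");
    // The Unicode script class for Latin letters.
    static final Pattern SCRIPT = Pattern.compile("\\p{IsLatin}*");

    static boolean rangeAccepts(String s) {
        return RANGE.matcher(s).matches();
    }

    static boolean scriptAccepts(String s) {
        return SCRIPT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String divisionSign = "\u00F7"; // obelus, script Common
        String lWithStroke = "\u0142";  // Latin small letter l with stroke

        System.out.println(rangeAccepts(divisionSign));  // true
        System.out.println(scriptAccepts(divisionSign)); // false
        System.out.println(rangeAccepts(lWithStroke));   // false
        System.out.println(scriptAccepts(lWithStroke));  // true
    }
}
```

So the range is too wide in one direction (symbols) and the script class is wider in another (all Latin letters, not only the ones listed in the question).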

(Aside from that, there was also the problem with your backslash syntax and failing to put everything inside the repeating character class. However, that's already been covered by other answers so I won't repeat what they already said.)
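For reference, those two fixes amount to roughly the following. This is only a sketch that keeps the À-ÿ range from the question; the class name `NameValidator` is hypothetical.

```java
import java.util.regex.Pattern;

class NameValidator {
    // Whitespace, apostrophe and hyphen now sit inside the character class,
    // so the trailing * applies to them as well. Regex backslashes are
    // doubled because the Java compiler handles string escapes first.
    private static final Pattern NAME =
            Pattern.compile("[A-Za-z\u00C0-\u00FF\\s'\\-]*");

    static boolean isValid(String str) {
        // matches() requires the whole input to fit the pattern.
        return NAME.matcher(str).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Jean-Pierre d'Arc"));
        System.out.println(isValid("abc1"));
    }
}
```

Because the quantifier is `*`, an empty string is also accepted; `+` would be the choice if the field must not be empty.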

Patrick Parker
  • I am afraid that \p{IsLatin} could accept other characters which are not in "AÁÂÄBCÇDEÉÈÊËFGHIÎÏJKLMNOÔÖPQRSTUÛÜÙVWXYZaàâäbcçdeéèêëfghiîïjklmnoôöpqrstuûüùvwxyÿz '-" – Abderrahim Aug 08 '18 at 08:59
  • @Abderrahim as I stated the range À-ÿ also includes other characters, so if you want precisely those characters then you should use sln's answer. – Patrick Parker Aug 08 '18 at 09:42

If the order in which the characters appear does not matter, then the code below might work for you:

public Boolean isValideName(String nom) throws PatternSyntaxException {
    Pattern pattern = Pattern.compile("[A-Za-zÀ-ÿ '-]*");
    Matcher matcher = pattern.matcher(nom);
    boolean result = matcher.matches();
    return result;
}

Or you can write a pattern with negation, which looks for a single character that is not in the allowed set. Something like the code below might also help:

public Boolean isValideName(String nom) throws PatternSyntaxException {
    // Match one character outside the allowed set.
    Pattern pattern = Pattern.compile("[^A-Za-zÀ-ÿ '-]");
    Matcher matcher = pattern.matcher(nom);
    // find() scans for a disallowed character anywhere in the input.
    boolean found = matcher.find();
    return !found;
}

[^A-Za-zÀ-ÿ '-] Here we search for any character that is not in the list (letters, space, apostrophe, hyphen); the name is valid when no such character is found.

Ashishkumar Singh

You could use either

[\u0020'\-A-Za-zÁ-ÂÄÇ-ËÎ-ÏÔÖÙÛ-Üàâäç-ëî-ïôöùû-üÿ]

or

[\u0020\u0027\u002D\u0041-\u005A\u0061-\u007A\u00C1-\u00C2\u00C4\u00C7-\u00CB\u00CE-\u00CF\u00D4\u00D6\u00D9\u00DB-\u00DC\u00E0\u00E2\u00E4\u00E7-\u00EB\u00EE-\u00EF\u00F4\u00F6\u00F9\u00FB-\u00FC\u00FF]

I used a tool to build the ranges (http://www.regexformat.com/scrn8/Uunique.jpg), then pressed the Hex conversion button to get the class.

I then converted the hex back to Unicode characters using the conversion tool (http://www.regexformat.com/scrn8/MegConv.jpg).
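Note that each class above matches a single character, so for validating a whole string it needs a quantifier. A minimal sketch, assuming the hex form of the class and a hypothetical class name `ExactCharsetValidator`:

```java
import java.util.regex.Pattern;

class ExactCharsetValidator {
    // The hex class from above plus a trailing *. The doubled backslashes
    // leave the uXXXX escapes for the regex engine to interpret.
    private static final Pattern ALLOWED = Pattern.compile(
            "[\\u0020\\u0027\\u002D\\u0041-\\u005A\\u0061-\\u007A"
            + "\\u00C1-\\u00C2\\u00C4\\u00C7-\\u00CB\\u00CE-\\u00CF"
            + "\\u00D4\\u00D6\\u00D9\\u00DB-\\u00DC"
            + "\\u00E0\\u00E2\\u00E4\\u00E7-\\u00EB\\u00EE-\\u00EF"
            + "\\u00F4\\u00F6\\u00F9\\u00FB-\\u00FC\\u00FF]*");

    static boolean isValid(String str) {
        // matches() succeeds only if every character is in the class.
        return ALLOWED.matcher(str).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Fran\u00E7ois d'Arc")); // c with cedilla is listed
        System.out.println(isValid("\u00C0"));              // A with grave is not listed
    }
}
```

Unlike À-ÿ or \p{IsLatin}, this accepts exactly the characters listed in the question, at the cost of a pattern that has to be edited whenever the list changes.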