Can anyone help me, please? I am working on validating LinkedIn URLs in Java. I wrote a regular expression (which I am not familiar with) to validate the link, but I am struggling with it.

The code is as follows:

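import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestRegEx {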

    public static void main(final String[] args) {

        // List of valid URLs
        List<String> validValues = new ArrayList<>();
        validValues.add("https://www.linkedin.com/sometext");
        validValues
                .add("https://uk.linkedin.com/in/wiliam-ferraciolli-a9a29795");
        validValues.add("https://it.linkedin.com/hp/");
        validValues.add("https://cy.linkedin.com/hp/");
        validValues
                .add("https://www.linkedin.com/profile/view?id=AAIAABQnNlYBIx8EtS5T1RTUbxHQt5Ww&trk=nav_responsive_tab_profile");


        // List of invalid URLs
        List<String> invalidValues = new ArrayList<>();
        invalidValues.add("http://www.linkedin.com/sometext");
        invalidValues.add("http://stackoverflow.com/questions/ask");
        invalidValues.add("google.com");
        invalidValues.add("http://uk.linkedin.com/in/someDodgeAddress");
        invalidValues.add("http://dodge.linkedin.com/in/someDodgeAddress");

        // Pattern
        String regex = "(https://)(.+)(www.)(.+)$";
        Pattern pattern = Pattern.compile(regex);

        for (String s : validValues) {
            Matcher matcher = pattern.matcher(s);
            // Print whether the whole URL matches, not the Matcher object itself
            System.out.println(s + " // " + matcher.matches());
        }

    }
}

Can anyone help me create a regex that validates the following?

Prefix: "https://"
Optional prefix: "uk." (it can be nothing or another country)
Middle: "linkedin.com/"
Suffix: any characters, with a maximum of 200 chars

Regards

Wil Ferraciolli

2 Answers


I would go with the following, written as a Java string literal (hence the doubled backslashes):

^https:\\/\\/[a-z]{2,3}\\.linkedin\\.com\\/.*$

LiveDemo


^ assert position at start of a line
https: matches the characters https: literally (case sensitive by default)
\/ matches the character / literally
\/ matches the character / literally
[a-z]{2,3} match a single character present in the list below

    Quantifier: {2,3} Between 2 and 3 times, as many times as possible, giving back as needed [greedy]
    a-z a single character in the range between a and z (case sensitive by default)

\. matches the character . literally
linkedin matches the characters linkedin literally (case sensitive by default)
\. matches the character . literally
com matches the characters com literally (case sensitive by default)
\/ matches the character / literally
.* matches any character (except newline)

    Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]

$ assert position at end of a line
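
As a rough illustration of the breakdown above, here is a minimal sketch that compiles the pattern in Java and runs it against a few of the URLs from the question. The class name `LinkedInUrlCheck` and the method `isLinkedInUrl` are hypothetical names chosen for this example; they are not part of the original post.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption made for this sketch.
final class LinkedInUrlCheck {

    // The pattern from the answer, as a Java string literal (backslashes doubled).
    private static final Pattern LINKEDIN_URL =
            Pattern.compile("^https:\\/\\/[a-z]{2,3}\\.linkedin\\.com\\/.*$");

    static boolean isLinkedInUrl(String url) {
        // matches() succeeds only if the entire input matches the pattern
        return LINKEDIN_URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "https://www.linkedin.com/sometext",
            "https://uk.linkedin.com/in/wiliam-ferraciolli-a9a29795",
            "http://www.linkedin.com/sometext",
            "http://dodge.linkedin.com/in/someDodgeAddress",
            "google.com"
        };
        for (String s : samples) {
            System.out.println(s + " // " + isLinkedInUrl(s));
        }
    }
}
```

Note that this pattern requires a two- or three-letter subdomain, so a URL such as https://linkedin.com/... with no prefix at all would be rejected, and the length of the suffix is not limited.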
Thomas Ayoub
  • Thank you. It works fine on regex101, but when I add it to Eclipse as a String it gives me the error "Invalid escape sequence". Is there something I am missing? – Wil Ferraciolli Nov 17 '15 at 15:27
  • @WilFerraciolli fixed :) see the [diff](http://stackoverflow.com/posts/33760587/revisions). The `\` has to be escaped (doubled) inside a Java string literal. – Thomas Ayoub Nov 17 '15 at 15:29
  • I doubt you have to escape the `/` symbol. As a rule of thumb, place all the "special" characters into `[]`. Try `^https://[a-z]{2,3}[.]linkedin[.]com/.*$` – Wiktor Stribiżew Nov 17 '15 at 15:34
  • Nice one. Thank you for explaining. – Wil Ferraciolli Nov 17 '15 at 15:36
  • @Thomas: Please refer to [character classes reference](http://www.regular-expressions.info/charclass.html), especially *Metacharacters Inside Character Classes* section. – Wiktor Stribiżew Nov 17 '15 at 15:41
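
The character-class style suggested in the comments avoids backslashes altogether, which sidesteps the "Invalid escape sequence" problem in a Java string. The sketch below uses that style and also, as an assumption taken from the question's requirements rather than from the answer, makes the subdomain optional and caps the suffix at 200 characters. The class name `StrictLinkedInUrl` and the method `isValid` are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
final class StrictLinkedInUrl {

    // Assumptions based on the question's stated requirements:
    //  - the subdomain ("uk.", "www.") may be absent
    //  - at most 200 characters may follow "linkedin.com/"
    // "[.]" matches a literal dot without needing a backslash.
    private static final Pattern URL = Pattern.compile(
            "^https://([a-z]{2,3}[.])?linkedin[.]com/.{0,200}$");

    static boolean isValid(String url) {
        return URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("https://linkedin.com/in/someone"));
        System.out.println(isValid("https://uk.linkedin.com/in/someone"));
        System.out.println(isValid("http://uk.linkedin.com/in/someone"));
    }
}
```

Whether the 200-character cap should count only the part after "linkedin.com/" or the whole URL is not stated in the question; this sketch assumes the former.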

Updated regex. Profile links now live under /in:

http(s)?:\/\/([\w]+\.)?linkedin\.com\/in\/[A-Za-z0-9_-]+\/?

  • Code only answers can almost always be improved by adding some explanation. In this case please edit the answer to include why there was an update needed, what was updated, and how the regex accomplishes that. – Jason Aller Mar 11 '20 at 23:52
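
To show how the profile-only regex from this answer might be used from Java, here is a minimal sketch. The class name `LinkedInProfileUrl` and the method `isProfileUrl` are hypothetical. The slug part is written with explicit `A-Za-z` ranges, because a range such as `A-z` would also admit punctuation like `[`, `\`, `]`, `^` and the backtick.

```java
import java.util.regex.Pattern;

// Hypothetical class name, chosen for this example only.
final class LinkedInProfileUrl {

    // Scheme (http or https), optional subdomain, then /in/<slug> with an
    // optional trailing slash. The slug allows letters, digits, "_" and "-".
    private static final Pattern PROFILE = Pattern.compile(
            "https?://(\\w+\\.)?linkedin\\.com/in/[A-Za-z0-9_-]+/?");

    static boolean isProfileUrl(String url) {
        // matches() checks the whole string, so no ^ or $ anchors are needed
        return PROFILE.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isProfileUrl("https://uk.linkedin.com/in/wiliam-ferraciolli-a9a29795"));
        System.out.println(isProfileUrl("https://www.linkedin.com/sometext"));
    }
}
```

Like the answer's regex, this sketch accepts plain http as well; removing the `?` after the `s` would restrict it to https, as the question asks.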