-1

I am wondering if there is a way to normalize a phone number to the North American standard (1-222-333-4444) using a regex pattern.

The input string will contain only digits, whitespace, "-", "(", and ")".

Thank you :)

Updated: All possible inputs are:

(123)-456-7890
123-456-7890
1-(123)-456-7890
1-123-456-7890
(123) 456-7890
123 456-7890
1-(123) 456-7890
1-123 456-7890
(123) 456 7890
123 456 7890
1 123 456 7890
1 (123) 456 7890

Code attempt:

public String convertPhone(String newPhone) {
    String regex = "^([\\(]{1}[0-9]{3}[\\)]{1}[ |\\-]{0,1}|^[0-9]{3}[\\-| ])?[0-9]{3}(\\-| ){1}[0-9]{4}$";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(newPhone);
    if (matcher.matches()) {
        newPhone = matcher.replaceFirst("1 \\($1\\) $2-$3");
        return newPhone;
    } else {
        return "-1";
    }
}
Hovercraft Full Of Eels
Kevin
  • All depends on the input you could get, meaning all ***possible*** input Strings that your code might receive, and the exact output you desire, neither of which are fully clear as yet. – Hovercraft Full Of Eels Nov 29 '19 at 23:28
  • Hi, I have updated the post. Thank you for pointing that out. – Kevin Nov 29 '19 at 23:34
  • Also note that I don't think that just regex will solve this, but a combination of regex and your own parsing code will. – Hovercraft Full Of Eels Nov 29 '19 at 23:35
  • Unfortunately, I do not know how to format text as code – Kevin Nov 29 '19 at 23:36
  • When editing, there should be a help link, I think in the upper right corner. Regardless, post your code, and we can help format it. – Hovercraft Full Of Eels Nov 29 '19 at 23:37
  • Hi, this regex pattern I got is from the Internet :). Basically, using this regex as a pattern and see if it matches, if it does, return the converted phone number, if not return "-1". – Kevin Nov 29 '19 at 23:44
  • Why do you call `replaceFirst()` with pattern `"1 \\($1\\) $2-$3"`, when your question says you want *"the North American standard (1-222-333-4444)"*, i.e. 3 dashes, no parentheses? – Andreas Nov 30 '19 at 00:39

3 Answers

2

Maybe, an expression similar to,

(?:1[ -])?[(]?(\d{3})[)]?[ -](\d{3})[ -](\d{4})$

might cover the samples presented in the question, yet there'd probably be edge cases, such as any unexpected double space.

RegEx Demo

Test

import java.util.regex.Matcher;
import java.util.regex.Pattern;


public class RegularExpression{

    public static void main(String[] args){

        // Backslashes must be doubled inside a Java string literal.
        final String regex = "(?m)(?:1[ -])?[(]?(\\d{3})[)]?[ -](\\d{3})[ -](\\d{4})$";
        final String string = "(123)-456-7890\n"
             + "123-456-7890\n"
             + "1-(123)-456-7890\n"
             + "1-123-456-7890\n"
             + "(123) 456-7890\n"
             + "123 456-7890\n"
             + "1-(123) 456-7890\n"
             + "1-123 456-7890\n"
             + "(123) 456 7890\n"
             + "123 456 7890\n"
             + "1 123 456 7890\n"
             + "1 (123) 456 7890";
        final String subst = "1-$1-$2-$3";

        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);

        final String result = matcher.replaceAll(subst);

        System.out.println(result);


    }
}

Output

1-123-456-7890
1-123-456-7890
1-123-456-7890
1-123-456-7890
1-123-456-7890
1-123-456-7890
1-123-456-7890
1-123-456-7890
1-123-456-7890
1-123-456-7890
1-123-456-7890
1-123-456-7890
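
For a single input at a time, as in the question's `convertPhone` method, the same expression can be used with `matches()`, which requires the whole string to match, so neither `(?m)` nor the trailing `$` is needed. The following is only a sketch under that assumption; the class name `PhoneNormalizer` and the method name `normalize` are made up for illustration, and the `"-1"` return value follows the question's code. Note that the expression does not check that parentheses are balanced, so `(123-456-7890` is accepted as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PhoneNormalizer {

    // Same expression as in the test above, minus (?m) and $;
    // matches() already anchors the pattern at both ends of the input.
    private static final Pattern PHONE = Pattern.compile(
            "(?:1[ -])?[(]?(\\d{3})[)]?[ -](\\d{3})[ -](\\d{4})");

    // Returns the number as 1-AAA-EEE-NNNN, or "-1" when the input does not match.
    static String normalize(String input) {
        Matcher m = PHONE.matcher(input);
        if (!m.matches()) {
            return "-1";
        }
        return "1-" + m.group(1) + "-" + m.group(2) + "-" + m.group(3);
    }

    public static void main(String[] args) {
        String[] inputs = {"(123)-456-7890", "1 (123) 456 7890", "123  456 7890"};
        for (String input : inputs) {
            System.out.println(input + " -> " + normalize(input));
        }
    }
}
```

The last input contains a double space, which is one of the edge cases mentioned earlier, so it is rejected with `"-1"`.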

If you wish to simplify, update, or explore the expression, it is explained in the top-right panel of regex101.com. You can also watch the matching steps or modify them in the regex101 debugger, which demonstrates how a regex engine might consume sample input strings step by step while performing the match.


RegEx Circuit

jex.im visualizes regular expressions.

Emma
  • In regex101, the `(?m)` is needed to test all the possible inputs. In Java code, each input would be tested separately, so `(?m)` shouldn't be there (and the Java code should be a loop over inputs to test). – Andreas Nov 30 '19 at 00:35
  • Hi, thank you all for your help. This truly helped me a lot – Kevin Nov 30 '19 at 18:18
1

Why not just remove the non-numeric characters and then reformat the raw number based on the String length?

public class StripAndFormat {

    public static void main(String[] args) {
        String[] phoneNumbers = {
            "(123)-456-7890", "123-456-7890", "1-(123)-456-7890",
            "1-123-456-7890", "(123) 456-7890", "123 456-7890",
            "1-(123) 456-7890", "1-123 456-7890", "(123) 456 7890",
            "123 456 7890", "1 123 456 7890", "1 (123) 456 7890"
        };
        for (String phone : phoneNumbers) {
            // Strip parentheses, dashes, and spaces, leaving only the digits.
            String ph = phone.replaceAll("[\\(\\)\\- ]", "");

            // Drop the leading country code when it is present.
            if (ph.length() == 11) {
                ph = ph.substring(1);
            }
            String ac = ph.substring(0, 3);    // area code
            String exc = ph.substring(3, 6);   // exchange
            String number = ph.substring(6);   // line number
            number = String.format("1 (%s) %s-%s", ac, exc, number);
            System.out.println(number);
        }
    }
}
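
If unexpected input is a concern, the same strip-and-reformat idea can be combined with a few checks before any `substring` call. This is only a sketch: the class name `DigitNormalizer`, the method name `normalize`, and the `"-1"` failure value (borrowed from the question's code) are assumptions, and the output uses the 1-222-333-4444 form the question asks for rather than the `1 (%s) %s-%s` format above.

```java
class DigitNormalizer {

    // Returns 1-AAA-EEE-NNNN, or "-1" for anything that is not a 10-digit
    // number optionally preceded by the country code 1.
    static String normalize(String phone) {
        // Reject null, empty input, and characters the question does not allow.
        if (phone == null || !phone.matches("[0-9() -]+")) {
            return "-1";
        }
        String digits = phone.replaceAll("[^0-9]", "");

        // Only strip a leading digit when it really is the country code.
        if (digits.length() == 11 && digits.charAt(0) == '1') {
            digits = digits.substring(1);
        }
        if (digits.length() != 10) {
            return "-1";
        }
        return String.format("1-%s-%s-%s",
                digits.substring(0, 3), digits.substring(3, 6), digits.substring(6));
    }

    public static void main(String[] args) {
        System.out.println(normalize("1 (123) 456 7890"));
        System.out.println(normalize("456-7890"));
    }
}
```

Because the separators are discarded before the length check, this variant still does not verify that dashes, spaces, and parentheses appear in sensible positions.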
WJS
  • Although this works, it fails the [**defensive programming**](https://en.wikipedia.org/wiki/Defensive_programming) practices *("ensure the continuing function of a piece of software under unforeseen circumstances")*, i.e. it will clobber any unexpected input. If an invalid value is given, it may fail with `IndexOutOfBoundsException`, or it may silently discard extraneous digits. If a non-US phone number is entered, it will also mess up the foreign format. – Andreas Nov 30 '19 at 01:00
0

I think this can be done by simply capturing the digit groups while ignoring the country code and everything between the groups. This regex handles all the examples, as well as inputs with extra spaces or other non-digit characters between the groups, or with no separators between the digits at all.

1?\D*(\d{3})\D*(\d{3})\D*(\d{4})\b

Then you can replace the numbers with this pattern.

1-$1-$2-$3
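
In Java this could look roughly like the sketch below. The class name `LenientNormalizer`, the method name `normalize`, and the `"-1"` fallback (taken from the question's code) are assumptions for illustration. Since `find()` does not require the whole input to match, any text outside the match is left in place by `replaceFirst`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LenientNormalizer {

    // The expression from this answer, with backslashes doubled for Java.
    private static final Pattern LENIENT = Pattern.compile(
            "1?\\D*(\\d{3})\\D*(\\d{3})\\D*(\\d{4})\\b");

    // Rewrites the first phone-like run as 1-AAA-EEE-NNNN,
    // or returns "-1" when no such run is found.
    static String normalize(String input) {
        Matcher m = LENIENT.matcher(input);
        if (!m.find()) {
            return "-1";
        }
        // replaceFirst resets the matcher and replaces that same first match.
        return m.replaceFirst("1-$1-$2-$3");
    }

    public static void main(String[] args) {
        String[] inputs = {"1 (123) 456 7890", "123  456   7890", "1234567890", "456-7890"};
        for (String input : inputs) {
            System.out.println(input + " -> " + normalize(input));
        }
    }
}
```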
Aaron de Windt