37

What is the common way in Java to validate and convert a string of the form host:port into an instance of InetSocketAddress?

It would be nice if the following criteria were met:

  • No address lookups;

  • Working for IPv4, IPv6, and "string" hostnames;
    (For IPv4 it's ip:port, for IPv6 it's [ip]:port, right? Is there some RFC which defines all these schemes?)

  • Preferably without parsing the string by hand.
    (I'm thinking about all those special cases where someone thinks they know all valid forms of socket addresses but forgets about "that special case", which leads to unexpected results.)

java.is.for.desktop

8 Answers

59

I'll propose one possible workaround myself.

Convert the string into a URI (this validates it automatically) and then query the URI's host and port components.

Sadly, a URI with a host component MUST have a scheme. This is why this solution is "not perfect".

String string = ... // some string which has to be validated

try {
  // WORKAROUND: add any scheme to make the resulting URI valid.
  URI uri = new URI("my://" + string); // may throw URISyntaxException
  String host = uri.getHost();
  int port = uri.getPort();

  if (host == null || port == -1) {
    throw new URISyntaxException(uri.toString(),
      "URI must have host and port parts");
  }

  // here, additional checks can be performed, such as
  // presence of path, query, fragment, ...


  // validation succeeded
  return new InetSocketAddress (host, port);

} catch (URISyntaxException ex) {
  // validation failed
}

This solution needs no custom string parsing, works with IPv4 (1.1.1.1:123), IPv6 ([::0]:123) and host names (my.host.com:123).

Incidentally, this solution is well suited to my scenario. I was going to use URI schemes anyway.
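The workaround above can be packaged as a small helper. This is only a sketch under a few assumptions: the class name HostPortUri and the method parse are made up for illustration, the extra checks for user info, path, query and fragment are one way to address the "additional checks" mentioned in the code comment, and InetSocketAddress.createUnresolved is used (instead of the constructor) so that no address lookup happens.

```java
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; the name is an assumption, not part of any library.
final class HostPortUri {
    private HostPortUri() {}

    /** Parses "host:port" via java.net.URI without performing a lookup. */
    static InetSocketAddress parse(String hostPort) {
        final URI uri;
        try {
            // "my" is an arbitrary placeholder scheme that makes the URI parseable.
            uri = new URI("my://" + hostPort);
        } catch (URISyntaxException ex) {
            throw new IllegalArgumentException("Not a valid host:port: " + hostPort, ex);
        }
        if (uri.getHost() == null || uri.getPort() == -1) {
            throw new IllegalArgumentException("Host and port are both required: " + hostPort);
        }
        // Reject components that the URI parser accepts but host:port should not have.
        String path = uri.getRawPath();
        if (uri.getUserInfo() != null || (path != null && !path.isEmpty())
                || uri.getQuery() != null || uri.getFragment() != null) {
            throw new IllegalArgumentException("Unexpected URI components: " + hostPort);
        }
        // createUnresolved never resolves the host; it throws
        // IllegalArgumentException if the port is outside 0..65535.
        return InetSocketAddress.createUnresolved(uri.getHost(), uri.getPort());
    }
}
```

For an IPv6 literal, URI.getHost() keeps the square brackets, so the host string of the returned address keeps them too.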

Aleksandr Dubinsky
java.is.for.desktop
  • 3
    Note that it also works with quite a few other things, e.g.: "my://foo:bar:baz/" or "my://something@foo.bar:8000/" and so on... which may or may not be a problem in your case but doesn't really satisfy the original question's desire to avoid things that "lead to unexpected results". :) – PSpeed Feb 27 '10 at 16:18
  • This is not working properly for IPv6 address w/o port. `String host = "fc00::142:10"; System.out.println(new URI("my://" + host).getHost());` This code will print null. – Dikla Oct 03 '17 at 07:49
8

A regex will do this quite neatly:

Pattern p = Pattern.compile("^\\s*(.*?):(\\d+)\\s*$");
Matcher m = p.matcher("127.0.0.1:8080");
if (m.matches()) {
  String host = m.group(1);
  int port = Integer.parseInt(m.group(2));
}

You can vary this in many ways, such as making the port optional or doing some validation on the host.
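One such variation, as a rough sketch: keep the regex simple and check the port range in code rather than in the pattern. The class name HostPortRegex is made up for illustration, and the host part is still not validated at all.

```java
import java.net.InetSocketAddress;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the name is an assumption.
final class HostPortRegex {
    // matches() already anchors at both ends, so ^ and $ are omitted.
    // The port is limited to 5 digits so Integer.parseInt cannot overflow.
    private static final Pattern HOST_PORT =
            Pattern.compile("\\s*(.+?):(\\d{1,5})\\s*");

    private HostPortRegex() {}

    static InetSocketAddress parse(String input) {
        Matcher m = HOST_PORT.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected host:port, got: " + input);
        }
        int port = Integer.parseInt(m.group(2));
        if (port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        // The host is NOT validated here; createUnresolved only avoids the lookup.
        return InetSocketAddress.createUnresolved(m.group(1), port);
    }
}
```

Leading zeros in the port (for example "080") are still accepted by this sketch.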

cletus
  • Note, the '^' and '$' are unnecessary in this case as matches() must match the entire string anyway. – PSpeed Feb 27 '10 at 09:51
  • That pattern is way too permissive, even once the leading and trailing white space is dumped. The port part, for instance, must be in the range 0 through 65535 inclusive, and zero can lead only in the 0 case. The pattern for just the port part is this abomination: `"((0)|([1-9]\\d{0,2})|([1-5]\\d{3})|([6-9]\\d{3})|([1-5]\\d{4})|(6[0-4]\\d{3})|(65[0-4]\\d{2})|(655[0-2]\\d)|(6553[0-5]))"` – Urhixidur Jun 13 '17 at 18:21
7

It doesn't answer the question exactly, but this answer could still be useful to others like me who just want to parse a host and port, but not necessarily a full InetAddress. Guava has a HostAndPort class with a fromString method.

Edward Dale
  • 1
    Be careful with Guava's `HostAndPort`. It will not do a strict validation on hostname. Please check its class doc or source code. – stanleyxu2005 Jan 05 '15 at 02:04
  • I noticed also that HostAndPort is very useful, but I found myself calling InetAddresses.forString(hostIp) solely to validate the host. Makes no sense that HostAndPort validates the Port is ok but not the Host! – titania424 Oct 07 '16 at 18:14
6

Another person has given a regex answer, which is what I was going to do when originally asking the question about hosts. I will still post mine because it's an example of a regex that is slightly more advanced and can help determine what kind of address you are dealing with.

String ipPattern = "(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}):(\\d+)";
String ipV6Pattern = "\\[([a-zA-Z0-9:]+)\\]:(\\d+)";
String hostPattern = "([\\w\\.\\-]+):(\\d+)";  // note will allow _ in host name
Pattern p = Pattern.compile( ipPattern + "|" + ipV6Pattern + "|" + hostPattern );
Matcher m = p.matcher( someString );
if( m.matches() ) {
    if( m.group(1) != null ) {
        // group(1) IP address, group(2) is port
    } else if( m.group(3) != null ) {
        // group(3) is IPv6 address, group(4) is port            
    } else if( m.group(5) != null ) {
        // group(5) is hostname, group(6) is port
    } else {
        // Not a valid address        
    }
}

Modifying this so that the port is optional is pretty straightforward. Wrap the ":(\d+)" as "(?::(\d+))?" and then check for null for group(2), etc.

Edit: I'll note that there's no "common way" that I'm aware of, but the above is how I'd do it if I had to.

Also note: the IPv4 case can be removed if the host and IPv4 cases will actually be handled the same. I split them out because sometimes you can avoid an ultimate host look-up if you know you have the IP address.
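To illustrate the optional-port variant, here is a minimal sketch that reuses the three patterns from above with the port wrapped in "(?::(\\d+))?". The class name AddressKind, the method describe and its output format are assumptions made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names and output format are assumptions.
final class AddressKind {
    // Optional port: non-capturing group around the colon, capturing group for the digits.
    private static final String PORT = "(?::(\\d+))?";
    private static final Pattern ADDRESS = Pattern.compile(
            "(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})" + PORT   // groups 1, 2
            + "|\\[([a-zA-Z0-9:]+)\\]" + PORT                      // groups 3, 4
            + "|([\\w\\.\\-]+)" + PORT);                           // groups 5, 6

    private AddressKind() {}

    /** Returns "kind host port", with "-" when no port was given, or "invalid". */
    static String describe(String s) {
        Matcher m = ADDRESS.matcher(s);
        if (!m.matches()) {
            return "invalid";
        }
        if (m.group(1) != null) {
            return format("IPv4", m.group(1), m.group(2));
        } else if (m.group(3) != null) {
            return format("IPv6", m.group(3), m.group(4));
        }
        return format("hostname", m.group(5), m.group(6));
    }

    private static String format(String kind, String host, String port) {
        return kind + " " + host + " " + (port == null ? "-" : port);
    }
}
```

As with the original patterns, nothing here checks octet or port ranges, so "999.1.1.1:99999" is still classified as IPv4.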

Roy Sharon
PSpeed
2
new InetSocketAddress(
  addressString.substring(0, addressString.lastIndexOf(":")),
  Integer.parseInt(addressString.substring(addressString.lastIndexOf(":") + 1)));

I probably made some silly little mistake, and I'm assuming you just want a new InetSocketAddress object out of a String in exactly that format: host:port

Has QUIT--Anony-Mousse
AFK
  • 5
    this would fail for IPv6, because it is something like `[2001:db8:85a3::8a2e:370:7334]:80` – java.is.for.desktop Feb 26 '10 at 21:58
  • Perhaps my question is wrong. Is an IP address also a host? I don't know. – java.is.for.desktop Feb 26 '10 at 21:59
  • you're right this would fail for IPv6. Also the host's address is an IP Address. I guess you would just have to put in an if statement before this one and create an InetSocket based on whether or not the IP address is v6 or v4 – AFK Feb 26 '10 at 22:26
  • 2
    ...or use lastIndexOf() instead of indexOf()... Though I don't know what InetSocketAddress is expecting for IPv6. – PSpeed Feb 26 '10 at 23:23
  • Also note: the port string should be lastIndexOf(':') + 1 and no second parameter is required. – PSpeed Feb 26 '10 at 23:26
  • And it fails on an IPv6 address if a port number hasn't been supplied, e.g. "[2001:db8:85a3::8a2e:370:7334]". So not a good idea if you're going to unleash this code on user-supplied input. Which, when you think about it, is pretty much guaranteed to be the case if the port number isn't hard-coded. – Robin Davies Jul 22 '18 at 20:35
0

All kinds of peculiar hackery, and elegant but unsafe solutions, have been provided elsewhere. Sometimes the inelegant brute-force solution is the way.

public static InetSocketAddress parseInetSocketAddress(String addressAndPort) throws IllegalArgumentException {
    int portPosition = addressAndPort.length();
    int portNumber = 0;
    while (portPosition > 1 && Character.isDigit(addressAndPort.charAt(portPosition-1)))
    {
        --portPosition;
    }
    String address;
    if (portPosition > 1 && addressAndPort.charAt(portPosition-1) == ':')
    {
        try {
            portNumber = Integer.parseInt(addressAndPort.substring(portPosition));
        } catch (NumberFormatException ignored)
        {
            throw new IllegalArgumentException("Invalid port number.");
        }
        address = addressAndPort.substring(0,portPosition-1);
    } else {
        portNumber = 0;
        address = addressAndPort;
    }
    return new InetSocketAddress(address,portNumber);
}
Robin Davies
0

The open-source IPAddress Java library has a HostName class which will do the required parsing. Disclaimer: I am the project manager of the IPAddress library.

It will parse IPv4, IPv6 and string host names with or without ports. It will handle all the various formats of hosts and addresses. By the way, there is no single RFC for this; a number of RFCs apply in different ways.

String hostName = "[a:b:c:d:e:f:a:b]:8080";
String hostName2 = "1.2.3.4:8080";
String hostName3 = "a.com:8080";
try {
    HostName host = new HostName(hostName);
    host.validate();
    InetSocketAddress address = host.asInetSocketAddress();
    HostName host2 = new HostName(hostName2);
    host2.validate();
    InetSocketAddress address2 = host2.asInetSocketAddress();
    HostName host3 = new HostName(hostName3);
    host3.validate();
    InetSocketAddress address3 = host3.asInetSocketAddress();
    // use socket address      
} catch (HostNameException e) {
    String msg = e.getMessage();
    // handle improperly formatted host name or address string
}
Sean F
0

URI can accomplish this:

URI uri = new URI(null, "example.com:80", null, null, null);

Unfortunately, there's a bug in current OpenJDK (or in the documentation) where the authority isn't properly validated. The documentation states:

The resulting URI string is then parsed as if by invoking the URI(String) constructor and then invoking the parseServerAuthority() method upon the result

Unfortunately, that call to parseServerAuthority just doesn't happen, so the solution that actually validates the authority is:

URI uri = new URI(null, "example.com:80", null, null, null).parseServerAuthority();

then

InetSocketAddress address = new InetSocketAddress(uri.getHost(), uri.getPort());
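Put together, the steps above might look like the following sketch. The class name AuthorityParser is made up for illustration. The explicit checks for a missing port and for user info are additions not in the snippets above, and InetSocketAddress.createUnresolved is swapped in for the constructor so that no lookup is performed.

```java
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; the name is an assumption.
final class AuthorityParser {
    private AuthorityParser() {}

    static InetSocketAddress parse(String hostPort) {
        final URI uri;
        try {
            // Arguments are scheme, authority, path, query, fragment.
            // parseServerAuthority() is called explicitly to force strict validation.
            uri = new URI(null, hostPort, null, null, null).parseServerAuthority();
        } catch (URISyntaxException ex) {
            throw new IllegalArgumentException("Not a valid host:port: " + hostPort, ex);
        }
        if (uri.getHost() == null || uri.getPort() == -1 || uri.getUserInfo() != null) {
            throw new IllegalArgumentException("Expected exactly host:port, got: " + hostPort);
        }
        // Swapped in for the constructor used above: avoids the address lookup.
        return InetSocketAddress.createUnresolved(uri.getHost(), uri.getPort());
    }
}
```

The URISyntaxException is wrapped in an unchecked exception only to keep call sites short; rethrowing it as-is would work just as well.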