I have a large Java application that I'm trying to run on a Fargate cluster in AWS. The image runs successfully under Docker on my local machine. When I run it in Fargate it starts successfully, but it eventually hits the following error, after which the application gets stuck:
! java.net.UnknownHostException: 690bd678bcf4: 690bd678bcf4: Name or service not known
! at java.net.InetAddress.getLocalHost(InetAddress.java:1505) ~[na:1.8.0_151]
! at tracelink.misc.SingletonTokenDBO$.<init>(SingletonTokenDBO.scala:34) ~[habari.jar:8.4-QUARTZ-SNAPSHOT]
! at tracelink.misc.SingletonTokenDBO$.<clinit>(SingletonTokenDBO.scala) ~[habari.jar:8.4-QUARTZ-SNAPSHOT]
!... 10 common frames omitted
Caused by: ! java.net.UnknownHostException: 690bd678bcf4: Name or service not known
! at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) ~[na:1.8.0_151]
! at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) ~[na:1.8.0_151]
! at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) ~[na:1.8.0_151]
! at java.net.InetAddress.getLocalHost(InetAddress.java:1500) ~[na:1.8.0_151]
!... 12 common frames omitted
The offending line of Scala code is:
private val machineName = InetAddress.getLocalHost().getHostName()
Some initial research suggests the error is related to the contents of the /etc/hosts file in the container. So I created a small test program that exhibits the same behavior as my real application, and also dumps the contents of /etc/hosts to stdout:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

public class NetworkTest {
    public static void main(String[] args) throws InterruptedException, IOException {
        // Dump the network state every 10 seconds until the container is stopped.
        while (true) {
            networkDump();
            Thread.sleep(10000);
        }
    }

    // Print the contents of /etc/hosts, then the resolved hostname.
    private static void networkDump() throws IOException {
        System.out.println("/etc/hosts:");
        System.out.println("");
        // try-with-resources closes the reader on every iteration.
        try (BufferedReader reader = new BufferedReader(new FileReader("/etc/hosts"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        System.out.println("");
        dumpHostname();
    }

    // Print the local hostname, or the exception message if the lookup fails.
    private static void dumpHostname() {
        try {
            String hostname = InetAddress.getLocalHost().getHostName();
            System.out.printf("Hostname: %s\n\n", hostname);
        } catch (UnknownHostException e) {
            System.out.println(e.getMessage());
        }
    }
}
Dockerfile:
FROM openjdk:8
WORKDIR /site
ADD . /site
CMD ["java", "NetworkTest"]
The output I get from this in AWS looks like:
/etc/hosts:
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
3a5a4271a6e3: 3a5a4271a6e3: Name or service not known
Compare that with the output when running in Docker on my local machine:
> docker run networktest
/etc/hosts:
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.17.0.4 82691e2fb948
Hostname: 82691e2fb948
The local run, which does not throw the exception, has an /etc/hosts entry mapping the container's hostname to its IP address; the hosts file in AWS has no such entry. I've tried adding an /etc/rc.local file that appends the hostname to the end of the localhost line, and I've tried a RUN command in the Dockerfile that does the same thing. Neither had any effect.
Does anyone know if there's a way to configure either the image or the ECS task definition to get the hostname properly configured in AWS?
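For what it's worth, the only code-level workaround I can think of is to guard the lookup, roughly as in the sketch below. This is a minimal sketch, not what the real application does: the class name `HostnameFallback` and the method `resolveHostname` are hypothetical, and it assumes the container runtime exports a `HOSTNAME` environment variable, which I have not verified on Fargate. I'd still prefer to fix the container configuration so that `InetAddress.getLocalHost()` just works.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostnameFallback {

    // Hypothetical helper: returns a machine name without letting
    // UnknownHostException escape to the caller.
    static String resolveHostname() {
        try {
            // Normal path: works when the hostname is resolvable,
            // e.g. via an entry in /etc/hosts.
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            // Assumption: the container runtime sets HOSTNAME.
            String fromEnv = System.getenv("HOSTNAME");
            if (fromEnv != null && !fromEnv.isEmpty()) {
                return fromEnv;
            }
            // Last resort: a fixed placeholder so startup can continue.
            return "unknown-host";
        }
    }

    public static void main(String[] args) {
        System.out.println("Hostname: " + resolveHostname());
    }
}
```

This would stop the static initializer from failing, but it only hides the missing /etc/hosts entry; anything else that resolves the local hostname would still break.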