I am running a Hadoop cluster on Google Cloud Platform, using Google Cloud Storage as the backend for persistent data. I can ssh to the master node from a remote machine and run hadoop fs commands without any problem. However, when I try to execute the following code I get a timeout error.
Code
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

FileSystem hdfs = FileSystem.get(new URI("hdfs://mymasternodeip:8020"), new Configuration());

// Print the home directory
Path homeDir = hdfs.getHomeDirectory();
System.out.println("Home folder: " + homeDir);

// Build the target path under the current working directory
Path workingDir = hdfs.getWorkingDirectory();
Path newFolderPath = new Path("/DemoFolder");
newFolderPath = Path.mergePaths(workingDir, newFolderPath);
if (hdfs.exists(newFolderPath)) {
    hdfs.delete(newFolderPath, true); // Delete the existing directory first
}

// Create the new directory
hdfs.mkdirs(newFolderPath);
When the hdfs.exists() call executes, I get the following timeout error.
Error
org.apache.hadoop.net.ConnectTimeoutException: Call From gl051-win7/192.xxx.1.xxx to 111.222.333.444.bc.googleusercontent.com:8020 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=111.222.333.444.bc.googleusercontent.com/111.222.333.444:8020]
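The 20000 millis in the stack trace matches Hadoop's default client connect timeout, so one thing I considered is raising that timeout when building the Configuration. Below is a minimal sketch of that idea; I am assuming ipc.client.connect.timeout is the relevant key, and I doubt a longer timeout is the real fix if the port is simply unreachable:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Sketch: raise the IPC connect timeout from the 20 s default to 60 s.
// Assumption: ipc.client.connect.timeout is the setting behind the
// "20000 millis timeout" seen in the stack trace.
Configuration conf = new Configuration();
conf.setInt("ipc.client.connect.timeout", 60000);
FileSystem hdfs = FileSystem.get(new URI("hdfs://mymasternodeip:8020"), conf);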
Are you aware of any limitations when using the Java Hadoop APIs against Hadoop on Google Cloud Platform?
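For reference, a plain socket connect to the same host and port should reproduce the problem if this is a firewall issue rather than a Hadoop API limitation (a quick sketch, independent of Hadoop):

import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch: check raw TCP reachability of the NameNode RPC port.
// If this also times out, the cause is network/firewall rules
// (e.g. GCP firewall rules not exposing 8020), not the Hadoop client.
try (Socket socket = new Socket()) {
    socket.connect(new InetSocketAddress("mymasternodeip", 8020), 20000);
    System.out.println("Port 8020 is reachable");
}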
Thanks!