I have a map-reduce job in which the reducer receives the absolute address of a file residing in Azure Blob storage; the reducer should open the file and read its content. I added the storage account containing the files when provisioning my Hadoop cluster (HDInsight), so the reducer should have access to this Blob storage, but it is not the default HDFS storage for my job. I have the following code in my reducer, but it gives me a FileNotFound error message.

FileSystem fs = FileSystem.get(new Configuration());
Path pt = new Path("wasb://mycontainer@accountname..."); 
FSDataInputStream stream = fs.open(pt);
HHH
  • Maybe you should use [NativeAzureFileSystem](https://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/azure/NativeAzureFileSystem.html)? I can't find examples in the Hadoop documentation, but judging by the [tests in the source code](https://www.codatlas.com/github.com/apache/hadoop/HEAD/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/AzureBlobStorageTestAccount.java?line=559), it is probably something like: `NativeAzureFileSystem fs = new NativeAzureFileSystem(); fs.initialize(accountUri, conf);` – Leonid Vasilev Jun 19 '15 at 19:55

1 Answer

It is covered in https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-use-blob-storage/#addressing

The syntax is wasb://mycontainer@myaccount.blob.core.windows.net/example/jars/hadoop-mapreduce-examples.jar

If "mycontainer" is a private container, you must add the "myaccount" Azure storage account as an additional storage account when provisioning the cluster.
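To make the address syntax concrete, here is a minimal sketch that splits such an address into its parts using only `java.net.URI`. The class and method names (`WasbAddress`, `container`, `accountHost`, `blobPath`) are made up for illustration and are not part of any Hadoop or Azure API.

```java
import java.net.URI;

// Hypothetical helper, for illustration only: decomposes a wasb:// address.
class WasbAddress {
    // Container name: the part of the authority before '@'.
    static String container(String address) {
        return URI.create(address).getUserInfo();
    }

    // Storage account host: the part of the authority after '@'.
    static String accountHost(String address) {
        return URI.create(address).getHost();
    }

    // Path of the blob inside the container.
    static String blobPath(String address) {
        return URI.create(address).getPath();
    }

    public static void main(String[] args) {
        String address = "wasb://mycontainer@myaccount.blob.core.windows.net"
                + "/example/jars/hadoop-mapreduce-examples.jar";
        System.out.println("container: " + container(address));
        System.out.println("account:   " + accountHost(address));
        System.out.println("path:      " + blobPath(address));
    }
}
```

The container and the storage account both live in the authority part of the URI, which is why the full address is needed to reach a storage account other than the cluster's default one.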

Jonathan Gao
  • Thanks, but my main question is: which API should I use? Any sample code? – HHH Jun 17 '15 at 05:48
  • If you have an HDInsight cluster, then you should already have sample code. Look in the blob storage that the HDInsight cluster references, under /example/jars/hadoop-mapreduce-examples.jar. Then replace the "wasb:" address with the full URL of the external blob storage mentioned by Jonathan Gao. – Phuc H Duong Jun 18 '15 at 16:25
  • I edited my original post and mentioned the problem I had. Btw, I have the hadoop-mapreduce-examples.jar, but it doesn't contain the source code! – HHH Jun 18 '15 at 16:54
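
Since sample code was requested: one possible cause of the error, not confirmed in this thread, is that `FileSystem.get(new Configuration())` returns the cluster's default file system rather than the one named in the `wasb://` address. A minimal sketch, assuming the Hadoop client libraries are on the classpath and the extra storage account is configured on the cluster, resolves the file system from the path itself with `Path.getFileSystem`. The class and method names (`BlobReader`, `readFirstLine`) are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper, for illustration only.
class BlobReader {
    // Reads the first line of a file given by a full URI,
    // e.g. a wasb://container@account... address.
    static String readFirstLine(String address, Configuration conf) throws IOException {
        Path pt = new Path(address);
        // Resolve the file system from the path's own scheme and authority
        // instead of using the default file system.
        FileSystem fs = pt.getFileSystem(conf);
        // The FileSystem instance is cached and shared by Hadoop, so only the streams are closed.
        try (FSDataInputStream stream = fs.open(pt);
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }
}
```

Inside a reducer, the job's own configuration from `context.getConfiguration()` would be passed as `conf` rather than a fresh `new Configuration()`, so that the cluster's storage settings are picked up.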