I'm trying to run a Pig query over RDP on an HDInsight cluster.

The query is

LOGS = LOAD 'wasb://containerName@storageAccountName.blob.core.windows.net/' as unparsedString:chararray;

where containerName and storageAccountName stand for the container and storage account that hold my data.

It throws the following error: ERROR 1200: java.net.URISyntaxException: Relative path in absolute URI: wasb://containerName@storageAccountName.blob.core.windows.net.pig_schema

Failed to parse: java.net.URISyntaxException: Relative path in absolute URI: wasb://containerName@storageAccountName.blob.core.windows.net.pig_schema

Update: I saved the file in the HDInsight default container, in a folder 'pigdata', and then the following worked:

LOGS = LOAD 'wasb:///pigdata' as unparsedString:chararray;

But I would like to get this working without saving to the default container, if that is possible. Any help is sincerely appreciated.

Thanks

Arnab
  • There are two solutions for this issue. One is to change the container access level to "public container"; however, everyone can then read the data in the container. The other is to add the storage account that holds your data as a linked resource of your HDInsight cluster. When you provision a cluster, you have the option to add additional storage accounts. – Jonathan Gao Jul 17 '15 at 14:00
  • The container is already a 'public container'. Thanks for the info on additional storage accounts. Will try that. – Arnab Jul 17 '15 at 15:59
  • @JonathanGao In my case the data is in a different container but in the same storage account, so that was not the problem either. – Arnab Jul 19 '15 at 13:02

1 Answer

You need to have your log data in a "folder", like pigdata, and not in the root of the container. Try moving your data into a folder directly under the container root and changing the LOAD path to point at that folder.

Example: LOGS = LOAD 'wasb://containerName@storageAccountName.blob.core.windows.net/pigdata/' as unparsedString:chararray;
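
To see why the container root trips things up, here is a minimal Java sketch that reproduces the same kind of exception with plain java.net.URI. It assumes (this is not confirmed in the thread) that the schema lookup ends up building a URI from the wasb scheme, the container authority, and a path of just ".pig_schema" when there is no folder in the LOAD path. The class name WasbPathCheck and the helper tryBuild are hypothetical, made up for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class WasbPathCheck {
    // Hypothetical helper: tries to build wasb://<authority><path> with the
    // multi-argument java.net.URI constructor. Returns null on success,
    // or the exception message on failure.
    static String tryBuild(String authority, String path) {
        try {
            new URI("wasb", authority, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String authority = "containerName@storageAccountName.blob.core.windows.net";

        // Assumed failing case: no folder, so the path does not start with '/'.
        System.out.println("root:   " + tryBuild(authority, ".pig_schema"));

        // With a folder such as pigdata the path is absolute and is accepted.
        System.out.println("folder: " + tryBuild(authority, "/pigdata/.pig_schema"));
    }
}
```

The first call should fail with a "Relative path in absolute URI" message like the one in the question, while the second returns null. That is consistent with the data needing to live in a folder rather than the container root, though the exact code path inside Pig and Hadoop is an assumption here.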

Andrew Moll