9

I have installed Pig 0.12 on my machine. Listing a directory from the grunt shell works:

darwin$ pig
grunt> ls /data/
hdfs://Nmame:10001/data/pg20417.txt<r 3>    674570
hdfs://Nname:10001/data/pg4300.txt<r 3> 1573150
hdfs:/Nname:10001/data/pg5000.txt<r 3>  1423803
hdfs://Nname:10001/data/weather <dir>

However, when I try to create a query, I get the following error:

grunt> book = load '/data/pg4300.txt' as (lines:chararray);
2014-06-30 17:40:08,939 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered " <PATH> "book=load "" at line 2, column 1.
Was expecting one of:
    <EOF> 
    "cat" ...
    "clear" ...
    "fs" ...
    "sh" ...
    "cd" ...
    "cp" ...
    "copyFromLocal" ...
    "copyToLocal" ...
    "dump" ...
    "\\d" ...
    "describe" ...
    "\\de" ...
    "aliases" ...
    "explain" ...
    "\\e" ...
    "help" ...
    "history" ...
    "kill" ...
    "ls" ...
    "mv" ...
    "mkdir" ...
    "pwd" ...
    "quit" ...
    "\\q" ...
    "register" ...
    "rm" ...
    "rmf" ...
    "set" ...
    "illustrate" ...
    "\\i" ...
    "run" ...
    "exec" ...
    "scriptDone" ...
    "" ...
    "" ...
    <EOL> ...
    ";" ...

Details at logfile: /Users/Documents/pig_1404175088198.log

I tried changing load to LOAD and as to AS, but nothing worked.

brain storm
  • Remove the / before data. Make it: book = load 'data/pg4300.txt' as (lines:chararray); – gonephishing Jul 01 '14 at 04:44
  • When you say you installed it, do you mean that you took a pre-built release, or did you build it yourself from source? – merours Jul 01 '14 at 08:12
  • What's the current status of your problem? Did it work? Or what else did you try? – gonephishing Jul 01 '14 at 15:42
  • @gonephishing: The problem does not seem to be with using `/data`, because that is where my data folder is; it is not under `/user/data`, so it cannot be loaded relative to the default directory. The real problem is a versioning difference between Hadoop 2.2 and Pig. I need to run `ant clean jar-all -Dhadoopversion=23` to fix it. – brain storm Jul 01 '14 at 18:09
  • Facing same problem. Any solution? – krackoder Jul 25 '14 at 22:39
  • @nishant: Nope, I did not fix it, if you happen to, please post here – brain storm Jul 25 '14 at 22:41

6 Answers

12

I ran into the same issue and was looking for a solution. It turns out this happens if you leave out the spaces around the = sign: book=load gives the error, while book = load works. I am not sure whether this is expected behavior.
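
As a minimal sketch of the difference, assuming the file path from the question: the error message quoted there shows the grunt parser reading book=load as a single <PATH> token, which is consistent with the missing spaces being the trigger. The failing form is left as a comment so the snippet can be pasted as-is.

```pig
-- Reported to fail with ERROR 1000 (no spaces around "="):
-- book=load '/data/pg4300.txt' as (lines:chararray);

-- Reported to work (spaces on both sides of "="):
book = load '/data/pg4300.txt' as (lines:chararray);

-- Print the loaded relation to confirm the statement was accepted.
dump book;
```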

Dhanesh
0

Try the following; one of these should work:

1) Remove the /data/ prefix. There is no need to use an absolute path if your data is in the HDFS default directory. I am assuming that /data is the default directory where you store all your data:

book = load 'pg4300.txt' as (lines:chararray);

2) Try using PigStorage to specify the delimiter. I am using a comma as the delimiter here; replace it with the one your data uses:

book = load 'pg4300.txt' using PigStorage(',') as (lines:chararray);

Hope this helps.

Patrick M
Rajnish G
0

Try using PigStorage to specify how your data should be read into book:

http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#PigStorage

lalala
0

Throws the above error: stocks_by_symbol=GROUP stocks by stock_symbol;

Works: stocks_by_symbol = GROUP stocks by stock_symbol;

Notice the space before and after "=".

Suneel
0

This error also occurred when I tried to dump an alias with an incorrect name at the grunt prompt. For example, I typed dump 45 instead of dump r45, which threw the above error.

After providing the correct alias name, it works fine.

Also, make sure that you are executing the query from the location where the file to load exists.
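
A short grunt session illustrating the alias point, as a sketch only: the alias r45 comes from this answer, while the file path is borrowed from the question and the field name line is an assumption. describe is used here just to confirm the alias is defined before dumping it.

```pig
-- Define the alias (path taken from the question; adjust to your data).
r45 = load '/data/pg4300.txt' as (line:chararray);

-- Show the schema of r45 to confirm the alias exists.
describe r45;

-- The name after dump must match the alias exactly (r45, not 45).
dump r45;
```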

loki
Boopathi
0

I faced the same issue and resolved it by adding a space between = and the LOAD keyword.

Stephen Rauch