1

Whenever i run any apache pig code from the terminal everythig goes well and i get the result. So i conclude that my installation for Pig 0.15.0 and Hadoop 2.7.0 is alright. The problem is when i run the pigServer from inside java code:

 PigServer pigServer = new PigServer(ExecType.MAPREDUCE, conf);
 pigServer.setBatchOn();
 pigServer.debugOff();
 pigServer.setJobName(JobId);
 pigServer.registerScript(scriptUrl, params);
 pigServer.executeBatch();

My maven dependencies are:

        <dependency>
            <groupId>org.apache.pig</groupId>
            <artifactId>pig</artifactId>
            <version>0.15.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.7.0</version>
        </dependency>

I get the following error.

WARN  org.apache.pig.backend.hadoop20.PigJobControl - falling back to default JobControl (not using hadoop 0.20 ?)
java.lang.NoSuchFieldException: runnerState
    at java.lang.Class.getDeclaredField(Class.java:1948)
    at org.apache.pig.backend.hadoop20.PigJobControl.<clinit>(PigJobControl.java:51)
    at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.newJobControl(HadoopShims.java:100)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:313)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:199)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:277)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1367)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1352)
    at org.apache.pig.PigServer.execute(PigServer.java:1341)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:392)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:375)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:170)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:232)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
    at org.apache.pig.Main.run(Main.java:479)

I used to run the above code on Hadoop 1 and it was working but now it is not.

Abdulrahman
  • 433
  • 4
  • 11

1 Answers1

0

By default pig uses Hadoop 0.20 version to so while running pig assumes that you are using Hadoop 0.20 so you are getting that error

You can run Pig with different versions of Hadoop by setting HADOOP_HOME to point to the directory where you have installed Hadoop. If you do not set HADOOP_HOME, by default Pig will run with the embedded version, currently Hadoop 0.20.2.--Written in Apache pig site https://pig.apache.org/docs/r0.9.2/start.html

Set HADOOP_HOME in eclipse

Run Configurations-->ClassPath-->User Entries-->Advanced-->Add ClassPath Variables-->New-->Name(HADOOP_HOME)-->Path(You Hadoop directory path)

Maven dependencies required

<dependencies>
    <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.7.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.1</version>
</dependency>

<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.4</version>
</dependency>
<dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.16</version>
</dependency>
 <dependency>
    <groupId>org.apache.pig</groupId>
    <artifactId>pig</artifactId>
    <version>0.15.0</version>
</dependency>

<dependency>
    <groupId>org.antlr</groupId>
    <artifactId>antlr-runtime</artifactId>
    <version>3.4</version>
</dependency>
</dependencies>