I am new to Hadoop, Giraph, and Java. As part of a task, I downloaded the Cloudera QuickStart VM and installed Giraph on top of it. I am following the book "Practical Graph Analytics with Apache Giraph" (authors: Roman Shaposhnik, Claudio Martella, Dionysios Logothetis) and tried to run its first example, on page 111 (Twitter Followership Graph).

Below is the error I get when I run `mvn clean install` after changing the Hadoop version in pom.xml to the one on the cluster, 2.6.0-mr1-cdh5.12.0:

`[cloudera@quickstart first]$ mvn clean install
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building book-examples 1.0.0
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @ book-examples ---
[INFO] Deleting /home/cloudera/workspace/first/target
[INFO] 
[INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ book-examples ---
[debug] execute contextualize
[WARNING] Using platform encoding (UTF-8 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] Copying 0 resource
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ book-examples ---
[WARNING] File encoding has not been set, using platform encoding UTF-8, i.e. build is platform dependent!
[INFO] Compiling 1 source file to /home/cloudera/workspace/first/target/classes
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR : 
[INFO] -------------------------------------------------------------
[ERROR] /home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[5,27] error: package org.apache.hadoop.io does not exist
[ERROR] /home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[6,27] error: package org.apache.hadoop.io does not exist
[ERROR] /home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[7,29] error: cannot find symbol
[ERROR]  package org.apache.hadoop.util
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[14,17] error: cannot find symbol
[ERROR]  class IntWritable
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[14,30] error: cannot find symbol
[ERROR]  class IntWritable
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[15,0] error: cannot find symbol
[ERROR]  class NullWritable
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[15,14] error: cannot find symbol
[ERROR]  class NullWritable
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[17,28] error: cannot find symbol
[ERROR]  class GiraphHelloWorld
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[18,3] error: cannot find symbol
[ERROR]  class GiraphHelloWorld
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[18,16] error: cannot find symbol
[ERROR]  class GiraphHelloWorld
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[19,12] error: cannot find symbol
[ERROR]  class GiraphHelloWorld
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[24,12] error: cannot find symbol
[ERROR]  class GiraphHelloWorld
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[24,25] error: cannot find symbol
[ERROR]  class GiraphHelloWorld
/home/cloudera/workspace/first/src/main/java/GiraphHelloWorld.java:[34,14] error: cannot find symbol
[INFO] 14 errors 
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6.495s
[INFO] Finished at: Fri Dec 08 14:57:01 PST 2017
[INFO] Final Memory: 18M/57M

`

I added the Cloudera repository as suggested in another Stack Overflow answer. Here is the updated pom.xml that produces the error above:

`<?xml version="1.0" encoding="UTF-8"?>
<project>
<modelVersion>4.0.0</modelVersion>

<groupId>giraph</groupId>
<artifactId>book-examples</artifactId>
<version>1.0.0</version>

<dependencies>
<dependency>
<groupId>org.apache.giraph</groupId>
<artifactId>giraph-core</artifactId>
<version>1.1.0</version>
</dependency>

<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>2.6.0-mr1-cdh5.12.0</version>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.4</version>
                <executions>
                    <execution>
                        <id>create-jar-bundle</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                        <configuration>
                            <descriptorRefs>
                                <descriptorRef>jar-with-dependencies</descriptorRef>
                            </descriptorRefs>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
</plugins>
</build>

<repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </repository>
    </repositories>

</project>`

The book uses Hadoop version 1.2.1, while the cluster runs 2.6.0, so the dependencies from the book do not match the version I need.

It would be great if anyone could help me understand how to handle this error.

Thanks in advance.

  • Please don't upload code or error trace as image. See [Why not upload images of code on SO when asking a question?](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-on-so-when-asking-a-question) – hoefling Dec 08 '17 at 20:35

1 Answer

The pom.xml in your copy of the book is outdated. Use this one instead. Source: the book examples repository on GitHub.

Edit:

You want to use a recent version of hadoop-core, but the most recent one that the Maven Central Repository (the default repository) offers is 1.2.1. You will need to use the Cloudera repository to get the version of the library that matches your cluster. To do that, add the repository to your pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project>
    <modelVersion>4.0.0</modelVersion>
    <groupId>giraph</groupId>
    <artifactId>book-examples</artifactId>
    <version>1.0.0</version>

    <dependencies>
        <dependency>
            <groupId>org.apache.giraph</groupId>
            <artifactId>giraph-core</artifactId>
            <version>1.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>2.6.0-mr1-cdh5.12.0</version>
        </dependency>
    </dependencies>

    <build>
    </build>

    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </repository>
    </repositories>
</project>

You should now see that Maven tries to find the JARs in the Cloudera repository first, falling back to Central:

$ mvn clean install
[INFO] Scanning for projects...
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] Building book-examples 1.0.0
[INFO] ------------------------------------------------------------------------
...
Downloading: https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/hadoop/hadoop-core/2.6.0-mr1-cdh5.12.0/hadoop-core-2.6.0-mr1-cdh5.12.0.pom
Downloaded: https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/hadoop/hadoop-core/2.6.0-mr1-cdh5.12.0/hadoop-core-2.6.0-mr1-cdh5.12.0.pom (6.4 kB at 2.9 kB/s)
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.661 s
[INFO] Finished at: 2017-12-08T22:56:04+01:00
[INFO] Final Memory: 15M/224M
[INFO] ------------------------------------------------------------------------

Edit 2:

OK, I finally got it. Since version 2, Hadoop has changed its packaging, so instead of declaring a dependency on hadoop-core, you should use hadoop-client, an aggregator artifact that pulls in all the necessary dependencies for you. Remove the hadoop-core dependency from your pom.xml and add this instead:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.9.0</version>
</dependency>
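
If you would rather compile against the same Hadoop build that runs on the QuickStart VM than against the upstream 2.9.0 release, the two fixes above can probably be combined. The fragment below is only a sketch and has not been tested here: it assumes that the Cloudera repository publishes hadoop-client under the same version string used for hadoop-core earlier (`2.6.0-mr1-cdh5.12.0`). If Maven cannot resolve it, look up the exact version string in the repository.

```xml
<!-- Sketch only; these elements go inside <project>. -->
<!-- Assumption: hadoop-client is published under this CDH version string. -->
<dependencies>
    <dependency>
        <groupId>org.apache.giraph</groupId>
        <artifactId>giraph-core</artifactId>
        <version>1.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <!-- Should match the Hadoop version reported by the cluster. -->
        <version>2.6.0-mr1-cdh5.12.0</version>
    </dependency>
</dependencies>

<!-- CDH builds are not on Maven Central, so the Cloudera repository stays. -->
<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
    </repository>
</repositories>
```

To check which Hadoop artifacts actually end up on the compile classpath, you can run `mvn dependency:tree -Dincludes=org.apache.hadoop`.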
hoefling
  • Thank you but I used the same pom.xml code. The issue is with the hadoop version as the hadoop version on cluster is 2.6.0 but we are using 1.2.1 in the xml. So I changed the version to 2.6.0 and tried running. That's when I got that pasted error. –  Dec 08 '17 at 21:26
  • @tri7 Oh, I see - the issue is that the central repository (the default one Maven uses) only contains `hadoop-core-1.2.1`. Let me update the answer to fix that. – hoefling Dec 08 '17 at 21:43
  • Thanks a lot @hoefling. But it still did not work. Please find the updated pom file that I am using. –  Dec 08 '17 at 22:57
  • @tri7 Then you will have to update your question, adding the error you get now, because I tested it from scratch and it works, at least with the `Maven 3.5.2` I use. – hoefling Dec 08 '17 at 23:02
  • Thank you @hoefling . I updated the question with current error and pom xml. Please take a look. –  Dec 08 '17 at 23:09
  • Great. That worked! But there is an underlying issue that is still not resolved when running the GiraphHelloWorld program! –  Dec 09 '17 at 00:47
  • @tri7 if this does resolve your question, you may consider accepting the answer then... – hoefling Dec 09 '17 at 00:48
  • Yes, I did accept it :) Thanks a lot! I updated my question with another error. Please check if you can. @hoefling –  Dec 09 '17 at 00:52
  • @tri7 I would suggest to start a new question for that error as it is not a Maven error anymore and I am no expert on hadoop :-( – hoefling Dec 09 '17 at 00:55
  • Makes sense! Thank you for being patient as I am super new to SO too. I am starting a new question. –  Dec 09 '17 at 00:57
  • @tri7 no problem and glad I could help you for the start! ;-) – hoefling Dec 09 '17 at 00:57