
I am using Cascalog in Eclipse. It looks like the Hadoop dependency is declared in the project's project.clj file, like this:

:profiles { :dev {:dependencies [[org.apache.hadoop/hadoop-core "1.1.2"]]}}

If I need to depend on a locally installed single-node Hadoop cluster, or on some external Hadoop cluster, how should I do it? If it is local, should I simply put the path to Hadoop in place of "org.apache.hadoop"? Your ideas would be appreciated.

Best Regards, Sindhu

Sindhu

2 Answers


Sindhu, project.clj is not the place to specify your cluster's location.

project.clj is to Clojure what pom.xml is to Java/Maven: it declares library dependencies as Maven coordinates, not filesystem paths. Check out the tutorial on Leiningen's dependency management here. Make sure the Hadoop version you declare matches the version you'll be running against.

The cluster you end up running on is controlled by the Hadoop configuration files, specifically by setting the address of your JobTracker with "mapred.job.tracker" in mapred-site.xml. You can read about them here.
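To make the point about coordinates concrete, here is a minimal project.clj sketch. The project name and the Clojure and Cascalog versions are assumptions chosen for illustration; only the hadoop-core entry comes from the question. It is a sketch, not a recommended setup.

```clojure
;; project.clj -- illustrative sketch; the project name and the Clojure and
;; Cascalog versions below are assumptions, not taken from the question.
(defproject cascalog-demo "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [cascalog "1.10.2"]]
  ;; Hadoop is referenced by its Maven coordinate (group/artifact + version),
  ;; never by the path of a local installation.
  ;; Keep the version in step with the cluster you will run against.
  :profiles {:dev {:dependencies [[org.apache.hadoop/hadoop-core "1.1.2"]]}})
```

Nothing in this file says where the cluster lives; that stays in the Hadoop configuration on the machine that submits the job.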

Walt Elder

The Cascalog guides on running on a cluster link to this post about developing and deploying a Cascalog query on a Hadoop cluster: http://nathanmarz.com/blog/news-feed-in-38-lines-of-code-using-cascalog.html. Look for the paragraph "Running on a production cluster"; here is a copy of it:

1- Copy the sample data onto your cluster to "/tmp/follows" and "/tmp/action".

2- Next, run "lein uberjar" to create a jar containing the program with all its dependencies. Since the demo code specifies :gen-class and has a main method, we can run it just like any other hadoop program. To run the query on a cluster and output the results in text format to "/tmp/results", run:

hadoop jar cascalog-demo-standalone.jar cascalog_demo.demo /tmp/follows /tmp/action /tmp/results
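For orientation, a namespace that can be launched this way might look roughly like the skeleton below. This is a hypothetical sketch, not the code from the linked post: the query is a placeholder that only copies lines through, and it assumes the namespace is AOT-compiled when the uberjar is built so that the class cascalog_demo.demo exists.

```clojure
;; src/cascalog_demo/demo.clj -- hypothetical skeleton, not the original demo
(ns cascalog-demo.demo
  (:use cascalog.api)
  (:gen-class))   ; generates the class cascalog_demo.demo when AOT-compiled

(defn -main
  "Entry point invoked by `hadoop jar`; the three arguments are paths
  on the cluster's filesystem."
  [follows-dir action-dir output-dir]
  ;; Placeholder query: copy every line of the follows data to the output.
  ;; The real demo also reads action-dir and joins the two; that logic is
  ;; omitted here, so action-dir is unused in this sketch.
  (let [follows (hfs-textline follows-dir)]
    (?<- (hfs-textline output-dir)
         [?line]
         (follows ?line))))
```

The three command-line arguments in the hadoop jar invocation map positionally onto follows-dir, action-dir and output-dir.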

tangrammer
  • Thanks tangrammer, I am following the above method, but I have a problem creating the jar with lein uberjar. It fails with this error: Exception in thread "main" java.lang.ClassNotFoundException: org.apache.hadoop.fs.FileSystem, Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FileSystem. What is the fix? I don't understand. – Sindhu May 01 '14 at 11:45