3

I am trying to set up Apache Drill in distributed mode. I already have a Cloudera Hadoop cluster with one master and two slaves. The Apache Drill documentation does not make it clear whether Drill can be set up on a typical Cloudera cluster, and I could not find any relevant articles. Any help would be appreciated.

Dev
JAY G

3 Answers

2

Drill can be installed independently on the nodes of a Cloudera cluster and can then query files on HDFS. See this page for installation details: https://cwiki.apache.org/confluence/display/DRILL/Deploying+Apache+Drill+in+a+Clustered+Environment

Yash Sharma
2

I got this working with the Cloudera Hadoop distribution. I already had a Cloudera cluster installed with all services running.

Perform the following steps:

  1. Install Apache Drill on all nodes of the cluster.
  2. Run drill/bin/drillbit.sh start on each node.
  3. Configure the dfs storage plugin in the Apache Drill web interface at host:8047, and update the HDFS settings there.
  4. Run sqlline: ./sqlline -u jdbc:drill:zk=host1:2181,host2:2181,host3:2181 (2181 is the port ZooKeeper uses).
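
As a rough sketch of the configuration behind these steps, the shell helpers below print the pieces that usually have to line up: the ZooKeeper connect string, a matching drill-override.conf stanza, a dfs storage plugin body for the web interface, and the sqlline JDBC URL. The host names, the cluster-id value `drillbits1`, the NameNode address `namenode-host:8020`, and the function names are all placeholders assumed for illustration; they are not taken from the answer and should be replaced with values from your own cluster.

```shell
#!/bin/sh
# Sketch only: hosts, cluster-id and NameNode address are assumed placeholders.

ZK_PORT=2181   # ZooKeeper client port, as in step 4

# Join host names into "host1:2181,host2:2181,..." for ZooKeeper.
zk_connect() {
    out=""
    for h in "$@"; do
        out="${out:+$out,}$h:$ZK_PORT"
    done
    printf '%s\n' "$out"
}

# JDBC URL that sqlline uses to locate drillbits through ZooKeeper.
drill_jdbc_url() {
    printf 'jdbc:drill:zk=%s\n' "$(zk_connect "$@")"
}

# Stanza for conf/drill-override.conf; every drillbit in one cluster
# should share the same cluster-id and ZooKeeper quorum.
print_drill_override() {
    cat <<EOF
drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "$(zk_connect "$@")"
}
EOF
}

# Example dfs storage plugin body to paste into the web interface (step 3).
# "namenode-host:8020" is a placeholder for the HDFS NameNode address.
print_dfs_plugin() {
    cat <<'EOF'
{
  "type": "file",
  "enabled": true,
  "connection": "hdfs://namenode-host:8020/",
  "workspaces": {
    "root": { "location": "/", "writable": false, "defaultInputFormat": null }
  },
  "formats": {
    "json": { "type": "json" },
    "parquet": { "type": "parquet" },
    "csv": { "type": "text", "extensions": ["csv"], "delimiter": "," }
  }
}
EOF
}

# Print the pieces for a three-node example; nothing is started or modified.
print_drill_override host1 host2 host3
print_dfs_plugin
echo "./sqlline -u $(drill_jdbc_url host1 host2 host3)"
```

The script only prints text, so it can be run anywhere to check the values before copying them into drill-override.conf and the storage plugin page.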
JAY G
  • Can you please post an example DFS configuration? In my scenario the JSON is accepted by the web interface, but sqlline doesn't seem to see those changes. – Havnar May 04 '15 at 07:06
1

It may only work with a basic, insecure cluster, since Drill is not currently tested or documented to integrate with HDFS and Kerberos on secure Hadoop clusters. Vote for and watch this ticket for secure HDFS support in Drill:

https://issues.apache.org/jira/browse/DRILL-3584

Hari Sekhon