I am trying to set up Apache Drill in distributed mode. I already have a Cloudera Hadoop cluster with a master and 2 slaves. From the Apache Drill documentation, it's not clear whether Drill can be set up on a typical Cloudera cluster, and I could not find any relevant articles. Any help will be appreciated.
3 Answers
2
Drill can be installed independently on the nodes of a Cloudera cluster, alongside the existing services, and it will be able to query the files on HDFS. Refer to this link for installation details: https://cwiki.apache.org/confluence/display/DRILL/Deploying+Apache+Drill+in+a+Clustered+Environment

Yash Sharma
- That link is not public at present :\ – captainpete Oct 22 '15 at 23:08
- This worked for me: https://drill.apache.org/docs/installing-drill-on-the-cluster/ – captainpete Oct 23 '15 at 01:38
2
I got this working with the Cloudera Hadoop distribution. I already had a Cloudera cluster installed with all services running.
Perform the following steps:
- Install Apache Drill on all nodes of the cluster.
- Run drill/bin/drillbit.sh start on each node.
- Configure the dfs storage plugin using the Apache Drill web interface at host:8047. Update the HDFS configuration there.
- Run sqlline: ./sqlline -u jdbc:drill:zk=host1:2181,host2:2181,host3:2181 (2181 is the port number used by ZooKeeper.)
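As a rough sketch of steps 2 to 4, the script below only prints the pieces involved and does not touch a cluster. It assumes every Drillbit shares the same cluster-id and zk.connect values in conf/drill-override.conf, which is what the Drill cluster installation docs describe. The hostnames host1 to host3 come from the sqlline example above; the cluster-id drillbits1, the NameNode name namenode.example.com, the NameNode port 8020, and the helper function names are placeholders, not values from this setup.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints config text and commands, runs nothing on the cluster.
# All hostnames, the cluster-id, and the NameNode port are assumed placeholders.

ZK_PORT=2181   # ZooKeeper client port, as in the sqlline example

# Join hosts into "host1:2181,host2:2181,..."
zk_connect() {
  local out="" h
  for h in "$@"; do
    out="${out:+$out,}${h}:${ZK_PORT}"
  done
  printf '%s\n' "$out"
}

# JDBC URL to pass to sqlline -u
jdbc_url() {
  printf 'jdbc:drill:zk=%s\n' "$(zk_connect "$@")"
}

# Contents for conf/drill-override.conf; identical on every node
drill_override_conf() {
  local cluster_id="$1"; shift
  cat <<EOF
drill.exec: {
  cluster-id: "${cluster_id}",
  zk.connect: "$(zk_connect "$@")"
}
EOF
}

# Minimal dfs storage plugin JSON to paste into the web UI at host:8047
dfs_plugin_json() {
  local namenode="$1" port="${2:-8020}"
  cat <<EOF
{
  "type": "file",
  "enabled": true,
  "connection": "hdfs://${namenode}:${port}/",
  "workspaces": {
    "root": { "location": "/", "writable": false, "defaultInputFormat": null }
  },
  "formats": {
    "json": { "type": "json" },
    "parquet": { "type": "parquet" }
  }
}
EOF
}

drill_override_conf drillbits1 host1 host2 host3
dfs_plugin_json namenode.example.com
echo "bin/drillbit.sh start"                          # run on each node
echo "bin/sqlline -u $(jdbc_url host1 host2 host3)"   # run from any node
```

The connection value is the part that usually needs changing for HDFS; the NameNode host and port should match fs.defaultFS in the cluster's core-site.xml.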

JAY G
- Can you please post an example DFS configuration? In my scenario the JSON is accepted by the web interface, but sqlline doesn't seem to see those changes. – Havnar May 04 '15 at 07:06
1
This may only work with a basic, unsecured cluster, as Drill currently isn't tested or documented to integrate with HDFS + Kerberos on secure Hadoop clusters. Vote on and check back on this ticket for secure HDFS support in Drill:

Hari Sekhon