Questions tagged [apache-drill]

Apache Drill is a low-latency distributed query engine for large-scale datasets, including structured and semi-structured/nested data. It is capable of querying nested data in formats like JSON and Parquet and performing dynamic schema discovery.

Drill is an Apache open-source SQL query engine for Big Data exploration. Drill is designed from the ground up to support high-performance analysis on the semi-structured and rapidly evolving data coming from modern Big Data applications, while still providing the familiarity and ecosystem of ANSI SQL, the industry-standard query language. Drill provides plug-and-play integration with existing Apache Hive and Apache HBase deployments.

Drill supports a variety of NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. A single query can join data from multiple datastores.
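
As an illustration of a query that joins data from more than one datastore, here is a minimal sketch that submits such a join through Drill's REST API. It rests on several assumptions: a Drillbit is reachable on the default web port 8047 on localhost, storage plugins named `hive` and `dfs` are enabled, and the table `orders`, the file `/data/customers.json`, and their columns are hypothetical names chosen only for this example.

```python
import json
import urllib.request

# Assumption: a local Drillbit exposing the REST API on the default web port.
DRILL_URL = "http://localhost:8047/query.json"

# Hypothetical join between a Hive table and a JSON file on the file system.
# Plugin names (hive, dfs), the table, the path, and the columns are
# placeholders, not taken from any real deployment.
CROSS_STORE_JOIN = """
SELECT o.order_id, c.name
FROM hive.`orders` AS o
JOIN dfs.`/data/customers.json` AS c
  ON o.customer_id = c.id
LIMIT 10
""".strip()


def build_payload(sql):
    """Encode a SQL string as the JSON body expected by POST /query.json."""
    return json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")


def run_query(sql, url=DRILL_URL):
    """Send the query to Drill and return the list of result rows."""
    request = urllib.request.Request(
        url,
        data=build_payload(sql),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("rows", [])
```

With a running Drillbit and matching plugins, `run_query(CROSS_STORE_JOIN)` would return the joined rows as a list of dictionaries. Nothing is sent over the network until `run_query` is called.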

644 questions
3 votes, 1 answer

Apache Drill 1.6.0 failure in starting embedded Drillbit (Windows)

I am unable to start an embedded Drillbit on a Windows machine and am getting the following error. I have checked the jars in the 3rd party folder, where Jackson-databind-2.7.1.jar is present, but it still reports a class not found exception. Can you help me…
3 votes, 2 answers

Does Apache Drill support multiple queries at a time?

I want to run multiple SQL SELECT queries at one time. I am using Drill in embedded mode. select * from ..; select * from ..; Example:- select * from…
Naveen D
3 votes, 2 answers

Apache Drill-embedded can't connect due to VPN

I'm trying to use Apache Drill in embedded mode (drill-embedded) however when it starts it shows an error: Error: Failure in connecting to Drill: org.apache.drill.exec.pc.RpcException: CONNECTION : io.netty.channel.ConnectTimeoutException:…
Rob L
3 votes, 2 answers

Apache Drill > sqlline: how to run a SQL script containing variables

I am new to Apache Drill, and I need to run a SQL script through sqlline. Most SQL clients allow variables in scripts, so I would like to ask: is it possible to use variables in Apache Drill's sqlline?
Rui
3 votes, 2 answers

Newbie in Apache Drill: Cannot see Web Console

I downloaded apache-drill-1.2.0 on a 64-bit Ubuntu 14.04 box. I extracted the tar.zip contents, went to the bin folder, and ran drill. Then I tried to open http://localhost:8047, but I'm getting a "can't establish a connection to server" error. I tried to…
Himanshu Gautam
3 votes, 1 answer

How to query on an array?

I have an object like this in drill: {MyFruit: [{name:Mike, age:10},{name:Jacob,age:9},{name:William, age:6}]} I can get "Mike" by doing: Select MyFruit[0].name Is there a way for me to get the list of every single "name"? I tried the following…
Rolando
3 votes, 2 answers

Unable to query an RDBMS using Apache Drill

With Apache Drill 1.2, we can query RDBMS data. See https://drill.apache.org/blog/2015/10/16/drill-1.2-released/ for details. So I tried to add a plugin for MySQL, using the web client. I created a plugin with name mysql and…
Dev
3 votes, 0 answers

How to query a whole S3 directory in Apache Drill?

I'm trying to query a whole directory in S3 containing parquet files. The query hangs for a while, then returns an error: 0: jdbc:drill:zk=local> select * from s3.`/data/dt=2015-10-15` limit 10; Error: CONNECTION ERROR: Connection…
Michael Spector
3 votes, 2 answers

Apache Drill using Google Cloud Storage

The Apache Drill features list mentions that it can query data from Google Cloud Storage, but I can't find any information on how to do that. I've got it working fine with S3, but I suspect I'm missing something very simple in terms of Google Cloud…
MJM
3 votes, 0 answers

How to generate a single CSV file from an Apache Drill query

I was trying to generate a single CSV table from a query in Drill, but the resulting data folder has multiple CSV files of almost equal size. How should I write the query so that it generates only one CSV file?…
xyin
3 votes, 1 answer

How to add more storage plugins programmatically in Apache Drill?

I tried drill JDBC driver to query programmatically. Useful portion of code: Connection conn = new Driver().connect("jdbc:drill:zk=local", getDefaultProperties()); Statement stmt = conn.createStatement(); ResultSet rs = stmt.executeQuery("show…
Dev
3 votes, 1 answer

Adding Multiple Storage Plugins in Apache Drill for S3

I want to join tables from two different S3 sources using Apache Drill. Is there a way to add AWS access keys and secret keys in the conf/core-site.xml file? Or is there another way to do this?
asymptote
3 votes, 0 answers

Why does Apache Drill return a null pointer exception when querying a Parquet file with an optional null column?

I am writing to a Parquet file using protobuf (or Avro). My proto file looks like this: message Log { optional string date = 1; optional string url = 2; } It is a reduced version of my problem. Now when writing to a parquet file…
3 votes, 2 answers

Fetching nested JSON data in HBase using Apache Drill

I am using Apache Drill to run SQL queries on a HBase table. The value in one of the columns is: 0: jdbc:drill:schema:hbase:zk=localhost> select cast(address['street'] as varchar(20)) from hbase.students; +------------+ | EXPR$0 …
isubuz
3 votes, 3 answers

Can Apache Drill work with Cloudera Hadoop?

I am trying to set up Apache Drill in distributed mode. I already have a Cloudera Hadoop cluster with a master and 2 slaves. From the Apache Drill documentation, it's not very clear whether it can be set up with a typical Cloudera cluster. I could not…
JAY G