I need to deploy a Presto server to query data stored in ADLS in Avro file format. I have gone through this tutorial, and it seems that Hive is used as the catalog/connector in Presto to query ADLS. Can I bypass Hive and use some other connector to read data from ADLS?
1 Answer
Can I bypass Hive and have any connector to extract data from ADLS?
No.
Hive plays two roles here:
- Storage for metadata. It contains information such as:
  - schema and table name
  - columns
  - data format
  - data location
- Execution:
  - it can read data from distributed file systems (such as HDFS, S3, and ADLS)
  - it tells how execution can be distributed.

kokosing
- Thanks for this crucial information. If I have some data in ADLS that did not come through Hive (meaning there won't be a metastore entry for it), how can I query that data using Presto? – Bhanuday Birla Feb 28 '19 at 12:39
- You need to create an external table whose location points to that data. – kokosing Feb 28 '19 at 12:41
- So if I know the location of the data, I can use select * from hive.default.LOCATION without any schema, right? – Bhanuday Birla Feb 28 '19 at 12:45
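
To illustrate the external table suggested in the comments, here is a minimal sketch in Presto SQL. It assumes a Hive catalog named `hive` that is already configured with ADLS credentials. The table name `events`, the two columns, and the `adl://` path are hypothetical placeholders, so adjust them to match the actual Avro data. The location cannot be used directly as a table name: the table has to be registered in the metastore first, and the query then refers to the table.

```sql
-- Hypothetical example: register existing Avro files in ADLS as an external table.
-- Nothing is copied; the metastore only records the schema, format, and location.
CREATE TABLE hive.default.events (
    id   BIGINT,   -- assumed column, must match a field in the Avro files
    name VARCHAR   -- assumed column, must match a field in the Avro files
)
WITH (
    format = 'AVRO',
    -- placeholder path; replace with the real ADLS directory holding the files
    external_location = 'adl://<account>.azuredatalakestore.net/<path-to-data>/'
);

-- Once the table exists in the metastore, query it by name, not by location.
SELECT * FROM hive.default.events LIMIT 10;
```

The four properties listed in the answer map directly onto this statement: schema and table name (`default.events`), columns, data format (`format`), and data location (`external_location`).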