I'm using Snowflake as my DWH and Spark for my ETL and I don't have Hive tables.
Is there an option to use Apache Kylin without the Hadoop ecosystem?

- AFAIK Kylin has a major dependency on HBase. See software requirements @ http://kylin.apache.org/docs/install/index.html – mazaneicha Nov 11 '19 at 15:32
- Great question. Their sales people say yes: https://kyligence.io/blog/snowflake-the-good-the-bad-and-the-beautiful-for-analytics/ – KCD Jan 29 '21 at 20:41
3 Answers
From what I have read, this is pretty complex. One alternative I would suggest, to take advantage of analytics on distributed systems, is to use materialized views within Snowflake to filter the data you want from each part of the distributed system. See Snowflake's documentation for more on materialized views.
You could also look at the preview feature, Data Exchange, for query analytics.
I hope that helps. Sorry I was not more helpful with Apache Kylin itself.

Kyligence Cloud, a cloud offering built on AWS and Azure and based on the Apache Kylin core, can connect to Snowflake directly without Hadoop. You can learn more here: https://kyligence.io/news/kyligence-releases-cloud-native-olap-for-azure-aws-and-google-cloud-platform/

- You are right, they have it in the latest version, but I'm not sure how reliable it is. Thank you – raul7 Nov 15 '19 at 14:21
The answer is no. Kylin cannot read directly from Spark DataFrames; the data sources it supports are Hive, Kafka, and RDBMS.

- Dataframes aren't stored anywhere, anyway; they are only a runtime format – OneCricketeer Nov 14 '19 at 15:12
- I didn't downvote, and there's no way to see who has. The fact that you've said "read directly" doesn't imply you've stored the dataframe anywhere. Spark can write to HBase, not only to the places you listed – OneCricketeer Nov 15 '19 at 14:49
- I apologise for blaming you then. As you've mentioned, Spark can write to HBase, but this isn't what I need. I don't want the data just in HBase, but rather as an input for Kylin, which itself creates its OLAP cube and stores it in its own storage (currently only HBase) – raul7 Nov 15 '19 at 15:13