Distribution for Platfora and Datameer

Question

I am interested in installing the Platfora and Datameer analytics tools. The documentation for both lists compatibility with existing Hadoop distributions, including CDH, HDP, and MapR. However, I want to install them on an existing plain Hadoop cluster, i.e. one I built by downloading and installing the Apache Hadoop components one by one.

Will these tools work in this case?

score 5 · Answer 1 · answered Jun 10 '15 at 15:55

You can install Platfora on plain Apache Hadoop by selecting Hortonworks' HDP distribution - the core of the HDP distribution is plain Apache Hadoop. (I work at Platfora. We support many different Hadoop distros, but a lot of our development is actually done using plain Apache Hadoop.)

Platfora uses your Hadoop cluster not only as a source of input data but also for processing: it generates native MapReduce and Apache Spark jobs to process raw, high-volume, structured or semi-structured input data (JSON, XML, log files, CSV, Avro, data from Hive, output of other processing pipelines and libraries, you name it). This scales well, but having higher-latency frameworks like MapReduce or Spark in your workflow for every change in your analysis questions gives you long turnaround times, which is bad for productivity. That's why Platfora accesses these intermediate results with a distributed, scale-out in-memory query engine that backs a low-latency visual discovery front-end. This kind of end-to-end approach makes it really easy to visualize and understand patterns across PBs of data with an interactive (sub-second) visual experience, similar to Tableau but native to Hadoop and to the scale and complexity of modern multi-structured data.

score 0 · Accepted Answer · answered May 14 '15 at 00:21

Yes, it works. As long as you use the "latest stable" YARN, HDFS, and MapReduce versions, Datameer will work without any issue. Anything beyond that does not matter, since Datameer does not use Hive, Oozie, or any other component; it brings things like Tez and Spark pre-packaged within the application and runs them transparently for you on YARN. As of today we support 50 different versions of Hadoop.
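
On a hand-built cluster, one way to confirm which release is actually running before comparing it with a vendor's support list is to read the first line of the `hadoop version` output. Below is a minimal Python sketch of that check. The helper names `parse_hadoop_version` and `installed_hadoop_version` are made up for illustration, and the sketch assumes the `hadoop` command is on the PATH and prints a first line of the form `Hadoop X.Y.Z`.

```python
import re
import subprocess
from typing import Tuple


def parse_hadoop_version(output: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from `hadoop version` output.

    Assumes the output contains a line starting with 'Hadoop X.Y.Z'.
    """
    match = re.search(r"^Hadoop (\d+)\.(\d+)\.(\d+)", output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Hadoop X.Y.Z' line found in output")
    return tuple(int(part) for part in match.groups())


def installed_hadoop_version() -> Tuple[int, int, int]:
    """Run `hadoop version` (assumed to be on the PATH) and parse the result."""
    result = subprocess.run(
        ["hadoop", "version"], capture_output=True, text=True, check=True
    )
    return parse_hadoop_version(result.stdout)
```

Calling `installed_hadoop_version()` on a cluster node returns a tuple that can be compared against whatever minimum release the tool's documentation states; the sketch does not encode any vendor's actual requirements.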

I obviously can't speak for Platfora, but they don't really run natively on Hadoop anyhow; they just pull data out of Hadoop into their in-memory columnar database, which runs on an extra cluster:
+ extra expensive hardware (memory-intensive)
+ structured data only, since it is SQL (remember Hadoop was built as NoSQL)
+ small data only (since it is in memory)
+ no advanced analytics like graph analytics, since it is SQL-based

HTH Stefan (I work at Datameer)

Hi, thanks for your response. I will try to install Datameer on one of my machines with native Hadoop. Just a quick question: I want to run it for evaluation purposes, can I install it without a license? — user234202, May 18 '15 at 06:15
Anyway, I got the link to install the RPM package for the trial version of Datameer. — user234202, May 18 '15 at 08:08
