Questions tagged [mapr]

MapR is a commercial data platform that offers an HDFS-compatible distributed file system, a database that stores data as BigTable-style tables or JSON documents, and a streaming platform for messaging. MapR exposes the APIs of open source tools such as Hadoop, Kafka, and HBase, and backs them with a proprietary implementation written in C and optimised for performance.

MapR is a complete enterprise-grade distribution for Apache Hadoop. The MapR Converged Data Platform has been engineered to improve Hadoop’s reliability, performance, and ease of use.

The MapR distribution provides a full Hadoop stack that includes the MapR File System (MapR-FS), the MapR-DB NoSQL database management system, MapR Streams, the MapR Control System (MCS) user interface, and a full family of Hadoop ecosystem projects. You can use MapR with Apache Hadoop, HDFS, and MapReduce APIs.

MapR supports the Hadoop 2.x architecture and YARN (Yet Another Resource Negotiator). Hadoop 2.x and YARN make up a resource management and scheduling framework that separates cluster-wide resource management from per-application job management.

There are three MapR editions.

  • MapR Community Edition (formerly M3)
    • Free community edition.
  • MapR Enterprise Edition (formerly M5)
    • Adds high availability and data protection, including multi-node NFS.
  • MapR Enterprise Database Edition (formerly M7)
    • Adds structured table data natively in the storage layer and provides a flexible NoSQL database.

MapR can be installed on many versions of Red Hat Enterprise Linux, CentOS, Ubuntu, Oracle Linux, and SUSE. A full matrix of supported Linux operating systems can be found here.

To install MapR, the following requirements must be met.

  • A 64-bit CPU.
  • One of the operating systems mentioned above (Red Hat Enterprise Linux, CentOS, Ubuntu, Oracle Linux, or SUSE).
  • A minimum of 8GB of RAM.
  • At least one unformatted disk.
  • A resolvable hostname.
  • A common user on each server you wish to install MapR on.
  • Java 1.7.0 or higher.
  • Other
    • NTP, Syslog, PAM
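
A few of the requirements above can be checked from a shell before running the installer. The following is a minimal sketch, not an official MapR tool: the helper names (ram_at_least_gb, java_at_least_1_7, report, preflight) are made up for illustration, it assumes a Linux host with /proc/meminfo, and it covers only the CPU, RAM, and Java items.

```shell
#!/usr/bin/env bash
# Illustrative pre-flight check; helper names are hypothetical, not part of MapR.

# Succeeds when total memory (in kB) is at least min_gb gigabytes.
# Note: MemTotal excludes memory reserved by the kernel, so a machine with
# exactly 8 GB installed can report slightly less; treat a near miss as a
# prompt to check manually.
ram_at_least_gb() {
  local total_kb="$1" min_gb="$2"
  [ "$total_kb" -ge $(( min_gb * 1024 * 1024 )) ]
}

# Succeeds when a version string such as "1.7.0_80" or "11.0.2" is 1.7 or newer.
java_at_least_1_7() {
  local version="$1"
  local major="${version%%.*}"
  local rest="${version#*.}"
  local minor="${rest%%.*}"
  if [ "$major" -gt 1 ]; then
    return 0
  fi
  [ "$major" -eq 1 ] && [ "$minor" -ge 7 ]
}

# Print one result line. $1 = label, $2 = exit status of the check.
report() {
  if [ "$2" -eq 0 ]; then echo "OK    $1"; else echo "CHECK $1"; fi
}

preflight() {
  local bits total_kb java_version

  bits=$(getconf LONG_BIT 2>/dev/null)
  [ "$bits" = "64" ]; report "64-bit CPU" $?

  total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)
  ram_at_least_gb "${total_kb:-0}" 8; report "8 GB RAM" $?

  if command -v java >/dev/null 2>&1; then
    java_version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    java_at_least_1_7 "${java_version:-0}"
    report "Java 1.7+ (found ${java_version:-unknown})" $?
  else
    report "Java 1.7+ (java not on PATH)" 1
  fi
}

preflight
```

Each line of output starts with OK or CHECK. A CHECK line means the item needs a manual look; it does not necessarily mean the installation will fail.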



Try MapR

Download the MapR Sandbox for VMware or VirtualBox for free.

OR

Install MapR on your own. Check whether the installer is supported for your OS.

You will have to meet the prerequisites for a successful installation.

Get the mapr-setup script from the MapR repository.

wget http://package.mapr.com/releases/installer/mapr-setup.sh

Run the mapr-setup script to start the installation.

bash ./mapr-setup.sh -y

Open the web UI with the following URL:

https://<Installer node hostname/IPaddress>:9443

Follow the prompts and you will be on your way to installing MapR.
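
If the page does not load right away after the setup script finishes, it can help to poll the port from a shell. This is a rough sketch under assumptions: installer_url and wait_for_installer are hypothetical helper names, curl must be installed, and -k is used on the assumption that the installer serves a certificate the client does not already trust.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the port number (9443) comes from the steps above.

# Print the installer URL for a given hostname or IP address.
installer_url() {
  printf 'https://%s:9443\n' "$1"
}

# Poll the installer URL until it answers or the attempts run out.
# $1 = host, $2 = number of attempts (default 30), spaced 5 seconds apart.
wait_for_installer() {
  local url attempts="${2:-30}" i=0
  url="$(installer_url "$1")"
  while [ "$i" -lt "$attempts" ]; do
    # -k skips certificate verification (assumption: untrusted or self-signed cert).
    if curl -k -s -o /dev/null --max-time 5 "$url"; then
      echo "Installer UI is answering at $url"
      return 0
    fi
    i=$(( i + 1 ))
    sleep 5
  done
  echo "No answer from $url after $attempts attempts" >&2
  return 1
}

# Example: wait_for_installer "$(hostname)" 12
```

The function only reports whether something answers on the port; it does not confirm that the installer itself is healthy.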

A manual installation is also available. Full instructions can be viewed here.

Extensive documentation can be found on MapR's documentation site: http://maprdocs.mapr.com/home/



The Stack Overflow tag [mapr] can be used for questions about issues you have with the MapR platform.

381 questions
3
votes
0 answers

Facing an issue in submitting a mapreduce job with mapr cluster

Below are my Mapr Cluster (Non secure) configurations. MapR version - 6.1 Os - Ubuntu 16.04 Hadoop version - 2.7.0 Nodes - Single node core-site.xml:
Raja D
  • 31
  • 2
3
votes
1 answer

maprdb find_by_condition in python throws exception - Class com.mapr.db.Condition$Op not found

I am using python binding for maprdb. While all other interfaces are working as expected, I am having difficulty using "find_by_condition" interface. Here is the sample I tried :- import maprdb condition = {"col1": "col_value"} enter code…
3
votes
1 answer

log4j.properties file not found on classpath or ignored

I want to log in maprDB a spark job with log4j. I have written a custom appender, and here my log4j.properties : log4j.rootLogger=INFO, stdout log4j.appender.stdout=org.apache.log4j.ConsoleAppender log4j.appender.stdout.Target=System.out …
Franck Cussac
  • 310
  • 1
  • 3
  • 14
3
votes
1 answer

MapR Stream and PySpark

Does PySpark work (compatible) for MapR Streams? Any example code? I've tried that but keep getting exception strLoc = '/Path1:Stream1' protocol = 'file://' if ( strLoc.startswith('/') or strLoc.startswith('\\') ) else '' from…
Robot-43
  • 31
  • 2
3
votes
2 answers

What are the differences between Kafka and MapR streams from coding perspective?

What are the differences between Kafka and MapR streams from coding perspective? I need to implement the MapR streams in future but currently I have only access to Kafka. So exploring the Kafka right now is useful? So that I can easily pick up on…
Raj UK
  • 39
  • 1
  • 7
3
votes
1 answer

Hadoop dfsadmin -report command is not working in mapr

I need to know the dfs report of the mapr cluster but when i am executing following command i am getting error hadoop dfsadmin -report DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for…
Vikas Hardia
  • 2,635
  • 5
  • 34
  • 53
3
votes
1 answer

How to prevent a Hadoop job to fail when directory is empty?

I have a job that fails when there is no files in the input directory. The exception i get is the following: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:Input Pattern maprfs:/profile/* I know this exception is coming from the…
danilo
  • 834
  • 9
  • 25
3
votes
3 answers

Difference Between typical Hadoop Architecture and MapR architecture

I know that Hadoop is based on Master/Slave architecture HDFS works with NameNodes and DataNodes and MapReduce works with jobtrackers and Tasktrackers But I can't find all these services on MapR, I find out that it has its own Architecture with…
3
votes
1 answer

Sorting in MapReduce Hadoop

I have few basic questions in Hadoop MapReduce. Assume if 100 mappers were executed and zero reducer. Will it generate 100 files? All individual are sorted? Across all mapper output are sorted? Input for reducer is Key -> Values. For each key, all…
Nageswaran
  • 7,481
  • 14
  • 55
  • 74
3
votes
1 answer

MAPR -File Read and Write Process

I am not able to find a specific link that explains to me how the meta data is distributed in MAPR(File meta data). When I look at cloudera / hortonworks /apache hadoop I know the meta data is stored in namenode's memory which is then fetched to…
Garfield
  • 396
  • 6
  • 19
3
votes
1 answer

How to creating a MapFile with Spark and access it?

I am trying to create a MapFile from a Spark RDD, but can't find enough information. Here are my steps so far: I started with, rdd.saveAsNewAPIHadoopFile(....MapFileOutputFormat.class) which threw an Exception as the MapFiles must be sorted. So I…
Ioannis Deligiannis
  • 2,679
  • 5
  • 25
  • 48
3
votes
2 answers

Talend tHBASEConnection and tHBaseInput for MapR

I have access to an edge node to a MapR Hadoop cluster. I have an HBase table named /app/SubscriptionBillingPlatform/Matthew with some fake data. A scan of it in the hbase shell results in this: I have a very simple Talend Job that should scan the…
Matthew Moisen
  • 16,701
  • 27
  • 128
  • 231
3
votes
3 answers

Using Hive with Pig through HCatalog issue with TimeStamp datatype

In my dev box, I have MapR 3.0.2, Hive 0.11, HCatLog 0.4.1 & Pig 0.12. Am using HCatlog to read and write Hive tables from Pig (Pig Latin), using standard queries, A = LOAD 'dbname.tablename' USING org.apache.hcatalog.pig.HCatLoader(); My Hive…
RVandakar
  • 81
  • 1
  • 5
  • 16
3
votes
4 answers

MapR Architecture Vs Cloudera Architecture

I'm familiar with the infrastructure or architecture of Cloudera: Master Nodes include NameNode, SecondaryNameNode, JobTracker, and HMaster. Slave Nodes include DataNode, TaskTracker, and HRegionServer. Master nodes should all be on their own nodes…
Matthew Moisen
  • 16,701
  • 27
  • 128
  • 231
3
votes
0 answers

EMR bootstrap action to run Hue on Mapr M3

Is there some bootstrap script to get hue running on EMR MapR, unlike setting up using this guide http://doc.mapr.com/display/MapR/Configuring+Hue
Praveen R
  • 115
  • 2
  • 6