Questions tagged [mapr]

MapR is a commercial data platform that offers an HDFS-compatible distributed file system, a database that stores data as BigTable-style tables or JSON documents, and a streaming platform for messaging. MapR exposes the APIs of open source tools such as Hadoop, Kafka, and HBase, and backs them with a proprietary implementation written in C and optimised for performance.

MapR is a complete enterprise-grade distribution for Apache Hadoop. The MapR Converged Data Platform has been engineered to improve Hadoop’s reliability, performance, and ease of use.

The MapR distribution provides a full Hadoop stack that includes the MapR File System (MapR-FS), the MapR-DB NoSQL database management system, MapR Streams, the MapR Control System (MCS) user interface, and a full family of Hadoop ecosystem projects. You can use MapR with Apache Hadoop, HDFS, and MapReduce APIs.

MapR supports the Hadoop 2.x architecture and YARN (Yet Another Resource Negotiator). Hadoop 2.x and YARN make up a resource management and scheduling framework that distributes resource management and job management duties.


There are three MapR editions.

  • MapR Community Edition (formerly M3)
    • Free community edition.
  • MapR Enterprise Edition (formerly M5)
    • Adds high availability and data protection, including multi-node NFS.
  • MapR Enterprise Database Edition (formerly M7)
    • Adds structured table data natively in the storage layer and provides a flexible NoSQL database.

MapR can be installed on many versions of Red Hat Enterprise Linux, CentOS, Ubuntu, Oracle Linux, and SUSE. A full matrix of supported Linux operating systems can be found here.

Installing MapR requires the following:

  • A 64-bit CPU.
  • One of the operating systems listed above (Red Hat Enterprise Linux, CentOS, Ubuntu, Oracle Linux, or SUSE).
  • A minimum of 8 GB of RAM.
  • At least one unformatted disk.
  • A resolvable hostname.
  • A common user on each server you wish to install MapR on.
  • Java 1.7.0 or higher.
  • Other
    • NTP, Syslog, PAM
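
A few of the requirements above can be checked from a shell before starting. The following is a minimal sketch, not an official MapR tool: the function names (`java_major`, `mem_ok`, `check_prereqs`) are hypothetical, and the 7 GiB memory threshold is an assumption made because a machine sold with 8 GB usually reports slightly less in `/proc/meminfo`.

```bash
#!/usr/bin/env bash
# Hypothetical pre-install checks; names and thresholds are illustrative only.

# Assumed threshold: 7 GiB in kB, leaving slack for memory the kernel reserves.
MIN_MEM_KB=$((7 * 1024 * 1024))

# Print the major Java version for strings such as "1.7.0_80" or "11.0.2".
java_major() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;          # legacy scheme: 1.7.0_80 -> 7.0_80
  esac
  printf '%s\n' "${v%%[._-]*}"   # keep everything before the first separator
}

# Succeed when the given MemTotal value (in kB) meets the threshold.
mem_ok() {
  [ "$1" -ge "$MIN_MEM_KB" ]
}

# Report on CPU word size, RAM, and Java for the current machine.
check_prereqs() {
  local mem_kb java_ver major
  if [ "$(getconf LONG_BIT)" = "64" ]; then
    echo "CPU: 64-bit"
  else
    echo "CPU: not 64-bit"
  fi
  mem_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
  if mem_ok "${mem_kb:-0}"; then
    echo "RAM: ok (${mem_kb} kB)"
  else
    echo "RAM: below the assumed threshold (${mem_kb:-unknown} kB)"
  fi
  if command -v java >/dev/null 2>&1; then
    java_ver="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
    major="$(java_major "$java_ver")"
    if [ "${major:-0}" -ge 7 ]; then
      echo "Java: ok ($java_ver)"
    else
      echo "Java: too old or unrecognised ($java_ver)"
    fi
  else
    echo "Java: not found"
  fi
}
```

Source the file and run `check_prereqs` on each node. The disk, hostname, and common-user requirements are not covered here and still need to be checked by hand.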



Try MapR

Download the MapR Sandbox for VMware or VirtualBox for free.

OR

Install MapR on your own. Check whether the installer is supported for your OS.

You must meet the prerequisites for a successful installation.

Get the mapr-setup script from the MapR repository.

wget http://package.mapr.com/releases/installer/mapr-setup.sh

Run the mapr-setup script to start the installation.

bash ./mapr-setup.sh -y

Open the web UI at the following URL:

https://<Installer node hostname/IPaddress>:9443

Follow the prompts and you will be on your way to installing MapR.

A manual installation is also available. Full instructions can be viewed here.

Extensive documentation can be found on MapR's documentation site: http://maprdocs.mapr.com/home/
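
If the page does not load in a browser, it can help to probe the installer port from a shell first. This is a rough sketch under stated assumptions: `installer_url` and `installer_reachable` are hypothetical helper names, and `-k` is passed to `curl` on the assumption that the installer serves a self-signed certificate.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for probing the installer web UI on port 9443.

# Build the installer URL for a given hostname or IP address.
installer_url() {
  printf 'https://%s:9443\n' "$1"
}

# Succeed if an HTTPS request to the installer completes within 5 seconds.
# -k skips certificate verification (assumed self-signed certificate).
installer_reachable() {
  curl -k -s -o /dev/null --max-time 5 "$(installer_url "$1")"
}

# Run the probe only when a host is given on the command line.
if [ "$#" -ge 1 ]; then
  if installer_reachable "$1"; then
    echo "Installer UI answered at $(installer_url "$1")"
  else
    echo "No answer from $(installer_url "$1")"
  fi
fi
```

Pass the installer node's hostname or IP address as the only argument. A failed probe usually points to the setup script not having finished, or to a firewall blocking port 9443.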



The Stack Overflow tag [mapr] can be used for questions about issues you have with the MapR platform.

381 questions
1
vote
1 answer

Batch Size Problem with MapR Streams Kafka API

Hello, I am using Kafka MapRStream to receive events from a MapR Streams topic. I am trying to increase the batch size of my consumer, but I am not getting more than 30 messages in one batch! A single event is about 5000 bytes in size. If the event is…
Bortallo
  • 11
  • 3
1
vote
0 answers

Kafka Consumer polls fewer messages when the messages are larger (MapR Streams)

We are facing an issue while running a Kafka consumer (Java). The poll returns fewer messages when the messages are larger. We have tried increasing different config parameters while constructing the consumer such as fetch.min.bytes,…
Vihit Shah
  • 314
  • 1
  • 5
1
vote
1 answer

Unable to build Flink from sources due to MapR artifacts problems

Summary: mapr dependency could not be found and thus the Flink build on master branch fails. Failed to execute goal on project flink-mapr-fs: Could not resolve dependencies for project org.apache.flink:flink-mapr-fs:jar:1.10-SNAPSHOT: Failed to…
WestCoastProjects
  • 58,982
  • 91
  • 316
  • 560
1
vote
0 answers

Where to use the HDFS data in Web UI - MapR

I am able to connect to the MapR Control System (MCS) web UI on port 8080, but where can I see the data file I copied from my local file system? On the MCS navigation pane, I can see the below tabs but I don't see where to navigate to data…
Rajesh
  • 65
  • 6
1
vote
1 answer

Query node-label topology from Yarn via REST API [MapR 6.1/Hadoop-2.7]

There is a Java and CLI-interface to query Yarn RM for node-to-nodelabel (and inverse) mappings. Is there a way to do this via the REST-API as well? An initial RM-API search revealed only node-label based job submissions as an option. Sadly that is…
Rick Moritz
  • 1,449
  • 12
  • 25
1
vote
0 answers

Apache Impala reports "missing disk id" inside query profile

When I launch any query on impala, I get the following message in the profile WARNING: The following tables have scan ranges with missing disk id information. I executed "compute stats" statement on tables but the warning is still present. Since I'm…
sioale
  • 77
  • 1
  • 2
  • 10
1
vote
1 answer

Hive problems of connection to port 10000

Currently we are running a three-node MapR cluster where Hive is installed, and we use it very frequently for analytics and reporting, but due to too many connections or some other reason, Hue (UI panel) shows the error "Could not connect to cm:10000" and…
Devbrat Shukla
  • 504
  • 4
  • 11
1
vote
0 answers

Reclaiming tables corrupted when hdfs volume was at 100%

I am using Hadoop version 2.7.0-mapr-1506. When the data volume was at 100%, our jobs still tried to insert overwrite data into a few Hive tables; those tables are now corrupted and give the below exception when accessed, at…
Albin
  • 371
  • 1
  • 4
  • 18
1
vote
1 answer

Pyspark - DataFrame persist() errors out java.lang.OutOfMemoryError: GC overhead limit exceeded

Pyspark job fails when I try to persist a DataFrame that was created on a table of size ~270GB with error Exception in thread "yarn-scheduler-ask-am-thread-pool-9" java.lang.OutOfMemoryError: GC overhead limit exceeded This issue happens only…
Sam
  • 17
  • 5
1
vote
1 answer

Cannot access MapR with

sudo maprlogin generateticket -type service -user -duration 14:0:0 -out / returns the following error message. "Operation failed. User has no established credentials on the cluster: " I tried various…
Stefan Papp
  • 2,199
  • 1
  • 28
  • 54
1
vote
1 answer

SparkContext: Error initializing SparkContext on MapR Sandbox

I tried running this sample project which uses MapR. I tried executing the class ml.Flight in the sandbox and from the below line, val spark: SparkSession = SparkSession.builder().appName("churn").getOrCreate() I got this error. [user01@maprdemo…
user54321
  • 622
  • 6
  • 18
1
vote
1 answer

Apache Drill - Unable to connect to zk client

I am using Drill 1.13. When I start the drill instance using sqlline.bat -u "jdbc:drill:zk=local", I am able to get to the console and query the DB. However when I try accessing the drill DB via the application: using the jdbc driver…
Joash
  • 13
  • 5
1
vote
0 answers

What is the fastest way to move data from one volume to another with MapR?

I want to move data from one volume to another. The folders and file sizes vary. Files can be up to 100 GB, but we can have also a lot of small files. If there is data in the destination volume at that particular folder, it can be overwritten. So…
Stefan Papp
  • 2,199
  • 1
  • 28
  • 54
1
vote
1 answer

How to kill a running query in apache Impala 2.10 from a central point

Sometimes I have queries that are supposed to take only a few seconds but keep running and running, blocking other queries, or queries with MT_DOP set to too large a value, which bring Impala to its knees. While it is possible to kill a query…
Baptiste Mille-Mathias
  • 2,144
  • 4
  • 31
  • 37
1
vote
0 answers

anatomy of a file read and write in MapR-FS

I am trying to understand the anatomy of a file read and write in MapR-FS. I googled a lot but did not get a clear understanding of the steps of a file read and write in MapR-FS. I also found this question MAPR -File Read and Write Process…
learner
  • 365
  • 1
  • 3
  • 16