Questions tagged [mapr]

MapR is a commercial data platform that offers an HDFS-compatible distributed file system, a NoSQL database that stores data in wide-column (HBase-style) tables or JSON documents, and a streaming platform for publish/subscribe messaging. MapR exposes the APIs of open-source tools such as Hadoop, Kafka, and HBase, but provides a proprietary implementation written in C that is optimised for performance.

MapR is a complete enterprise-grade distribution for Apache Hadoop. The MapR Converged Data Platform has been engineered to improve Hadoop’s reliability, performance, and ease of use.

The MapR distribution provides a full Hadoop stack that includes the MapR File System (MapR-FS), the MapR-DB NoSQL database management system, MapR Streams, the MapR Control System (MCS) user interface, and a full family of Hadoop ecosystem projects. You can use MapR with Apache Hadoop, HDFS, and MapReduce APIs.
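Because MapR-FS is API-compatible with HDFS, the standard `hadoop fs` shell commands work against it unchanged. A minimal sketch, assuming a hypothetical cluster name and paths (the `to_maprfs_uri` helper is illustrative, not part of MapR):

```shell
#!/usr/bin/env bash
# MapR-FS is API-compatible with HDFS, so the usual HDFS shell commands
# work unchanged. The cluster name and paths below are hypothetical.

# Build a cluster-qualified maprfs:// URI from a cluster name and a path.
to_maprfs_uri() {
    local cluster="$1" path="$2"
    printf 'maprfs://%s%s\n' "$cluster" "$path"
}

to_maprfs_uri my.cluster.com /user/alice/data

# On a node with the MapR client configured, the standard commands apply:
#   hadoop fs -mkdir -p /user/alice/data
#   hadoop fs -put local.csv /user/alice/data/
#   hadoop fs -ls "$(to_maprfs_uri my.cluster.com /user/alice/data)"
```

Plain paths (without a scheme) resolve against the cluster's default file system, which on a MapR node is MapR-FS rather than HDFS.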

MapR supports the Hadoop 2.x architecture and YARN (Yet Another Resource Negotiator). Hadoop 2.x and YARN make up a resource management and scheduling framework that distributes resource management and job management duties.


There are three MapR editions.

  • MapR Community Edition (formerly M3)
    • Free community edition.
  • MapR Enterprise Edition (formerly M5)
    • Adds high availability and data protection, including multi-node NFS.
  • MapR Enterprise Database Edition (formerly M7)
    • Adds structured table data natively in the storage layer and provides a flexible NoSQL database.

MapR can be installed on many versions of Red Hat Enterprise Linux, CentOS, Ubuntu, Oracle Linux, and SUSE. A full matrix of supported Linux operating systems can be found in the MapR documentation.

To install MapR, the following requirements must be met.

  • A 64-bit CPU.
  • One of the supported operating systems listed above (Red Hat Enterprise Linux, CentOS, Ubuntu, Oracle Linux, or SUSE).
  • A minimum of 8 GB of RAM.
  • At least one unformatted disk.
  • A resolvable hostname.
  • A common user on each server you wish to install MapR on.
  • Java 1.7.0 or higher.
  • Other: NTP, Syslog, PAM.
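The checks above can be sketched as a small pre-install script. This is a minimal illustration, not an official MapR tool; the thresholds mirror the stated requirements, and the helper function names are my own:

```shell
#!/usr/bin/env bash
# Hypothetical pre-install sanity checks mirroring the requirements above.

MIN_RAM_KB=$((8 * 1024 * 1024))   # 8 GB expressed in kilobytes

ram_ok() {    # ram_ok <total-kb>: succeeds if the node has at least 8 GB
    [ "$1" -ge "$MIN_RAM_KB" ]
}

arch_ok() {   # arch_ok <uname -m output>: succeeds on a 64-bit CPU
    case "$1" in x86_64|aarch64) return 0 ;; *) return 1 ;; esac
}

# Live checks, run on the node you plan to install on:
arch_ok "$(uname -m)" && echo "CPU: 64-bit OK"
ram_ok "$(awk '/MemTotal/ {print $2}' /proc/meminfo)" && echo "RAM: >= 8 GB OK"
hostname -f >/dev/null 2>&1 && echo "Hostname: resolvable OK"
java -version 2>&1 | head -1   # confirm Java 1.7.0 or higher manually
```

Unformatted disks and the common user are easier to verify by eye (`lsblk -f` and `id <user>` on each node), so they are left out of the sketch.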



Try MapR

Download the MapR Sandbox for VMware or VirtualBox for free.

OR

Install MapR on your own. Check that the installer supports your OS, and make sure you meet the prerequisites for a successful installation.

Get the mapr-setup script from the MapR repository.

wget http://package.mapr.com/releases/installer/mapr-setup.sh

Run the mapr-setup script to start the installation.

bash ./mapr-setup.sh -y

Open the web UI at the following URL.

https://<installer-node-hostname-or-IP>:9443

Follow the prompts and you will be on your way to installing MapR.
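Once the installer finishes, a quick sanity check is to confirm that the cluster nodes are registered. A hedged sketch using `maprcli node list` (the `count_nodes` helper and the sample column layout are illustrative assumptions, not MapR-defined):

```shell
#!/usr/bin/env bash
# Hypothetical post-install check: count registered nodes in the output
# of `maprcli node list`, skipping the header row.

count_nodes() {   # count_nodes <node-list output>: prints the data-row count
    printf '%s\n' "$1" | awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# On a live cluster (requires the MapR packages installed and running):
#   maprcli node list -columns hostname,health
#   count_nodes "$(maprcli node list -columns hostname,health)"
```

If the count matches the number of nodes you installed on, the cluster came up; otherwise the MCS web UI is the place to investigate per-node service state.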

There is also manual installation available. Full instructions can be viewed here.

Extensive documentation can be found on MapR's documentation site. http://maprdocs.mapr.com/home/



The Stack Overflow tag [mapr] can be used for questions about issues you have with the MapR platform.

381 questions

  • Dynamic output path for partitioned parquet files in Spark (1 vote, 0 answers, by ChernikovP)
    We're using MapR FS with rolling volumes and there's a necessity to align partitioned output parquet files with corresponding volumes. df .write .partitionBy("year", "month", "day", "hour") …
  • Extract TDE file from Tableau server fails under MapR (1 vote, 0 answers, by mbauhardt)
    I want to extract a TDE file via Java new Extract(fileName) but I get the following error message: Caused by: com.tableausoftware.TableauException: server did not call us back at com.tableausoftware.extract.Extract.(Unknown Source) I read…
  • Create temporary SparkSession with enableHiveSupport (1 vote, 2 answers, by Ryan)
    I am working on connecting to data in Hadoop that allows dynamic data type connections. I need to be able to connect to Hive Thrift Server A, pull in some data, and then connect to Hive Thrift Server B and pull in more data. To my understanding…
  • Java + Spark - temp folder not getting cleaned (1 vote, 1 answer, by Anuj Mehra)
    We are using Spark + Java in our project, and the Hadoop distribution being used is MapR. In our Spark jobs we persist data (at disk level). After the job completes, there is lot of temp data inside the /tmp/ folder. How can we ensure that /tmp/…
  • Spark dataframe insertinto hive table fails since some of the staging part files created with username mapr (1 vote, 1 answer, by Shasankar)
    I am using Spark dataframe to insert into a hive table. Even though the application is being submitted using the username 'myuser', some of the hive staging part files gets created with username 'mapr'. So the final write into the hive table fails…
  • multiple column in "IN" clause with Hive (1 vote, 2 answers, by Rup)
    does hive support query with multiple column in "IN" clause like below ? select * from address where (se10,ctry_nm) IN (44444444,"USA"); I am getting below error with this query - at…
  • Spark Application Not reading log4j.properties present in Jar (1 vote, 1 answer, by AJm)
    I am using MapR5.2 - Spark version 2.1.0 And i am running my spark app jar in Yarn CLuster mode. I have tried all the available options that i found But unable to succeed. This is our Production environment. But i need that for my particular spark…
  • Which node to edit hadoop .xml files on? (1 vote, 2 answers, by lampShadesDrifter)
    When editing hadoop .xml config files (eg. hdfs-site.xml), which node of the hadoop cluster should be the one used to edit the files? Ie. with a cluster of many nodes, all of them having a hadoop folder containing .xml and .properties files, which…
  • Spark Executor Custom Logs (1 vote, 0 answers, by user123)
    I've been supplying custom log4j properties to spark-submit in below manner: spark-submit --master yarn --queue qqqq \ --driver-java-options "-Dlog4j.configuration=file:/absolute path/to properties file/driver-log4j.properties" \ --conf…
  • Stream data to Apache Phoenix using flume (1 vote, 0 answers)
    When I am trying to stream data to Phoenix using flume I am getting the following error ERROR client.ZooKeeperSaslClient: Exception while trying to create SASL client java.security.PrivilegedActionException: javax.security.sasl.SaslException:…
  • what is difference between Mapr nfs and HDFS nfs? (1 vote, 0 answers, by bittu)
    What is difference between Mapr nfs and HDFS nfs. My understanding is as following- Mapr nfs is read/write but HDFS nfs is read only. Mapr nfs don't use any intermediate file system but HDFS nfs stores file in an intermediate file system(ex-…
  • pyspark split load uniformly across all executors (1 vote, 1 answer)
    I have a 5 node cluster.I am loading a 100k csv file to a dataframe using pyspark and performing some etl operations and writing the output to a parquet file. When I load the data frame how can divide the dataset uniformly across all executors os…
  • Unable to start Hive CLI Hadoop(MapR) (1 vote, 1 answer, by user2159301)
    I am trying to access hive CLI. However, it is failing to start with the following AccessControl issue. Strangly enough, I am able to query hive data from Hue without the AccessControl issue. However, hive CLI is not working. I am on a MapR cluster.…
  • mapr stream api causing java fatal crash (1 vote, 0 answers)
    platform : MapR 5.2 on sandbox JAVA FATAL CRASH WHEN trying to write using producer { public static void configureProducer() { Properties props = new Properties(); props.put("acks", "all"); props.put("retries", 0); …
  • How can I identify the Input Formats in MapReduce Program (1 vote, 1 answer, by Harsh)
    I just started learning Hadoop and there are various formats of input types. I have few programs to study and my main question is how can I identify if the input format is TextInputFormat or KeyValueTextInputFormat or any other. Your help is really…