
What are the differences between Kafka and MapR Streams from a coding perspective? I need to implement MapR Streams in the future, but currently I only have access to Kafka. Is exploring Kafka now useful, so that I can easily pick up MapR Streams once I get access?

palacsint
Raj UK

2 Answers


I haven't used MapR Streams (since it is not open source), but my understanding is that they cloned the Kafka 0.9 Java API. So, if you are using Kafka 0.9 clients, it should be pretty similar (but you need to use their client, not Apache's).

In addition, note that clients in other languages will not be available, and other Apache projects that use different APIs (notably Spark Streaming) will require special MapR-compatible versions.
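
For illustration, here is a minimal sketch of a consumer written against the Kafka 0.9 Java client (the broker address, group id, and topic name are placeholders). Because MapR Streams clones this API, the same code is expected to work there once the MapR client dependency is used and the topic is referenced by its MapR path (for example "/my-stream:myTopic"):

import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker address; MapR Streams does not use a broker list.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "my-consumer-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "earliest");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
        // With MapR Streams the topic would look like "/my-stream:myTopic".
        consumer.subscribe(Arrays.asList("myTopic"));
        try {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        } finally {
            consumer.close();
        }
    }
}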

Gwen Shapira

In terms of coding, there is no big difference between the Kafka and MapR Streams APIs.

However, there are some differences in configuration and API arguments:

  1. Kafka supports both the Receiver and the Direct approach, but MapR Streams supports only the Direct approach (a short Receiver-based sketch follows this list).
  2. The offset-reset configuration value for reading data from the beginning is smallest in Kafka, but earliest in MapR Streams.
  3. The Kafka API accepts the key and value decoder classes as method arguments, but in the MapR Streams API you have to configure them in the Kafka params map under the key.deserializer and value.deserializer keys.
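
To make point 1 concrete, here is a rough sketch of the Receiver-based approach with the Apache Kafka integration (the ZooKeeper address, consumer group, and thread count are placeholder values); MapR Streams has no equivalent of this call:

import java.util.HashMap;
import java.util.Map;

import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.kafka.KafkaUtils;

// Topic name mapped to the number of receiver threads.
Map<String, Integer> topicMap = new HashMap<String, Integer>();
topicMap.put("myTopic", 1);

// The Receiver-based approach connects through ZooKeeper instead of a broker list.
JavaPairReceiverInputDStream<String, String> receiverStream =
        KafkaUtils.createStream(streamingContext, "localhost:2181", "my-consumer-group", topicMap);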

Example of the Direct approach with the Kafka and MapR Streams APIs to create the DStream:

Kafka API:

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

import kafka.serializer.DefaultDecoder;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.kafka.KafkaUtils;

// Setting the topic.
HashSet<String> topicsSet = new HashSet<String>(Arrays.asList("myTopic"));

// Setting the broker list.
Map<String, String> kafkaParams = new HashMap<String, String>();
kafkaParams.put("metadata.broker.list", "localhost:9092");

// To read the messages from the start.
kafkaParams.put("auto.offset.reset", "smallest");

// Creating the DStream; the key and value decoder classes are passed as method arguments.
JavaPairInputDStream<byte[], byte[]> kafkaStream = KafkaUtils.createDirectStream(
        streamingContext, byte[].class, byte[].class,
        DefaultDecoder.class, DefaultDecoder.class, kafkaParams, topicsSet);

MapR Streams API:

// Note: KafkaUtils here comes from MapR's Spark Streaming / Kafka 0.9 integration;
// the exact artifact and package name depend on the MapR distribution.
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

import org.apache.kafka.common.serialization.ByteArrayDeserializer;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;

// Setting the topic.
HashSet<String> topicsSet = new HashSet<String>(Arrays.asList("myTopic"));

// Setting the broker list (MapR Streams does not use brokers, so this is effectively ignored).
Map<String, String> kafkaParams = new HashMap<String, String>();
kafkaParams.put("metadata.broker.list", "localhost:9092");

// To read the messages from the start.
kafkaParams.put("auto.offset.reset", "earliest");

// Setting the key and value deserializers in the params map
// (byte[] deserializers, to match the byte[] types of the DStream below).
kafkaParams.put("key.deserializer", ByteArrayDeserializer.class.getName());
kafkaParams.put("value.deserializer", ByteArrayDeserializer.class.getName());

// Creating the DStream; no decoder classes are passed as arguments.
JavaPairInputDStream<byte[], byte[]> kafkaStream = KafkaUtils.createDirectStream(
        streamingContext, byte[].class, byte[].class, kafkaParams, topicsSet);

I hope the above explanation helps you understand the differences between the Kafka and MapR Streams APIs.

Thanks,
Hokam
www.streamanalytix.com

Hokam