I am writing a Spark Streaming job that consumes data from Kafka and writes it to an RDBMS. I am currently stuck because I do not know which would be the most efficient way to store this streaming data in the RDBMS.
On searching, I found a few methods:

- Using a `DataFrame`
- Using `JdbcRDD`
- Creating a connection and a `PreparedStatement` inside `foreachPartition()` of the RDD, and inserting with `PreparedStatement.addBatch()` / `executeBatch()`
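For reference, here is a minimal sketch of the third option. The JDBC URL, credentials, table name, and record type are all hypothetical placeholders, and this assumes a plain `DStream[(String, String)]`:

```scala
import java.sql.DriverManager
import org.apache.spark.streaming.dstream.DStream

// Sketch: open one connection per partition (not per record),
// accumulate rows with addBatch(), then flush with executeBatch().
def writeToRdbms(stream: DStream[(String, String)]): Unit = {
  stream.foreachRDD { rdd =>
    rdd.foreachPartition { partition =>
      // Hypothetical connection details - replace with your own.
      val conn = DriverManager.getConnection(
        "jdbc:postgresql://localhost/mydb", "user", "pass")
      val stmt = conn.prepareStatement(
        "INSERT INTO events (key, value) VALUES (?, ?)")
      try {
        partition.foreach { case (k, v) =>
          stmt.setString(1, k)
          stmt.setString(2, v)
          stmt.addBatch() // accumulate; do not execute per row
        }
        stmt.executeBatch() // one round trip per partition
      } finally {
        stmt.close()
        conn.close()
      }
    }
  }
}
```

If the data is already a `DataFrame` (e.g. with Structured Streaming), `df.write.jdbc(url, table, props)` inside `foreachBatch` is a common alternative to hand-rolled JDBC.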
I cannot figure out which of these would be the most efficient way to achieve my goal.
The same question applies to storing and retrieving data from HBase.
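For comparison, the same per-partition batching pattern with the HBase client API might look like the sketch below; the table name, column family, and qualifier are hypothetical:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.rdd.RDD
import scala.collection.JavaConverters._

// Sketch: one HBase connection per partition, one batched put() call.
def writeToHBase(rdd: RDD[(String, String)]): Unit = {
  rdd.foreachPartition { partition =>
    val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val table = conn.getTable(TableName.valueOf("events")) // hypothetical table
    try {
      val puts = partition.map { case (k, v) =>
        new Put(Bytes.toBytes(k))
          .addColumn(Bytes.toBytes("cf"), Bytes.toBytes("value"), Bytes.toBytes(v))
      }.toList.asJava
      table.put(puts) // batched mutations instead of per-record calls
    } finally {
      table.close()
      conn.close()
    }
  }
}
```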
Can anyone help me with this?