
We are running a cluster with 1 namenode and 3 datanodes on Azure, and on top of it I am running my Spark job in yarn-cluster mode.

Also, we are using HDP 2.5, which has Spark 1.6.2 integrated into its setup. Now I have this very weird issue where the processing time of my job suddenly increases to 4s.

This has happened quite a few times but does not follow a pattern; sometimes the 4s processing time appears right at the start of the job, sometimes in the middle, as shown below.

[Screenshot: sudden increase in processing time to 4s]

One thing to notice is that no events are coming in to be processed at that time, so technically the processing time should stay almost the same. Also, my Spark Streaming job has a batch duration of 1s, so it can't be that.

I don't have any errors in the logs or anywhere else, and I am at a loss as to how to debug this issue.

Minor details about the job:

I am reading messages from a Kafka topic and then storing them in HBase tables using the Phoenix JDBC connector.
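For context, a minimal sketch of what that read path might look like on Spark 1.6 with the direct Kafka approach, also showing the 1s batch duration mentioned above (the app name, broker, topic, and key/value types are placeholders, not the actual job code):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

// 1s batch interval, as described above
SparkConf conf = new SparkConf().setAppName("EventLinkProcessor"); // placeholder app name
JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));

Map<String, String> kafkaParams = new HashMap<>();
kafkaParams.put("metadata.broker.list", "broker1:9092");    // placeholder broker
Set<String> topics = Collections.singleton("transactions"); // placeholder topic

// Direct stream over the Kafka topic; String key/value types are an assumption
JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
        jssc, String.class, String.class,
        StringDecoder.class, StringDecoder.class,
        kafkaParams, topics);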

EDIT: More Information

In InsertTransactionsPerRDDPartitions, I open a connection and perform the write to HBase using Phoenix JDBC.

updatedEventLinks.foreachRDD(rdd -> {
    if (!rdd.isEmpty()) {
        rdd.foreachPartition(new InsertTransactionsPerRDDPartitions(this.prop));
        rdd.foreachPartition(new DoSomethingElse(this.kafkaPublishingProps, this.prop));
    }
});
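For reference, a minimal sketch of what the connection-open-and-write step inside InsertTransactionsPerRDDPartitions could look like with Phoenix JDBC (the element type, JDBC URL, table, columns, and row key are assumptions for illustration; the real class is not shown in the question):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Iterator;
import java.util.Properties;

import org.apache.spark.api.java.function.VoidFunction;

// Sketch only: the actual class and schema in the job will differ.
public class InsertTransactionsPerRDDPartitions implements VoidFunction<Iterator<String>> {

    private final Properties prop; // assumed to hold the Phoenix connection properties

    public InsertTransactionsPerRDDPartitions(Properties prop) {
        this.prop = prop;
    }

    @Override
    public void call(Iterator<String> records) throws Exception {
        // One Phoenix JDBC connection per partition; opening it has a fixed cost
        // even when the partition turns out to be empty.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181", prop);
             PreparedStatement stmt = conn.prepareStatement(
                     "UPSERT INTO TRANSACTIONS (ID, PAYLOAD) VALUES (?, ?)")) {
            while (records.hasNext()) {
                stmt.setString(1, java.util.UUID.randomUUID().toString()); // placeholder row key
                stmt.setString(2, records.next());
                stmt.executeUpdate();
            }
            conn.commit(); // Phoenix buffers upserts until commit
        }
    }
}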
  • do you have an `id` or something to uniquely identify these events? If so, I would start by logging each step and how long it took, to try to narrow the problem down. – Eugene Apr 19 '17 at 14:05
  • 0 events doesn't mean the processing time should be short. E.g., you can open a connection to HBase, write nothing, and close the connection. It may take several seconds. – zsxwing Apr 20 '17 at 00:39
  • @Eugene I have unique UUIDs, but as you can see, no events are being processed at that time. – Biplob Biswas Apr 20 '17 at 08:05
  • @zsxwing I have updated my question with the functions where I am opening the HBase connection. The thing is, I am checking for an empty RDD, so it basically shouldn't go into the designated rdd.foreachPartition functions. And even if that were the case, I still don't understand the sudden jump from 27 ms to 4s. – Biplob Biswas Apr 20 '17 at 08:07
