
I have gone through many questions on Stack Overflow and other sites, but I still haven't been able to resolve my issue.

We have around 35 jobs scheduled between 9:30 and 10 AM, but sometimes 3 to 5 of them miss execution. After we run the missed jobs ad hoc, the system works correctly again from the next day. The problem then recurs after some days or weeks.

We are using quartz version 2.2.3 and spring batch version 4.2.0.RELEASE.

We have not overridden the scheduler thread count, because the setup worked perfectly for a long time and only recently started failing intermittently for some jobs.

Below are the Quartz properties:

<property name="quartzProperties">
    <props>
        <prop key="org.quartz.scheduler.skipUpdateCheck">true</prop>
        <prop key="org.quartz.jobStore.class">org.quartz.impl.jdbcjobstore.JobStoreTX</prop>
        <prop key="org.quartz.jobStore.driverDelegateClass">org.quartz.impl.jdbcjobstore.StdJDBCDelegate</prop>
        <prop key="org.quartz.scheduler.instanceId">AUTO</prop>
        <prop key="org.quartz.jobStore.useProperties">false</prop>
        <prop key="org.quartz.jobStore.tablePrefix">#{'${db.defaultschema}' != '' ? '${db.defaultschema}'+'.QRTZ_' : 'QRTZ_'}</prop>
        <prop key="org.quartz.jobStore.selectWithLockSQL">SELECT * FROM {0}LOCKS UPDLOCK WHERE LOCK_NAME = ?</prop>
        <prop key="org.quartz.jobStore.isClustered">true</prop>
        <prop key="org.quartz.jobStore.dataSource">dataSource</prop>
        <prop key="org.quartz.jobStore.driverDelegateClass">org.quartz.impl.jdbcjobstore.oracle.OracleDelegate</prop>
    </props>
</property>

Spring batch job config:

<batch:job id="reportJob">
    <batch:step id="step1">
        <batch:tasklet>
            <batch:chunk reader="reports-reader" processor="reports-processor"
                writer="reports-writer" commit-interval="0">
            </batch:chunk>
        </batch:tasklet>
    </batch:step>
    <batch:listeners>
        <batch:listener ref="batchJobListener" />
    </batch:listeners>
</batch:job>
<bean id="reports-reader" scope="step"
    class="com.company.reportloader.reader.ReportsItemReader">
    <property name="reportsItemReaderService" ref="reportsItemReaderService"></property>
</bean>

<bean id="reports-processor" class="com.company.reportloader.processor.ReportsItemProcessor"></bean>
<bean id="reports-writer" class="com.company.reportloader.writer.ReportsItemWriter">
</bean>

We override executeInternal of QuartzJobBean and create jobParameters to invoke the Spring Batch job, as below:

@Override
protected void executeInternal(JobExecutionContext context) throws JobExecutionException {
  launcher.run(job, jobParameters);
}

Any help or pointers would be greatly appreciated.

Vivek
  • Please let me know how the Spring Batch jobs are configured or triggered. The configuration above does not show this. – Rakesh Dec 25 '20 at 05:21
  • It's not always failing, but intermittently Quartz misses some triggers and does not invoke the jobs. – Vivek Dec 28 '20 at 02:12
  • I found https://github.com/quartznet/quartznet/issues/735. Could someone please help me understand the root cause? Is it a Quartz issue? – Vivek Jan 11 '21 at 05:33
  • We are running the Quartz scheduler in a clustered environment with 3-4 scheduler nodes. The issue cannot be reproduced in any other environment (UAT, local, QA, etc.), even with 3000 jobs running at the same time and with reduced server memory. Any help in resolving the issue will be appreciated. – Vivek Jan 18 '21 at 05:38
  • I also looked at https://stackoverflow.com/questions/618265/quartz-scheduler-suddenly-stop-running-and-no-exception-error but still didn't find the root cause. Any suggestions? – Vivek Jan 20 '21 at 05:30

1 Answer

We had a code issue: our job-edit functionality was updating the misfire instruction to 2 (for a cron trigger, 2 is MISFIRE_INSTRUCTION_DO_NOTHING). We resolved the issue by setting misfire_instr in the qrtz_triggers table to 0. For some reason the scheduler was treating a few jobs as misfired, and as a result those jobs were not triggered at their scheduled time. For cron triggers, the misfire instructions are defined as follows:

**smart policy - default**
See: withMisfireHandlingInstructionFireAndProceed

**withMisfireHandlingInstructionIgnoreMisfires**
MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY (QTZ-283): All misfired executions are executed immediately, then the trigger goes back on schedule.
Example scenario: the executions scheduled at 9 and 10 AM are executed immediately. The next scheduled execution (at 11 AM) runs on time.

**withMisfireHandlingInstructionFireAndProceed**
MISFIRE_INSTRUCTION_FIRE_ONCE_NOW: Immediately executes the first misfired execution and discards the others (i.e. all misfired executions are merged together), then goes back on schedule. No matter how many trigger executions were missed, only a single immediate execution is performed.
Example scenario: the executions scheduled at 9 and 10 AM are merged and executed only once (in other words, the execution scheduled at 10 AM is discarded). The next scheduled execution (at 11 AM) runs on time.

**withMisfireHandlingInstructionDoNothing**
MISFIRE_INSTRUCTION_DO_NOTHING: All misfired executions are discarded; the scheduler simply waits for the next scheduled time.
Example scenario: the executions scheduled at 9 and 10 AM are discarded, so basically nothing happens. The next scheduled execution (at 11 AM) runs on time.

After updating misfire_instr to 0, the smart policy (default) applies, and Quartz kicks off the misfired jobs within 3-5 minutes once the load reduces.
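
To make the difference between the misfire_instr values concrete, here is a minimal, simplified sketch in plain Java. It does not use Quartz itself; it only models the definitions above. The class and method names (MisfirePolicyDemo, firedImmediately) are hypothetical and exist only for illustration. The integer values are the ones Quartz 2.x defines for cron triggers (-1, 0, 1, 2), and for a cron trigger the smart policy resolves to fire-once-now.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of cron-trigger misfire handling; NOT the Quartz implementation.
final class MisfirePolicyDemo {
    // Integer values as stored in the MISFIRE_INSTR column (Quartz 2.x constants).
    static final int IGNORE_MISFIRE_POLICY = -1;
    static final int SMART_POLICY = 0;   // default; for cron triggers behaves like FIRE_ONCE_NOW
    static final int FIRE_ONCE_NOW = 1;
    static final int DO_NOTHING = 2;

    // Returns which of the missed executions (given as hours of the day)
    // would run immediately once the scheduler detects the misfire.
    static List<Integer> firedImmediately(int misfireInstr, List<Integer> missedHours) {
        switch (misfireInstr) {
            case IGNORE_MISFIRE_POLICY:
                // every missed execution is fired
                return new ArrayList<>(missedHours);
            case SMART_POLICY:
            case FIRE_ONCE_NOW:
                // missed executions are merged into a single immediate run
                return missedHours.isEmpty() ? List.of() : List.of(missedHours.get(0));
            case DO_NOTHING:
                // missed executions are dropped; wait for the next scheduled time
                return List.of();
            default:
                throw new IllegalArgumentException("Unknown misfire instruction: " + misfireInstr);
        }
    }

    public static void main(String[] args) {
        List<Integer> missed = List.of(9, 10); // the 9 AM and 10 AM runs were missed
        for (int instr : new int[] {IGNORE_MISFIRE_POLICY, SMART_POLICY, FIRE_ONCE_NOW, DO_NOTHING}) {
            System.out.println("misfire_instr=" + instr + " -> runs now: " + firedImmediately(instr, missed));
        }
    }
}
```

In this model, a trigger with misfire_instr = 2 that is flagged as misfired runs nothing until its next scheduled time, which matches the missed executions we saw. With misfire_instr = 0 the missed run is fired once, shortly after the misfire is detected.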

Vivek