0

We are using MS SCOM to monitor result of ms sql agent jobs.

Currently we get error if any step of job failed.

Developers says that it is wrong and we should get alert only if whole job fails, even if some steps not working

Ok

I create test job with 1 step which will result error.

test job

Lets find job and results:

SELECT
    j.name,
    j.job_id,
    jh.step_id,
    jh.step_name,
    jh.run_status,

    CASE jh.run_status
        WHEN 0 THEN 'Failed'
        WHEN 1 THEN 'Succeeded'
        WHEN 2 THEN 'Retry'
        WHEN 3 THEN 'Canceled'
        WHEN 4 THEN 'Running'
        ELSE 'Unknown'
    END AS run_statusstring,
    j.enabled,
    jh.message,
    msdb.dbo.agent_datetime(
    CASE WHEN jh.run_date = 0 THEN NULL ELSE jh.run_date END,
    CASE WHEN jh.run_time = 0 THEN NULL ELSE jh.run_time END) AS last_runtime
FROM msdb.dbo.sysjobhistory jh
INNER JOIN (
    SELECT DISTINCT jh.job_id, MAX(jh.instance_id) instance_id
    FROM msdb.dbo.sysjobhistory jh
    WHERE jh.step_id != 0
    GROUP BY jh.job_id
) AS a
ON jh.job_id = a.job_id
    AND a.instance_id = jh.instance_id
INNER JOIN msdb.dbo.sysjobs AS j
ON jh.job_id = j.job_id

Anyway run_status=0 (failed)

Why? I am not a DBA, but where is logic? Why 'Quit the job..' not working?

Igor Kuznetsov
  • 421
  • 1
  • 6
  • 15

1 Answers1

0

I ran into the same problem and found that run_status of a step (step_id <> 0) is logged as a failure but run_status of job itself (step_id = 0) is logged as a success in case like yours.