Cadence server version: 0.19.2

I have made the following observation. I have a Job workflow that triggers an encoding workflow (a child workflow), which has an activity that handles the encoding status. I have supplied retry and heartbeat configuration for both the child workflow and the activity, in case the workflow or activity fails because the server gets killed. However, out of, say, 100 jobs, about 20 have an activity that does not retry: it fails on attempt 0 with timeout type heartbeat timeout. Below I am sharing the parent workflow JSON and the child workflow JSON, along with some screenshots.

Child workflow JSON http://jsonblob.com/1007526462865293312

Parent Workflow JSON http://jsonblob.com/1007526121943875584

[Screenshots: activity configuration; child workflow configuration]

James
  • Also, why didn't the child workflow retry when the workflow execution failed? – Tarun Singhal Aug 12 '22 at 06:05
  • @long can you please help with this question? I have two suspicions: 1) a possible bug in this version of Cadence that was fixed later; 2) a race condition between the server getting killed and the Cadence client killing workers, which is solved by enabling auto heartbeat. – Tarun Singhal Aug 16 '22 at 08:01
  • Could you share some more details from the child workflow by using the GRID view? There should be some more information about why it failed. It says it failed with HEARTBEAT timeout but I'm guessing activity timeout should have also passed so it wouldn't have retried. GRID view should tell more. – Ender Aug 22 '22 at 15:54
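
Ender's guess in the last comment (that the overall activity timeout had already passed, so no retry was scheduled) can be illustrated with a small model. This is only a sketch under assumptions, not Cadence's actual server code: the `RetryPolicy` fields mirror the usual retry-policy settings (initial interval, backoff coefficient, maximum attempts, expiration interval), while `next_backoff` and `should_retry` are hypothetical helpers written for this illustration.

```python
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    # Field names are illustrative; they mirror common retry-policy settings.
    initial_interval_s: float
    backoff_coefficient: float
    maximum_attempts: int         # 0 means "no attempt limit" in this model
    expiration_interval_s: float  # 0 means "no expiration" in this model


def next_backoff(policy: RetryPolicy, attempt: int) -> float:
    """Delay before the next attempt; `attempt` is the zero-based failed attempt."""
    return policy.initial_interval_s * policy.backoff_coefficient ** attempt


def should_retry(policy: RetryPolicy, attempt: int, elapsed_s: float) -> bool:
    """Simplified decision: retry only if attempts and time budget both remain."""
    # Out of attempts: the attempt that just failed was the last one allowed.
    if policy.maximum_attempts > 0 and attempt + 1 >= policy.maximum_attempts:
        return False
    # Out of time: the next attempt would start after the expiration interval.
    if (policy.expiration_interval_s > 0
            and elapsed_s + next_backoff(policy, attempt) > policy.expiration_interval_s):
        return False
    return True


policy = RetryPolicy(initial_interval_s=1, backoff_coefficient=2,
                     maximum_attempts=5, expiration_interval_s=60)

# Heartbeat timeout detected early on attempt 0: a retry is still possible.
early = should_retry(policy, attempt=0, elapsed_s=10)
# Heartbeat timeout surfaced only once the expiration interval was used up:
# attempt 0 fails and no retry is scheduled, as described in the question.
late = should_retry(policy, attempt=0, elapsed_s=60)
```

In this model, a failure on attempt 0 with no retry is what one would see if the time budget had already run out when the heartbeat timeout was recorded. Whether that is what happened here can only be confirmed from the event history of the child workflow.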

0 Answers