I have an embedded board with a PowerPC 5200 running real-time Linux, kernel version 2.6.33.

My application uses one high-resolution Linux timer for alarms. Occasionally this timer does not expire. The problem is very rare: a system can run for many months between occurrences.

The timer is armed with timer_settime using an absolute time. I made some interesting observations in a case where the timer did not expire:

  • timer_gettime returns a remaining time of 1 ns.
  • /proc/timer_list does not show this timer in the list of active timers.
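For reference, the arm-and-query pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from my application: the helper names `arm_absolute` and `remaining_ns` are made up for this example, and the clock and the delay are left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical helper: arm `timer` as a one-shot that expires at the
 * absolute time (now + delay_ns) on `clock`. Returns 0 on success. */
int arm_absolute(timer_t timer, clockid_t clock, int64_t delay_ns)
{
    struct timespec now;
    struct itimerspec its;
    int64_t nsec;

    assert(delay_ns >= 0);
    if (clock_gettime(clock, &now) != 0)
        return -1;

    /* Add the delay and carry any nanosecond overflow into seconds. */
    nsec = now.tv_nsec + delay_ns % NSEC_PER_SEC;
    its.it_value.tv_sec = now.tv_sec + delay_ns / NSEC_PER_SEC
                          + nsec / NSEC_PER_SEC;
    its.it_value.tv_nsec = nsec % NSEC_PER_SEC;
    its.it_interval.tv_sec = 0;   /* zero interval: no automatic reload */
    its.it_interval.tv_nsec = 0;

    return timer_settime(timer, TIMER_ABSTIME, &its, NULL);
}

/* Hypothetical helper: remaining time in nanoseconds as reported by
 * timer_gettime, or -1 on error. In the failure case described above
 * this value is 1. */
int64_t remaining_ns(timer_t timer)
{
    struct itimerspec cur;

    if (timer_gettime(timer, &cur) != 0)
        return -1;
    return (int64_t)cur.it_value.tv_sec * NSEC_PER_SEC + cur.it_value.tv_nsec;
}
```

Note that timer_gettime always reports the time remaining as a relative value, even when the timer was armed with TIMER_ABSTIME. On older glibc versions the program must be linked with -lrt.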

I have looked into the Linux source and found a possible scenario:

timer_gettime ends up in common_timer_get (posix-timers.c). common_timer_get returns it_value.tv_nsec = 1 if the timer is active and the remaining time is <= 0. This means that the timer has counted down and its state must be 'enqueued' or 'callback'.
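To make that reporting rule concrete, here is a small user-space model of it. It is not the kernel source, only my reading of common_timer_get reduced to the two inputs that matter here; the function name `model_reported_remaining` and its parameters are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical model (NOT kernel code) of the value reported for a
 * one-shot timer, as described above:
 *   - inactive timer:                reported as disarmed (0 s, 0 ns)
 *   - active and remaining <= 0:     reported as 1 ns
 *   - active and remaining > 0:      reported as the real remaining time */
struct timespec model_reported_remaining(bool timer_active, int64_t remaining_ns)
{
    struct timespec out = {0, 0};

    if (!timer_active)
        return out;

    if (remaining_ns <= 0) {
        out.tv_nsec = 1;  /* expired but still active: never report 0 */
        return out;
    }

    out.tv_sec = remaining_ns / 1000000000LL;
    out.tv_nsec = remaining_ns % 1000000000LL;
    assert(out.tv_nsec >= 0 && out.tv_nsec < 1000000000L);
    return out;
}
```

In this model a reading of exactly 1 ns only appears when the timer is still considered active although its expiry time has already passed, which matches the observation.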

I suspect that it is in state 'callback', which means it is executing in __run_hrtimer (hrtimer.c). __run_hrtimer calls __remove_hrtimer, which removes the timer from the active timer list before the timer state changes from 'enqueued' to 'callback'.

Between setting the timer state to 'callback' and clearing it at the end of the function, __run_hrtimer calls several other functions, both inside the Linux kernel and the callback function in the application. If execution hangs somewhere in that window, timer_gettime may return 1 ns while the timer is not on the active list.
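The suspected window can be written down as a tiny state model. This is a sketch of my hypothesis only, not kernel code: the enum and function names are invented, and "listed" stands for "visible as an active timer in /proc/timer_list".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model (NOT kernel code) of the three timer states
 * discussed above. */
enum model_timer_state {
    MODEL_INACTIVE = 0,  /* not armed, or callback finished */
    MODEL_ENQUEUED = 1,  /* waiting on the active timer list */
    MODEL_CALLBACK = 2   /* removed from the list, callback running */
};

/* The timer counts as active in any state other than inactive. */
bool model_is_active(enum model_timer_state s)
{
    return s != MODEL_INACTIVE;
}

/* Only an enqueued timer is on the active timer list. */
bool model_is_listed(enum model_timer_state s)
{
    return s == MODEL_ENQUEUED;
}

/* The observed symptom: active (so timer_gettime can report 1 ns)
 * but missing from the active timer list. */
bool model_matches_symptom(enum model_timer_state s)
{
    assert(s == MODEL_INACTIVE || s == MODEL_ENQUEUED || s == MODEL_CALLBACK);
    return model_is_active(s) && !model_is_listed(s);
}
```

Under these assumptions 'callback' is the only state that matches both observations at once, which is why I suspect the timer is stuck there.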

I have checked the callback function in my application. It signals a semaphore and re-arms the timer on the same thread. I can't see why that would not work.

Has anyone seen a similar case?

Does anyone have an idea of what is going wrong here?

Nmk
Vijay Katoch

0 Answers