I have 2 Gearman admins with different IP addresses, and 40 workers spread across two clients, with 20 workers on each client.
Here is the problem: I check the status of both admins every 10 minutes. They start with the same number of jobs, but their queues drain at different speeds; admin2 is about 3 times faster than admin1.
This eventually causes a real problem when two large sets of jobs are queued. For example, say job1 has 400000 jobs and job2 has 400000 jobs, and job2 is triggered an hour later than job1. Each of them finishes only half of its jobs first, because the other half is held by admin1, and admin1 only dispatches its jobs once admin2 is done dispatching. This is a disaster, because I want job1 to finish completely, not finish half of its jobs and then wait for half of job2's jobs to finish.

ivila
So, they differ in speed. But what is the *problem*? – Klaus D. Aug 03 '17 at 05:48
1 Answer
So I found the reason. The Gearman server only sends a NOOP message to a worker when that worker is marked as sleeping and is_noop_sent is not set. After sending the NOOP, the server sets is_noop_sent to true, and it only resets the flag to false when it receives a GRAB_JOB command from that worker.
python-gearman, however, uses a lock to control its recv loop. If the worker cannot get the lock, it either does nothing or sends a PRE_SLEEP command instead of grabbing a job. The server therefore never sees a GRAB_JOB, is_noop_sent stays set, and the server never sends a NOOP to that worker again.
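To make the stuck state easier to see, here is a minimal toy model of the flag handling described above. It is not gearmand's or python-gearman's actual code: `ToyConnection`, its methods and `simulate` are hypothetical names made up for illustration, and it assumes the flag is raised once a NOOP has been sent and cleared only on GRAB_JOB. It also ignores everything else the real server does when it handles PRE_SLEEP.

```python
class ToyConnection:
    """Toy model of the server-side state kept per worker (hypothetical, for illustration)."""

    def __init__(self):
        self.is_sleeping = False
        self.is_noop_sent = False
        self.noops_sent = 0

    def on_pre_sleep(self):
        # The worker says it is going to sleep.
        self.is_sleeping = True

    def on_grab_job(self):
        # The worker asks for a job: it is awake, and may be woken again later.
        self.is_sleeping = False
        self.is_noop_sent = False

    def on_job_queued(self):
        # Wake the worker only if it sleeps and has no NOOP outstanding.
        if self.is_sleeping and not self.is_noop_sent:
            self.noops_sent += 1
            self.is_noop_sent = True
            return True
        return False


def simulate(worker_grabs_after_noop, rounds=3):
    """Return how many NOOPs the toy server sends over several sleep/job cycles."""
    con = ToyConnection()
    for _ in range(rounds):
        con.on_pre_sleep()
        woken = con.on_job_queued()
        if woken and worker_grabs_after_noop:
            con.on_grab_job()
    return con.noops_sent


if __name__ == "__main__":
    # A worker that answers every NOOP with GRAB_JOB is woken each round;
    # a worker that only ever sends PRE_SLEEP is woken once and then never again.
    print(simulate(True), simulate(False))  # prints "3 1"
```

In this model, the worker that never sends GRAB_JOB after its first NOOP receives no further wake-ups, which matches the behaviour I saw: jobs pile up on one admin while its workers look idle.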

ivila