I am using SimPy to simulate a compute cluster. I want to implement fair-share (proportional) scheduling logic in SimPy, in which the resource slots are distributed across the remaining tasks in proportion to their priority. For example, if we submit two jobs, each with a priority of 100 and 10 tasks, we expect the two jobs to finish at roughly the same time.
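To make the intended behaviour concrete, here is a minimal sketch of what I mean by proportional selection. It does not use SimPy at all; the job dictionary layout and the names `pick_job` and `run_until_done` are hypothetical and only serve as an illustration. It assumes one slot is handed out per step and that a job is chosen with probability proportional to its priority.

```python
import random


def pick_job(jobs, rng=random):
    """Pick a job name with probability proportional to its priority.

    Only jobs that still have tasks left are candidates.
    """
    candidates = [name for name, job in jobs.items() if job["remaining"] > 0]
    weights = [jobs[name]["priority"] for name in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def run_until_done(jobs, seed=0):
    """Hand out one slot per step and record the step at which each job finishes."""
    rng = random.Random(seed)
    finished_at = {}
    step = 0
    while any(job["remaining"] > 0 for job in jobs.values()):
        step += 1
        name = pick_job(jobs, rng)
        jobs[name]["remaining"] -= 1
        if jobs[name]["remaining"] == 0:
            finished_at[name] = step
    return finished_at


# Two jobs with equal priority and equal task counts (hypothetical data).
jobs = {
    "job_a": {"priority": 100, "remaining": 10},
    "job_b": {"priority": 100, "remaining": 10},
}
finished_at = run_until_done(jobs)
print(finished_at)
```

With equal priorities, neither job is favoured, which is the behaviour I am after in the real simulation.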
I have implemented a basic working version of this logic; however, it does not scale, especially when we expect more than 10 million tasks/requests.
The main bottleneck is in the manipulation of the put_queue, which involves the following steps:
1. Selecting the next pending request to process
2. Deleting this request from the queue
3. Moving it to the head of the queue
I have included a dummy example below focusing on the main concerns in steps 2 and 3.
My questions are:
1. Without step 2 above, I get a RuntimeError with the following message:
<Request() object at 0x1da7348a2e0> has already been triggered
How can I let SimPy know to ignore the request if it has already been processed? That way, I could skip step 2 and save on computational cost.
2. Can I implement steps 2 and 3 in a better way to make this scalable?
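My current understanding of the error, which may be incomplete, is that inserting the selected request at the head without deleting it first leaves the same object in the queue twice, so it gets triggered a second time once the duplicate reaches the head. The sketch below reproduces that pattern with a plain list; `FakeRequest` is a hypothetical stand-in that only mimics the "trigger once" behaviour and is not a SimPy class.

```python
class FakeRequest:
    """Hypothetical stand-in for a request that may only be triggered once."""

    def __init__(self, name):
        self.name = name
        self.triggered = False

    def succeed(self):
        if self.triggered:
            raise RuntimeError(f"{self.name} has already been triggered")
        self.triggered = True


queue = [FakeRequest("a"), FakeRequest("b"), FakeRequest("c")]
chosen = queue[2]

# Step 3 without step 2: the chosen request is now in the queue twice.
queue.insert(0, chosen)
duplicates = sum(1 for request in queue if request is chosen)

error_message = None
try:
    while queue:
        queue.pop(0).succeed()  # serve requests from the head of the queue
except RuntimeError as exc:
    error_message = str(exc)

print(duplicates, error_message)
```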
'''
import simpy
import random


def awesome_proportional_logic(task_queue):
    request_to_process_next = random.choice(task_queue)
    return request_to_process_next


def process_task(env, slots, task_duration=1):
    with slots.request() as req:
        yield req
        print(f"Task started at {env.now}")
        yield env.timeout(task_duration)
        print(f"Task finished at {env.now}")

        if slots.put_queue:
            # Manipulation of put queue
            # Step 1: Select next request to process
            request_to_process_next = awesome_proportional_logic(slots.put_queue)
            # Step 2: Remove it from existing queue - Takes very long
            slots.put_queue.remove(request_to_process_next)
            # Step 3: Add the selected request to head of the queue
            slots.put_queue.insert(0, request_to_process_next)


if __name__ == '__main__':
    env = simpy.Environment()
    slots = simpy.Resource(env, capacity=2)
    N_tasks = 100
    for i in range(N_tasks):
        env.process(process_task(env, slots))
    env.run(until=100)
'''
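One variation on steps 2 and 3 that I am considering is to have the selection logic return a position rather than a request object, and then swap that entry with the current head. Both `remove` and `insert(0, ...)` are linear in the queue length on a Python list, whereas a swap by index is constant time and cannot create a duplicate. The sketch below uses a plain list of strings as a stand-in for the queue; `pick_index` and `move_to_head` are hypothetical helper names, and I have not verified this against SimPy at the scale mentioned above.

```python
import random


def pick_index(queue, rng=random):
    """Placeholder selection logic that returns a position instead of an object."""
    return rng.randrange(len(queue))


def move_to_head(queue, index):
    """Swap the chosen entry with the current head in O(1)."""
    queue[0], queue[index] = queue[index], queue[0]


# A plain list standing in for the pending requests (hypothetical data).
pending = [f"request_{i}" for i in range(5)]
original = list(pending)

index = pick_index(pending, random.Random(0))
chosen = pending[index]
move_to_head(pending, index)

print(pending)
```

The trade-off is that the displaced head ends up in the chosen entry's old position, so the remaining requests no longer keep their arrival order. That seems acceptable only if the selection logic, not the queue order, decides what runs next.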