I am working on an algo-trading project using the Zerodha broker's API. I am trying to use multithreading to cut the cost of calling the API function that fetches historical data for 50 stocks at a time, and then apply my buy/sell strategy to that data.
Here is my code:
- historical data function:
def historicData(token, start_dt, end_dt):
    data = pd.DataFrame(
        kite.historical_data(
            token,
            start_dt,
            end_dt,
            interval = 'minute',
            continuous = False,
            oi = False
        )
    )
    # kite.historical_data() is the API call with limitation of 20 requests/sec
    return data.tail(5)
- Strategy function:
def Strategy(token, script_name):
    start_dt = (datetime.now() - timedelta(3)).strftime("%Y-%m-%d")
    end_dt = datetime.now().strftime("%Y-%m-%d")
    ScriptData = historicData(token, start_dt, end_dt)
    # perform operations on ScriptData
    print(token,script_name)
- concurrent calling of the above function:
# concurrent code
import time
from threading import Thread, Lock
start_dt = (datetime.now() - timedelta(3)).strftime("%Y-%m-%d")
end_dt = datetime.now().strftime("%Y-%m-%d")
th_list = [None]*20 # one slot per thread in a batch of up to 20
start = time.time()
for i in range(0,50,20): # trying to send 20 requests in the form of 20 threads in one go
    token_batch = tokens[i:i+20] # data inside is ['123414','124124',...] total 50
    script_batch = scripts[i:i+20] # data inside is ['RELIANCE','INFY',...] total 50
    j=0
    for stock_script in zip(token_batch,script_batch):
        th_list[j] = Thread( target = Strategy,
                             args = ( stock_script[0],
                                      stock_script[1]
                                    )
                           )
        th_list[j].start()
        j+=1
end = time.time()
print('time is : ', end-start)
Now there are 2 issues I have been unable to resolve after 2 days of trying many solutions found online.
- The API server is a bottleneck: it accepts 20 API calls per second and rejects any beyond that. There are 50 stocks in the list, and what I am trying to do is get data for 20 stocks at a time, then another 20, and then the remaining 10 in a third go. The list is going to grow to 200 stocks soon, which is why serial execution is too costly for my strategy to work.
- When running this function concurrently, too many threads are created at once and the API request limit is exceeded, and
print('time is : ', end-start)
runs as soon as I run the 3rd cell, without waiting for the threads to finish.
So how do I block the code from leaving the inner for loop until all threads have finished executing?
and
Is my approach correct for running at most 20 threads per second? Should I place a sleep(1)
somewhere?
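
For reference, here is a minimal sketch of the behaviour I am after, not a confirmed solution. It assumes that calling join() on every thread in a batch is enough to hold the loop, and that keeping each batch inside its own one-second window is enough to stay under the limit; I am not certain the server counts requests that way. The names run_in_batches and fake_strategy are hypothetical, and fake_strategy only stands in for Strategy() so the sketch runs without the API.

```python
import time
from threading import Thread

BATCH_SIZE = 20     # API limit from the question: 20 requests per second
BATCH_WINDOW = 1.0  # seconds each batch should occupy before the next one starts


def run_in_batches(worker, tokens, scripts, batch_size=BATCH_SIZE, window=BATCH_WINDOW):
    """Run worker(token, script) in threads, one batch of batch_size per window."""
    pairs = list(zip(tokens, scripts))
    for i in range(0, len(pairs), batch_size):
        batch_start = time.monotonic()
        threads = [Thread(target=worker, args=pair) for pair in pairs[i:i + batch_size]]
        for th in threads:
            th.start()
        # join() blocks until the thread has finished, so the loop cannot
        # move on while any request from this batch is still running
        for th in threads:
            th.join()
        # if the batch finished early and more batches remain,
        # wait out the rest of the window before starting the next one
        elapsed = time.monotonic() - batch_start
        if i + batch_size < len(pairs) and elapsed < window:
            time.sleep(window - elapsed)


if __name__ == "__main__":
    def fake_strategy(token, script_name):
        # hypothetical stand-in for Strategy(); makes no network call
        time.sleep(0.05)
        print(token, script_name)

    tokens = [str(100000 + n) for n in range(50)]
    scripts = ["STOCK%d" % n for n in range(50)]
    start = time.time()
    run_in_batches(fake_strategy, tokens, scripts)
    print('time is : ', time.time() - start)
```

With 50 stocks this starts three batches (20, 20, 10), and the final print only runs after the last thread has been joined.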