
I am feeding a long list of inputs into a function that calls an API to retrieve data. My list contains around 40,000 unique inputs. Currently, the function returns output every 1-2 seconds or so. Quick maths tells me it would take more than 10 hours for the function to finish. I therefore want to speed this process up, but I am struggling to find a solution. I am quite a beginner, so threading/pooling is quite difficult for me. I hope someone is able to help me out here.

The function:

import quandl
import datetime
import numpy as np

quandl.ApiConfig.api_key = 'API key here'


def get_data(issue_date, stock_ticker):
    # Prepare var
    stock_ticker = "EOD/" + stock_ticker
    # Volatility
    date_1 = datetime.datetime.strptime(issue_date, "%d/%m/%Y")
    pricing_date = date_1 + datetime.timedelta(days=-40)  # -40 days of issue date
    volatility_date = date_1 + datetime.timedelta(days=-240)  # -240 days of issue date (-40,-240 range)

    # Check if code exists : if not -> return empty array
    try:
        stock = quandl.get(stock_ticker, start_date=volatility_date, end_date=pricing_date)  # get pricing data
    except quandl.errors.quandl_error.NotFoundError:
        return []

    daily_close = stock['Adj_Close'].pct_change()  # returns using adj.close
    stock_vola = np.std(daily_close) * np.sqrt(252)  # annualized volatility

    # Average price
    stock_pricing_date = date_1 + datetime.timedelta(days=-2)  # -2 days of issue date
    stock_pricing_date2 = date_1 + datetime.timedelta(days=-12)  # -12 days of issue date
    stock_price = quandl.get(stock_ticker, start_date=stock_pricing_date2, end_date=stock_pricing_date)
    stock_price_average = np.mean(stock_price['Adj_Close'])  # get average price

    # Amihuds Liquidity measure
    liquidity_pricing_date = date_1 + datetime.timedelta(days=-20)
    liquidity_pricing_date2 = date_1 + datetime.timedelta(days=-120)
    stock_data = quandl.get(stock_ticker, start_date=liquidity_pricing_date2, end_date=liquidity_pricing_date)
    p = np.array(stock_data['Adj_Close'])
    returns = np.array(stock_data['Adj_Close'].pct_change())
    dollar_volume = np.array(stock_data['Adj_Volume'] * p)
    illiq = (np.divide(returns, dollar_volume))
    print(np.nanmean(illiq))
    illiquidity_measure = np.nanmean(illiq, dtype=float) * (10 ** 6)  # multiply by 10^6 for expositional purposes
    return [stock_vola, stock_price_average, illiquidity_measure]
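
Independently of threading, the three date windows in get_data all fall between 240 and 2 days before the issue date, so a single download covering that span could in principle replace the three separate quandl.get calls. The following is only a sketch of the date arithmetic: the names WINDOWS, date_windows and covering_range are hypothetical helpers, not part of quandl, and the offsets are copied from the function above.

```python
import datetime

# Offsets in days before the issue date, copied from get_data:
# (start of window, end of window).
WINDOWS = {
    "volatility": (240, 40),
    "average_price": (12, 2),
    "liquidity": (120, 20),
}


def date_windows(issue_date):
    """Return {name: (start, end)} for each measure, given a dd/mm/YYYY string."""
    issue = datetime.datetime.strptime(issue_date, "%d/%m/%Y")
    return {
        name: (
            issue - datetime.timedelta(days=far),
            issue - datetime.timedelta(days=near),
        )
        for name, (far, near) in WINDOWS.items()
    }


def covering_range(windows):
    """Return the single (start, end) range that contains every window."""
    starts = [start for start, _ in windows.values()]
    ends = [end for _, end in windows.values()]
    return min(starts), max(ends)


if __name__ == "__main__":
    windows = date_windows("01/10/2019")
    start, end = covering_range(windows)
    print(start.date(), end.date())
```

With that range, one quandl.get(stock_ticker, start_date=start, end_date=end) call would fetch everything, and each measure could then be computed on a slice of the result. This assumes the returned DataFrame is indexed by date, so that something like stock.loc[window_start:window_end] selects a window; I have not verified that here.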

I then use a separate script to select my CSV file, in which each row contains an issue_date and a stock_ticker:

import function
import csv
import tkinter as tk
from tkinter import filedialog

# Open File Dialog

root = tk.Tk()
root.withdraw()

file_path = filedialog.askopenfilename()

# Load Spreadsheet data
f = open(file_path)

csv_f = csv.reader(f)
next(csv_f)

result_data = []

# Iterate
for row in csv_f:
    try:
        return_data = function.get_data(row[1], row[0])
        if len(return_data) != 0:
            # print(return_data)
            result_data_loc = [row[1], row[0]]
            result_data_loc.extend(return_data)
            result_data.append(result_data_loc)
    except AttributeError:
        print(row[0])
        print('\n\n')
        print(row[1])
        continue

if result_data:  # result_data is always a list, so test for emptiness rather than None
    with open('resuls.csv', mode='w', newline='') as result_file:
        csv_writer = csv.writer(result_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        for result in result_data:
            # print(result)
            csv_writer.writerow(result)
else:
    print("No results found!")

It is quite messy, but like I mentioned before, I am definitely a beginner. Speeding this up would greatly help me.

Leon
  • What is your question? – Code-Apprentice Mar 05 '20 at 22:54
  • How do I apply multi-processing/threading to my function? – Leon Mar 05 '20 at 22:57
  • @Code-Apprentice I have been reading threads about multi-threading and tried to apply it to my iteration code, but I get all kinds of errors, like `TypeError: 'list' object is not callable` etc. – Leon Mar 05 '20 at 23:03
  • I suggest you debug your code to figure out why you get that error. See [this article](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/) for some tips to help you get started. – Code-Apprentice Mar 05 '20 at 23:04
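
For reference, here is a minimal sketch of how the row loop from the question might be run on a thread pool, using concurrent.futures from the standard library. Threads suit this kind of work because most of the time is spent waiting on the network. The helper process_rows and the stand-in fake_get_data are hypothetical names introduced for illustration; in the real script, function.get_data would be passed in instead. The worker count of 8 is an arbitrary assumption, and the API may enforce rate limits that have not been checked here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_rows(rows, fetch, max_workers=8):
    """Call fetch(issue_date, ticker) concurrently for rows of [ticker, issue_date].

    Returns one result row per successful, non-empty lookup, in input order.
    """
    results = [None] * len(rows)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the index of the row it was created from.
        futures = {
            pool.submit(fetch, row[1], row[0]): i for i, row in enumerate(rows)
        }
        for future in as_completed(futures):
            i = futures[future]
            try:
                data = future.result()
            except Exception as exc:  # one failing row should not stop the rest
                print(f"{rows[i][0]} {rows[i][1]}: {exc!r}")
                continue
            if len(data) != 0:
                results[i] = [rows[i][1], rows[i][0], *data]
    return [r for r in results if r is not None]


def fake_get_data(issue_date, stock_ticker):
    # Hypothetical stand-in for function.get_data so the sketch runs
    # without an API key; it returns made-up numbers.
    if stock_ticker == "MISSING":
        return []
    return [0.25, 100.0, 1.5]


if __name__ == "__main__":
    rows = [
        ["AAPL", "01/02/2019"],
        ["MISSING", "01/02/2019"],
        ["MSFT", "15/03/2019"],
    ]
    for result in process_rows(rows, fake_get_data):
        print(result)
```

In the original script, rows would be list(csv_f) after the header has been skipped, and the returned list could be written out with csv.writer exactly as before. Note that pool.submit takes the function itself followed by its arguments; passing the result of an already-made call instead is one common way to end up with a "'list' object is not callable" error, though whether that is what happened here is only a guess.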
