I'd like to speed up my function in Python. I read that a good way to do this is to use bisection or a hash table. Do you know how I can apply either of these to this function?

from time import time
import csv

f = open('file.csv')
reader = csv.reader(f, delimiter=';')

def old(abi):
    first = True
    for row in reader:
        if first:
            first = False
            first_row = row
        else:
            if row[0] == abi:
                res = row
                res = dict(zip(first_row, res))
                break

@timing
def test2():
    for x in xrange(3000, 800000):
        old(str(x))

test2()

Thank you very much for your help ;)

1 Answer

I suspect that your problem is I/O (rather than CPU) bound.

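If it is indeed CPU bound, there is one thing you could try to improve performance: replace the explicit for loop with a generator expression passed to next(). The search is then driven by next() instead of a hand-written loop with flags, which may shave off some interpreter overhead on CPython, though the gain is likely to be modest.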

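import csv

def old(path, abi):
    with open(path) as handle:
        r = csv.reader(handle, delimiter=";")
        header = next(r)  # The first row holds the column names.
        try:
            # next() consumes rows until the first one whose first column matches.
            result = next(row for row in r if row[0] == abi)
            return dict(zip(header, result))
        except StopIteration:
            return None  # Not found.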
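Since the question mentions a hash table: when the same file is searched many times, as in test2(), it may pay off to read it once and keep the rows in a dict keyed by the first column. Below is a minimal sketch of that idea, not a drop-in replacement. It assumes Python 3, that the file fits in memory, and that keeping the first row for a repeated key is acceptable; build_index and lookup are hypothetical names introduced here for illustration.

```python
import csv


def build_index(path, delimiter=";"):
    """Read the CSV once and map each first-column value to its row as a dict."""
    index = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return index  # Empty file: nothing to index.
        for row in reader:
            if not row:
                continue  # Skip blank lines.
            # setdefault keeps the first row seen for a key, which mirrors
            # the "stop at the first match" behaviour of the linear scan.
            index.setdefault(row[0], dict(zip(header, row)))
    return index


def lookup(index, abi):
    """Return the row dict for abi, or None when the key is absent."""
    return index.get(abi)
```

With this, the file is parsed once (index = build_index('file.csv')) and each lookup(index, str(x)) is an average constant-time dict access instead of a rescan of the file. The trade-off is memory: every row stays resident for as long as the index is alive.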
Sergei Lebedev