So I'm using insort from bisect to insert strings into a sorted list. It's somehow faster if I use the built-in one than if I use my own copy of the same function. By faster I mean about twice as fast on average (1 millisecond vs. 2 milliseconds on a 10,000-word list). I run it from the Unix command line, on a bigger list than the one I have below, with:
    time script.py < wordList.txt
I'm thinking it has to do with C, but I don't understand how or why. I want to make mine as fast, but without using the built-in.
Here it is, straight from the bisect source code:
    def insort_left(a, x, lo=0, hi=None):
        """Insert item x in list a, and keep it sorted assuming a is sorted.
        If x is already in a, insert it to the left of the leftmost x.
        Optional args lo (default 0) and hi (default len(a)) bound the
        slice of a to be searched.
        """
        if lo < 0:
            raise ValueError('lo must be non-negative')
        if hi is None:
            hi = len(a)
        while lo < hi:
            mid = (lo+hi)//2
            if a[mid] < x: lo = mid+1
            else: hi = mid
        a.insert(lo, x)
This is the part that I think makes the difference:
    # Overwrite above definitions with a fast C implementation
    try:
        from _bisect import *
    except ImportError:
        pass
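A quick way to check which version ends up being used is sketched below. It assumes CPython, where functions written in C show up as built-in function objects; the name `is_c_version` is just something I made up for illustration.

```python
import bisect
import types

# On CPython, when the _bisect accelerator imports successfully, the names
# in bisect are rebound to built-in (C) functions. On other interpreters
# this check may report False even though bisect works fine.
is_c_version = isinstance(bisect.insort_left, types.BuiltinFunctionType)
print("insort_left is implemented in C:", is_c_version)
```

If this reports True, then calling `bisect.insort_left` never runs the Python source shown above at all.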
Here is a sample input list:
    wordlist = ["hello", "jupiter", "albacore", "shrimp", "axe", "china", "lance", "peter", "sam", "uncle", "jeniffer", "alex", "fer", "joseph"]
Some code to make it work:
    sorted_list = []
    for line in wordlist:
        insort_left(sorted_list, line)
    print(sorted_list)
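To compare the two versions without counting interpreter startup (which `time` on the whole script includes), something like the following could be used. This is only a rough sketch: `insort_left_py` and `build_sorted` are names I picked for illustration, and the random words are a hypothetical stand-in for wordList.txt, so the numbers will not match mine exactly.

```python
import bisect
import random
import timeit


def insort_left_py(a, x, lo=0, hi=None):
    """Pure-Python insort_left, same logic as the bisect source quoted above."""
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    a.insert(lo, x)


def build_sorted(insort, words):
    """Insert every word into a fresh list using the given insort function."""
    result = []
    for word in words:
        insort(result, word)
    return result


# Hypothetical stand-in for wordList.txt: 10,000 random six-letter words.
random.seed(0)
words = ["".join(random.choices("abcdefghijklmnopqrstuvwxyz", k=6))
         for _ in range(10_000)]

# Take the best of three runs for each version to reduce noise.
py_time = min(timeit.repeat(lambda: build_sorted(insort_left_py, words),
                            number=1, repeat=3))
c_time = min(timeit.repeat(lambda: build_sorted(bisect.insort_left, words),
                           number=1, repeat=3))
print(f"pure Python: {py_time:.4f}s, built-in: {c_time:.4f}s")
```

Both versions call the same `list.insert`, so any gap measured here comes from the binary search loop and the function call overhead, not from the insertion itself.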
So my concern is implementing an insort in Python that is as fast as the C-based one, without using modules. How can I do this?