Sort tuples instead.

    tuples = [(d['index'], d['value'])
              for d in array]
    tuples.sort()
You didn't post any timeit data. Show us representative data and an actual timing, then describe what kind of revised timing would be acceptable. It's not clear that you can beat timsort, though the lambda overhead will certainly be significant.
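If the dicts themselves must stay sorted in place, `operator.itemgetter` is the standard way to shave part of that lambda overhead: it extracts the same key as the lambda, but as a C-level callable. A minimal sketch (`array` is the list of dicts from your posted code):

    from operator import itemgetter

    # same key as lambda d: d['index'], but extracted in C,
    # so the per-item call overhead is lower
    array.sort(key=itemgetter('index'))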
If you need faster still, strip out the irrelevant value attribute:

    indices = [d['index']
               for d in array]
    indices.sort()
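If those indices are plain machine-sized ints and numpy is an option, sorting them as a packed array is typically faster still, since the sort never touches boxed Python objects. A sketch under that assumption:

    import numpy as np

    # np.fromiter skips the intermediate Python list; the sort
    # then runs over contiguous int64s rather than object pointers
    indices = np.fromiter((d['index'] for d in array), dtype=np.int64)
    indices.sort()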
Several elapsed times matter:

1. time to create the list
2. time to sort the list
3. time to use the sorted list
As stated, your question is underspecified, since it does not constrain (1.) or (3.), and we all know there are lies, damned lies, and micro-benchmarks. The initial (semi-sorted) order, the distribution of values, and the access pattern against the sorted list all matter for the final elapsed time.
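The initial-order point is easy to see for yourself: timsort looks for existing runs, so nearly-sorted input sorts much faster than shuffled input. A quick illustration, not a benchmark:

    import random
    from timeit import timeit

    n = 2_000_000
    ordered = list(range(n))        # one long pre-existing run
    shuffled = ordered.copy()
    random.shuffle(shuffled)

    # timsort is ~O(n) on the first, O(n log n) on the second
    print(timeit(lambda: sorted(ordered), number=3))
    print(timeit(lambda: sorted(shuffled), number=3))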
Some problems need only a subset of the full python3 semantics, and are amenable to numba optimization. You haven't told us enough for us to say whether that applies to your business problem.
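For example, if the indices can be pulled out into a numpy array first, a numba-compiled sort is one candidate. A sketch, assuming numba is installed; the function name is just for illustration:

    import numpy as np
    from numba import njit

    @njit
    def sort_indices(idx):
        idx.sort()      # ndarray.sort() compiles to native code in nopython mode
        return idx

    # the first call pays the JIT compilation cost; time the second call
    sort_indices(np.array([3, 1, 2], dtype=np.int64))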
EDIT
Timsort on a modern platform can easily sort 4 million items per second in the tuple form, somewhat less than that if lambda overhead is necessary. You didn't post timing data. You described a requirement to sort 700 K items per second on unknown hardware, and asserted that the posted code wasn't capable of that.

The posted code offered indices in sequential (sorted) order, which seemed odd, but I reproduced that aspect for tuple sorting in the code below.

Here is what I'm running, on a 2.9 GHz Intel Core i7 Mac laptop:
    #! /usr/bin/env python
    from time import time
    import random


    def elapsed(fn):
        """Decorator: report the wall-clock time of each call."""
        def print_elapsed(*args, **kw):
            t0 = time()
            ret = fn(*args, **kw)
            print(fn.__name__, '%.3f sec' % (time() - t0))
            return ret
        return print_elapsed


    @elapsed
    def get_values(k=2_000_000, base_val=42):
        # randint wants int bounds (3e6 is a float)
        return [dict(index=random.randint(0, 3_000_000), value=i + base_val + i % 10)
                for i in range(k)]


    @elapsed
    def get_tuples(dicts):
        return [(d['index'], d['value'])
                for d in dicts]


    @elapsed
    def get_indices(dicts):
        return [d['index']
                for d in dicts]


    @elapsed
    def sort_dicts(dicts):
        dicts.sort(key=lambda x: x['index'])


    @elapsed
    def sort_values(x, reverse=False):
        x.sort(reverse=reverse)


    if __name__ == '__main__':
        dicts = get_values()
        sort_dicts(dicts)

        tuples = get_tuples(dicts)
        sort_values(tuples)

        indices = get_indices(dicts)
        sort_values(indices)
Output for 2 M items:

    get_values 3.307 sec
    sort_dicts 2.121 sec
    get_tuples 1.355 sec
    sort_values 0.414 sec
    get_indices 0.715 sec
    sort_values 0.329 sec
Reducing the problem size to your stated 20 K items:

    get_values 0.034 sec
    sort_dicts 0.006 sec
    get_tuples 0.005 sec
    sort_values 0.001 sec
    get_indices 0.002 sec
    sort_values 0.001 sec
or even raising it tenfold to 200 K items, which starts to encounter cache misses:

    get_values 0.325 sec
    sort_dicts 0.105 sec
    get_tuples 0.111 sec
    sort_values 0.027 sec
    get_indices 0.064 sec
    sort_values 0.021 sec
it is hard to see how you could be encountering the slowness you describe. There must be some unseen aspect to the problem: you are running on a CPU with a slow clock rate, the target host's cache is small at some level, its DRAM is slow, or there is some aspect of the data you're sorting that you have not yet revealed to us.

The "populated with lists" part of your question is not apparent in the code you posted. You have not yet addressed whether techniques like cython or numba are relevant to your business problem. Maybe you do have a "slow sorting" technical issue, but what you have shared with us so far does not offer evidence of it.