I recently wrote some code that substantially improves scipy.stats.binom_test. The function creates arrays as large as its inputs, which causes memory errors when the inputs are on the order of 100 million. These arrays are unnecessary and are an artifact of porting the method from R. I changed this logic in the following PR: https://github.com/ryu577/scipy/pull/1/files.
To see how this unnecessary creation of arrays causes issues, run the following code:
from scipy.stats import binom_test
# Inputs on the order of 100 million trigger the memory error described above.
binom_test(100000000, 100000001, 0.5)
In the PR, I replaced the search for the value in an array with a binary search. This makes the method far more memory- and time-efficient: it goes from being unusable for inputs in the hundreds of millions to running almost instantly with negligible memory overhead.
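To illustrate the idea (this is a minimal sketch, not the code from the PR), the step that needs the array is counting how many outcomes on the far side of the mean have a probability no larger than that of the observed value. For x < n*p the binomial pmf only decreases from ceil(n*p) up to n, so that count can be found by binary search. The helper names log_pmf and count_upper_tail are hypothetical, the sketch assumes 0 < p < 1 and x < n*p, and it uses math.lgamma purely for illustration (it loses precision for very large n, where scipy's own pmf would be the better choice).

```python
import math


def log_pmf(k, n, p):
    """Log of the binomial pmf, via lgamma so large n does not overflow."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def count_upper_tail(x, n, p, rel_tol=1e-7):
    """Count i in [ceil(n*p), n] with pmf(i) <= pmf(x) * (1 + rel_tol).

    Hypothetical helper. It relies on the pmf being non-increasing on
    that range, so the qualifying i form a contiguous block ending at n.
    """
    threshold = log_pmf(x, n, p) + math.log1p(rel_tol)
    lo, hi = math.ceil(n * p), n + 1  # hi == n + 1 means "no such i"
    while lo < hi:
        mid = (lo + hi) // 2
        if log_pmf(mid, n, p) <= threshold:
            hi = mid       # mid qualifies; look for an earlier start
        else:
            lo = mid + 1   # mid is too probable; the block starts later
    return n + 1 - lo      # size of the block [lo, n]


# Small example that can be checked by hand: n = 10, p = 0.5, x = 3.
print(count_upper_tail(3, 10, 0.5))
```

Each call evaluates the pmf O(log n) times instead of building an array of n values, which is where the memory and time savings come from. The case x > n*p is the mirror image, searching the non-decreasing range from 0 to floor(n*p).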
I tested it, and it produces the same output as the original version across a variety of inputs.
However, this PR has not been getting any attention. I even sent an email about it to the scipy mailing list and got no response.
I'm committed to doing whatever it takes to get this change into scipy, but I'm lost as to the next steps. Is there anyone who has contributed to scipy who can guide me?