
I'm trying to optimize the overall load time of a web application written in Python. My application uses a lot of modules, some of which may or may not actually be needed for a given request.

Since page load time is an important factor in how end users perceive the quality of a site, I'm trying to reduce the impact of loading possibly unnecessary modules, and in particular the time (and memory) required to initialize globals that might not be needed at all.

Simply put, my goals are:

  1. To reduce module initialization time as much as possible (not CPU usage).
  2. To reduce the memory taken by unneeded global variables.

To illustrate, here's a trivial module example:

COMMON = set(('alice', 'has', 'cat', 'with', 'blue', 'eyes'))

Building the set for COMMON takes time; if COMMON is never used, that's wasted load time and memory.
Obviously for a single module/global, the cost is negligible, but what if you have 100 modules with 100 variables?

One approach to make this faster is to delay initialization like this:

__cache_common = None
def getCommon():
    global __cache_common
    # not initialized yet: build the set on first use
    if __cache_common is None:
        __cache_common = set(('alice', 'has', 'cat', 'with', 'blue', 'eyes'))
    # return the cached value
    return __cache_common

This saves load time and memory at the cost of a little extra CPU on each access.
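
The same pattern can be written once and reused for many globals. This is only a minimal sketch: the decorator name `lazy_global` and the getter `get_common` are hypothetical, not part of any library.

```python
import functools

def lazy_global(factory):
    """Hypothetical helper: call factory() on first use, then reuse the result."""
    cache = []  # stays empty until the first call

    @functools.wraps(factory)
    def getter():
        if not cache:
            cache.append(factory())
        return cache[0]
    return getter

@lazy_global
def get_common():
    # Nothing is built at import time; the set is created on the first call.
    return set(('alice', 'has', 'cat', 'with', 'blue', 'eyes'))
```

Each call after the first pays for one function call and one list check, so the access cost should be in the same range as the conditional version above.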

I've tried a few other techniques (see below), two of which are a bit faster than the simple caching above.

Is there another technique I could use to further reduce load time for modules and globals that might not be used on a given request?


Approaches I have tried so far (requires Python 2.6+):

from timeit import Timer

__repeat = 1000000
__cache = None

def getCache():
    return __cache

def getCacheTest():
    for i in range(__repeat):
        getCache()

def getLocal():
    return set(('alice', 'has', 'cat', 'with', 'blue', 'eyes'))

def getLocalTest():
    for i in range(__repeat):
        getLocal()

def getLazyIf():
    global __cache
    if __cache is None:
        __cache = getLocal()
    return __cache

def getLazyIfTest():
    for i in range(__repeat):
        getLazyIf()

def __realLazy():
    return __cache

def getLazyDynamic():
    global __cache, getLazyDynamic
    __cache = getLocal()
    getLazyDynamic = __realLazy
    return __cache

def getLazyDynamicTest():
    for i in range(__repeat):
        getLazyDynamic()

def getLazyDynamic2():
    global __cache, getLazyDynamic2
    __cache = getLocal()
    def __realLazy2():
        return __cache
    getLazyDynamic2 = __realLazy2
    return __cache

def getLazyDynamic2Test():
    for i in range(__repeat):
        getLazyDynamic2()

print sum(Timer(getCacheTest).repeat(3, 1)), getCacheTest, 'raw access'
print sum(Timer(getLocalTest).repeat(3, 1)), getLocalTest, 'repeat'
print sum(Timer(getLazyIfTest).repeat(3, 1)), getLazyIfTest, 'conditional'
print sum(Timer(getLazyDynamicTest).repeat(3, 1)), getLazyDynamicTest, 'hook'
print sum(Timer(getLazyDynamic2Test).repeat(3, 1)), getLazyDynamic2Test, 'scope hook'

With Python 2.7, I get these timings (the fastest lazy variant is the hook without the nested scope):

1.01902420559 <function getCacheTest at 0x012AE170> raw access
5.40701374057 <function getLocalTest at 0x012AE1F0> repeat
1.39493902158 <function getLazyIfTest at 0x012AE270> conditional
1.06692051643 <function getLazyDynamicTest at 0x012AE330> hook
1.15909591862 <function getLazyDynamic2Test at 0x012AE3B0> scope hook
Mat
Chameleon

1 Answer

An import statement executes the module, so you shouldn't be going around changing its semantics.

How about you just tuck your import statements inside the functions or methods that need them? That way they'll only happen when they're needed, not at application startup.
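
As a minimal sketch of a function-level import (the function name `parse_payload` is made up for illustration, and `json` stands in for whatever module is expensive in your application):

```python
def parse_payload(text):
    # Deferred import: json is loaded the first time this function runs,
    # not when the enclosing module is imported.
    import json
    return json.loads(text)
```

Requests that never call `parse_payload` never pay for importing `json`.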

Ditto for the globals-- turn them into class statics or something. Having lots of globals is bad style anyway.

But why is this even a problem? Are you really including so many modules that simply finding them slows things down, or are some of the included packages doing a lot of costly initialization (e.g., opening connections)? My money is on the second. If you have written the modules responsible for the slow-down, look into wrapping the initialization into appropriate constructors.
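
A rough sketch of what wrapping the initialization into a constructor could look like. The names `WordFilter` and `get_filter` are hypothetical, and the set from the question stands in for the costly setup:

```python
class WordFilter(object):
    """Hypothetical wrapper: the costly setup runs in __init__, not at import."""

    def __init__(self):
        self.common = set(('alice', 'has', 'cat', 'with', 'blue', 'eyes'))

    def is_common(self, word):
        return word in self.common

_filter = None

def get_filter():
    """Create the shared WordFilter on first use only."""
    global _filter
    if _filter is None:
        _filter = WordFilter()
    return _filter
```

Importing the module then only defines a class and a function; the data is built when some request first calls `get_filter()`.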

alexis
  • Doable, but imho a horrible practice if done in general - makes the code much harder to understand if all imports are dispersed through the whole file. – Voo Mar 05 '12 at 21:22
  • It depends; the flip side is encapsulation of dependencies. Knowing that all sql interaction is restricted to a single module, for example, helps you understand the code too. Or seeing `from foo import bletch` right before bletch is used. Anyway you can go halfway: Encapsulate imports a bit, but not into every function and method-- I was exaggerating a little. – alexis Mar 05 '12 at 22:18
  • Importing a module where it is used is not the best solution for frequently called functions, but it works (and is simple) for functions that are called once. **Globals in modules** are not "as bad", since each module is separate. As an experienced programmer I don't follow generalizations like **globals are always bad**; that is true where there is no separation (e.g. in Basic). After all, each object has "globals" too: they are more separated, but still shared. – Chameleon Mar 07 '12 at 19:23
  • Keep in mind that Python does not actually re-read a module that's already been loaded. Subsequent imports are a simple dictionary look-up. Still, I'm with you: I'm reluctant to put "import re" inside a tight loop. Just superstitious, maybe. – alexis Mar 07 '12 at 22:27
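
The dictionary look-up mentioned in the last comment goes through `sys.modules`. A small sketch to check this (the helper name `lookup` is made up, and `json` is just an example module):

```python
import sys
import json  # first import: the module is executed and cached in sys.modules

def lookup():
    # Repeated import: resolved from the sys.modules cache, not re-executed.
    import json as again
    return again
```

Every call to `lookup()` returns the same module object that is stored in `sys.modules['json']`.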