
I have a program that scrapes a website for data. I want to cache that data and reuse it, instead of loading it again, if it's only been a few minutes since it was last retrieved. I looked at Beaker, but I'm extremely new to caching and not sure if this is what I need. I also don't really understand what the CacheManager is, or why I only use "cache.get" instead of using both "cache.set" and "cache.get". I have included the script that I have been using to test with.

from beaker.cache import CacheManager
from beaker.util import parse_cache_config_options
import sched, time
from datetime import datetime

cache_opts = {
             'cache.type': 'file',
             'cache.data_dir': '../Beaker/tmp/cache/data',
             'cache.lock_dir': '../Beaker/tmp/cache/lock'
             }

cache = CacheManager(**parse_cache_config_options(cache_opts))
tmpl_cache = cache.get_cache('mytemplate', type='file', expire=5)

def get_results():
    # do something to retrieve data
    print 'hey'
    data = datetime.now()
    return data

def get_results2():
    return 'askdjfla;j'

s = sched.scheduler(time.time, time.sleep)
def get_time(sc):     
    results = tmpl_cache.get(key='gophers', createfunc=get_results)    
    results2 = tmpl_cache.get(key='hank', createfunc=get_results2)   
    print results,results2
    sc.enter(1, 1, get_time, (sc,))

s.enter(1, 1, get_time, (s,))
s.run()

Am I going about this the right way?

Lennart Regebro
Wallace

1 Answer


You are using only cache.get, and that is correct: if the key isn't found in the cache, cache.get calls the function you pass as createfunc to create the value. This becomes clearer and easier if you instead use the decorator API:

@cache.cache('gophers', expire=3600)
def get_results():
    # do something to retrieve data
    print 'hey'
    data = datetime.now()
    return data

@cache.cache('hank', expire=3600)
def get_results2():
    return 'askdjfla;j'

s = sched.scheduler(time.time, time.sleep)
def get_time(sc):     
    results = get_results()
    results2 = get_results2()
    print results,results2
    sc.enter(1, 1, get_time, (sc,))
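
To make the "get, or create if missing" behaviour concrete, here is a minimal standard-library sketch of the idea. It is not Beaker's implementation: the TinyCache class, its clock parameter, and the fake clock are all hypothetical names made up for illustration, and the expiry rule is a simplifying assumption.

```python
import time


class TinyCache:
    """Hypothetical in-memory get-or-create cache; not Beaker's code."""

    def __init__(self, expire, clock=time.time):
        self.expire = expire   # lifetime of an entry, in seconds
        self.clock = clock     # injectable so the example is easy to test
        self._store = {}       # key -> (value, time it was stored)

    def get(self, key, createfunc):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[1] < self.expire:
            return entry[0]    # still fresh: createfunc is not called
        value = createfunc()   # missing or expired: build the value...
        self._store[key] = (value, now)  # ...and store it for next time
        return value


calls = []


def get_results():
    calls.append("called")     # record each real (uncached) call
    return len(calls)


fake_now = [0.0]               # hypothetical fake clock for the demo
cache = TinyCache(expire=5, clock=lambda: fake_now[0])

first = cache.get("gophers", get_results)   # miss: calls get_results
second = cache.get("gophers", get_results)  # hit: reuses the stored value
fake_now[0] = 10.0                          # move past the expiry time
third = cache.get("gophers", get_results)   # expired: calls get_results again
```

Because the storing happens inside get, the caller never needs a separate "set" step, which is why the script in the question works with cache.get alone.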
Lennart Regebro
  • Yeah, that helps a lot. I have another question though: how do I know what type to use in the configuration options? When do you use "dbm" vs "file" vs "memory"? – Wallace Apr 30 '13 at 14:01
  • @Wallace: dbm vs file I don't know. Memory is if you don't want to store it for a long time, and have enough memory. – Lennart Regebro Apr 30 '13 at 19:06
  • File and memory will be sufficient for my needs, and I think I now have a good enough understanding of how to implement each. I was having a hard time understanding the basics of Beaker. They only briefly discuss how to use it in their documentation, so thank you for your help! – Wallace Apr 30 '13 at 23:11
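
As a rough illustration of the "file" vs "memory" choice discussed in the comments, the two option sets might look like the sketch below. The variable names memory_opts and file_opts are made up here, and the directory paths are simply the ones from the question; treat this as an assumption-laden sketch rather than a recommended configuration.

```python
# Assumed option sets for the two backends discussed in the comments.
memory_opts = {
    'cache.type': 'memory',  # kept in process memory, lost when the program exits
}

file_opts = {
    'cache.type': 'file',    # written to disk, so it can outlive the process
    'cache.data_dir': '../Beaker/tmp/cache/data',
    'cache.lock_dir': '../Beaker/tmp/cache/lock',
}

# Either dict would be passed the same way as in the question:
# cache = CacheManager(**parse_cache_config_options(memory_opts))
```

Only the options dict changes between the two; the cache.get and decorator calls shown earlier stay the same.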