
Short Version: In Django (ideally Django 1.7), I want to read certain database objects just once each day at 2 AM and cache them. After that, every MyModel.objects.get() call should be served from the cache instead of touching the database, until I manually refresh the cache at the next 2 AM window. I do not want to change every line that contains a query like MyModel.objects.get(), because that would mean editing 100+ lines and make the code more verbose.

Long Version:

I am working with a very large, old Django project that is not entirely sensible. Once each day (at 2 AM, after backups, during scheduled downtime), the Django application reads a set of protected database objects (rows, model instances), checks them for consistency, and then uses exactly those values until the next day. These rows may only be changed during the scheduled 2 AM downtime; changing them in the middle of the day would corrupt data and cause confusion.

I already wrote a pre_save hook that errors out if you try to modify these database objects outside the scheduled downtime window. I have also restricted access to these tables in the Django Admin Console to very few people.
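The guard looks roughly like this (a sketch, not the project's actual code: is_scheduled_downtime and the 02:00-02:59 window are assumptions):

```python
import datetime

def is_scheduled_downtime(now=None):
    """Hypothetical check: treat 02:00-02:59 local time as the window."""
    now = now or datetime.datetime.now()
    return now.hour == 2

def block_daytime_edits(sender, instance, **kwargs):
    """pre_save receiver: refuse writes outside the downtime window."""
    if not is_scheduled_downtime():
        raise RuntimeError(
            "%s rows may only be modified during the 2 AM window" % sender
        )

# Wired up in the app config (real Django code, not run here):
#   from django.db.models.signals import pre_save
#   pre_save.connect(block_daytime_edits, sender=WidgetType)
```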

But performance is poor, because the application actually hits the database every single time these objects are used, even though it should only need to hit the database once each day.

Sadly, there are over 100 lines of code that look up and then use these tables, like this:

# In views.py:
from myapp.models import WidgetType

# In literally over 100 places:
widget_t1 = WidgetType.objects.get(pk=1)
local_var.part_num = widget_t1.part_num

Every time the code says WidgetType.objects.get, I want it to use a cached version of the database object. I only want to refresh the cache manually once per day at 2 AM, for performance and as extra protection against data corruption.

I do not want to call the cache module explicitly at every database lookup, because that would require changing over 100 lines of code and would make the code more verbose.
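The pattern being asked for can be sketched in plain Python (hypothetical names throughout; a real implementation would subclass django.db.models.Manager and be attached as the model's default manager, with FakeQuerySource standing in for the real manager here):

```python
class CachingGetMixin:
    """Memoize get() results in process memory; clear_cache() at 2 AM."""
    _cache = {}

    def get(self, **kwargs):
        key = tuple(sorted(kwargs.items()))
        if key not in self._cache:
            # Only the first lookup per key reaches the database
            self._cache[key] = super().get(**kwargs)
        return self._cache[key]

    @classmethod
    def clear_cache(cls):
        """Call this from the 2 AM downtime job."""
        cls._cache.clear()


class FakeQuerySource:
    """Stand-in for a Django manager, counting 'database' hits."""
    db_hits = 0
    rows = {1: "sprocket", 2: "flange"}

    def get(self, **kwargs):
        FakeQuerySource.db_hits += 1
        return self.rows[kwargs["pk"]]


class CachedManager(CachingGetMixin, FakeQuerySource):
    pass


mgr = CachedManager()
mgr.get(pk=1)
mgr.get(pk=1)
print(FakeQuerySource.db_hits)  # 1 -- second call served from cache
```

Attaching such a manager as objects on the model would leave all 100+ WidgetType.objects.get(...) call sites unchanged.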

SerMetAla

1 Answer


django-cacheops gives you the option of automatic caching. Here is a sample configuration that automatically caches some DB reads and enables manual caching with .cache() for others:

CACHEOPS = {
    # Automatically cache any User.objects.get() calls for 15 minutes
    # This includes request.user or post.author access,
    # where Post.author is a foreign key to auth.User
    'auth.user': {'ops': 'get', 'timeout': 60*15},

    # Automatically cache all gets and queryset fetches
    # to other django.contrib.auth models for an hour
    'auth.*': {'ops': ('fetch', 'get'), 'timeout': 60*60},

    # Cache all queries to Permission
    # 'all' is just an alias for {'get', 'fetch', 'count', 'aggregate', 'exists'}
    'auth.permission': {'ops': 'all', 'timeout': 60*60},

    # Enable manual caching on all other models with default timeout of an hour
    # Use Post.objects.cache().get(...)
    #  or Tags.objects.filter(...).order_by(...).cache()
    # to cache particular ORM request.
    # Invalidation is still automatic
    '*.*': {'ops': (), 'timeout': 60*60},

    # And since 'ops' defaults to empty, the last line can also be written as
    # (commented out here, since a dict cannot repeat the '*.*' key):
    # '*.*': {'timeout': 60*60},
}
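For the once-a-day case in the question, the same settings could pin WidgetType specifically (a sketch: 'myapp.widgettype' assumes the app label is myapp, and invalidate_model is cacheops' documented invalidation helper):

```python
# settings.py -- sketch for the once-daily WidgetType case
CACHEOPS = {
    # Cache WidgetType gets for 24 hours; the 2 AM job can refresh early by
    # calling cacheops.invalidate_model(WidgetType) after the consistency check.
    'myapp.widgettype': {'ops': 'get', 'timeout': 60 * 60 * 24},
}
```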
kichik
  • Problem 1: That package requires Django 1.8+, and I would like to target Django 1.7. Problem 2: That package requires installing Redis, and I would rather not add a dependency. File-based caches on the local hard drive will be faster than hitting the database, which is non-local and sometimes under load. – SerMetAla Jul 05 '17 at 17:58
  • That's only for the next version. 3.2.1 still supports Django 1.7 -- https://pypi.python.org/pypi/django-cacheops – kichik Jul 05 '17 at 18:00
  • Redis is in-memory and you can install it locally. That should be much faster than reading files. – kichik Jul 05 '17 at 19:10