I'm currently trying to port an existing Google App Engine application from webapp2 to Django using djangoappengine.

Is there an equivalent, space-saving way to store the data using Django? GAE limits the amount of data a free user can store.

webapp2 model code

from google.appengine.ext import ndb


class TagTrend_refine(ndb.Model):
    tag = ndb.StringProperty()
    trendData = ndb.BlobProperty(compressed=True)

I know that TextField can store a large amount of text, but can it store the data using less space? Is using a BlobField possible?

An example of the data stored in trendData (as many as 24783 characters) is:

{"2008": "{\"nodes\": [{\"group\": 0, \"name\": \"ef-code-first\", \"degree\": 6}, {\"group\": 1, \"name\": \"gridview\", \"degree\": 6}, {\"group\": 2, \"name\": \"mvvm\", \"degree\": 6}, {\"group\": 1, \"name\": \"webforms\", \"degree\": 6}, {\"group\": 2, \"name\": \"binding\", \"degree\": 6}, {\"group\": 3, \"name\": \"web-services\", \"degree\": 6}, {\"group\": 2, \"name\": \"datagrid\", \"degree\": 6},...
LeonBrain

1 Answer

Django has no built-in way to store data compressed, but you can use the zlib module to compress data before saving it to the database.

Here's a sample implementation of such a field in Django:

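import base64
import zlib

from django.db import models


class CompressedTextField(models.TextField):

    def __init__(self, compress_level=6, *args, **kwargs):
        self.compress_level = compress_level
        super(CompressedTextField, self).__init__(*args, **kwargs)

    def from_db_value(self, value, expression, connection, *args):
        # Database -> Python (Django 1.8+): base64-decode, then decompress.
        if value is None:
            return value
        return zlib.decompress(base64.b64decode(value)).decode('utf-8')

    def get_prep_value(self, value):
        # Python -> database: compress, then base64-encode so a text column can hold it.
        value = super(CompressedTextField, self).get_prep_value(value)
        if value is None:
            return value
        data = zlib.compress(value.encode('utf-8'), self.compress_level)
        return base64.b64encode(data).decode('ascii')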

This field takes one extra parameter, compress_level, compared to a regular TextField:

class TagTrend(models.Model):

    tag = models.CharField(max_length=1024)

    # zlib offers compression levels 0-9
    #    0 is no compression
    #    9 is maximum compression
    trendData = CompressedTextField(compress_level=9)

As an example, the string 'a' * 1024 (which is 1024 bytes) takes only 17 bytes once compressed.
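The size comparison above can be checked with zlib alone, outside Django. This is a minimal sketch: the helper name `compressed_size` is made up for illustration, and the exact byte count may vary with the zlib version and compression level.

```python
import zlib


def compressed_size(text, level=9):
    """Return the number of bytes `text` occupies after zlib compression."""
    return len(zlib.compress(text.encode("utf-8"), level))


original = "a" * 1024
print(len(original.encode("utf-8")))  # size before compression, in bytes
print(compressed_size(original))      # size after compression, in bytes

# Round trip: decompressing gives back the original text.
packed = zlib.compress(original.encode("utf-8"), 9)
restored = zlib.decompress(packed).decode("utf-8")
assert restored == original
```

Highly repetitive input like this compresses far better than typical JSON, so real trendData values will see a smaller ratio.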

Do note that the limitation of such a field is that the data is stored compressed. Database queries that search or filter on this field therefore operate on the compressed value, not on the original text.
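One possible workaround, if you need to search the contents, is to filter in Python after loading and decompressing, at the cost of reading every row. The sketch below uses plain zlib with an in-memory list standing in for database rows; `rows` and `find_matching` are hypothetical names, not part of Django.

```python
import zlib


def find_matching(rows, needle):
    """Return the tags whose decompressed payload contains `needle`."""
    matches = []
    for tag, blob in rows:
        text = zlib.decompress(blob).decode("utf-8")
        if needle in text:
            matches.append(tag)
    return matches


# Each row is (tag, compressed payload), mimicking what would be stored.
rows = [
    ("c#", zlib.compress('{"name": "gridview"}'.encode("utf-8"))),
    ("wpf", zlib.compress('{"name": "mvvm"}'.encode("utf-8"))),
]
print(find_matching(rows, "gridview"))
```

This only scales to small tables. For anything larger, keeping the searchable attributes (such as the tag) in ordinary uncompressed fields is probably the simpler design.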

Derek Kwok
  • Hi, do the functions to_python and get_prep_value override the base class methods? So I could basically just save the object as normal, like t = TagTrend(tag='stuff', trendData='fafa'), and the compression will be handled by the overridden functions? – LeonBrain Feb 29 '16 at 16:24
  • Do you know why I get "Error -3 while decompressing data: incorrect header check" when I use bulk_create to save a list of objects? – LeonBrain Feb 29 '16 at 16:45
  • @LeonBrain Yes, the `to_python` and `get_prep_value` functions override their parent versions. You can save the objects normally, and compression happens automatically. I just tested the field again and haven't gotten any errors when I `bulk_create`. Could you post a code snippet of what's failing for you? – Derek Kwok Feb 29 '16 at 17:26
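
For what it's worth, zlib raises an error of this kind when asked to decompress bytes that were never zlib-compressed, for example plain text passed straight to `zlib.decompress`. Whether that is what happened in the `bulk_create` case is only a guess. The sketch below just reproduces the error class outside Django, using the 'fafa' value from the comment above.

```python
import zlib

plain = b"fafa"  # ordinary bytes that were never compressed

try:
    zlib.decompress(plain)
    failed = False
except zlib.error as exc:
    failed = True
    print("zlib.error:", exc)

# Properly compressed data round-trips without error.
assert zlib.decompress(zlib.compress(plain)) == plain
```

If a field ever calls decompress on the way into the database instead of on the way out, uncompressed input would hit this path.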