I run a crawler back end in my application that mines some websites for data every day.
For every website I crawl, I create an entity that stores a big list of string IDs.

- Roughly 2,000 IDs per entity.
- Around 1,000 entities per day (rough storage math below).
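To put that in perspective, here is my back-of-envelope estimate. The average ID length is a guess, not a measured figure, and I'm assuming Datastore stores the property name alongside every repeated value:

```python
# Rough estimate; AVG_ID_BYTES is a guess, not a measured figure.
AVG_ID_BYTES = 20
IDS_PER_ENTITY = 2000
ENTITIES_PER_DAY = 1000

raw_bytes_per_day = AVG_ID_BYTES * IDS_PER_ENTITY * ENTITIES_PER_DAY
print('%.0f MB/day raw' % (raw_bytes_per_day / 1024.0 / 1024.0))  # ~38 MB/day

# If Datastore really stores the property name with every repeated
# value, the on-disk footprint is a multiple of this raw figure.
```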
Right now I store them in a repeated `ndb.StringProperty()` with indexing disabled. After three days of running, it had consumed 70% of my Datastore storage quota.
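For reference, the current model looks roughly like this (the class and property names are made up for illustration):

```python
from google.appengine.ext import ndb

class CrawledSite(ndb.Model):
    # ~2,000 string IDs per entity, stored unindexed
    item_ids = ndb.StringProperty(repeated=True, indexed=False)
```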
What could be the next step? Storing them as compressed JSON?
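I'm thinking of something like the sketch below, assuming `ndb.JsonProperty` with `compressed=True` does what I think it does (serializes to JSON and zlib-compresses the value before writing):

```python
from google.appengine.ext import ndb

class CrawledSite(ndb.Model):
    # Stored as a single zlib-compressed JSON blob;
    # JsonProperty is never indexed, so no index overhead either.
    item_ids = ndb.JsonProperty(compressed=True)
```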
Or storing them as a blob and reading the blob back every time?
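A minimal sketch of that variant, using the App Engine Cloud Storage client library (`cloudstorage`) since the Blobstore Files API is deprecated; the bucket name and path convention here are placeholders:

```python
import json
import cloudstorage as gcs

def write_ids(site_key, ids):
    # One object per crawled site; '/my-bucket/ids/...' is a made-up path
    with gcs.open('/my-bucket/ids/%s.json' % site_key, 'w',
                  content_type='application/json') as f:
        f.write(json.dumps(ids))

def read_ids(site_key):
    with gcs.open('/my-bucket/ids/%s.json' % site_key) as f:
        return json.loads(f.read())
```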
Or something else entirely? Is there a better alternative?