
I run a crawler back-end in my application that mines several websites for data every day.

For every website I crawl, I create an entity that stores a big list of string IDs.

  • Approximately 2,000 IDs per entity.
  • Around 1,000 entities per day.

The way I do it right now is with a repeated, unindexed ndb.StringProperty().
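For reference, a minimal sketch of that model, assuming names like these (CrawledSite and string_ids are made up for illustration):

    from google.appengine.ext import ndb

    class CrawledSite(ndb.Model):
        # ~2,000 string IDs per entity; indexed=False skips index writes,
        # but each value is still stored as a separate repeated property.
        string_ids = ndb.StringProperty(repeated=True, indexed=False)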

After 3 days of running, it has consumed 70% of my Datastore storage quota.

What could be the next thing to do? Store them as compressed JSON?

Store them in the Blobstore and read the blob back each time?

Or is there some other alternative?

Jimmy Kane

1 Answer


Storing them as compressed JSON did the trick for me. Closing this.
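For anyone landing here later, ndb has built-in support for exactly this: JsonProperty with compressed=True serializes the value to JSON and gzips it before writing, which is far more compact than a repeated StringProperty. A sketch under the same assumed names as above:

    from google.appengine.ext import ndb

    class CrawledSite(ndb.Model):
        # The whole list is serialized to JSON and gzip-compressed
        # before being written as a single blob value.
        string_ids = ndb.JsonProperty(compressed=True)

    # Usage: assign a plain Python list; ndb handles (de)serialization.
    site = CrawledSite(string_ids=['id-1', 'id-2', 'id-3'])
    site_key = site.put()

    # Reading it back yields the original list.
    ids = site_key.get().string_ids

Note the trade-off: the property is opaque to queries, but that matches this use case since the IDs were unindexed anyway.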

Jimmy Kane