1

Suppose I have the following GQL database,

class Signatories(db.Model):
    name = db.StringProperty()
    event = db.StringProperty()

This database holds information regarding events that people have signed up for. Say I have the following entries in the database in the format (event_name, event_desc): (Bob, TestEvent), (Bob, TestEvent2), (Fred, TestEvent), (John, TestEvent). But the dilemma here is that I cannot just aggregate all of Bob's events into one entity because I'd like to Query for all the people signed up for a specific event and also I'd like to add such entries without having to manually update the entry every single time.

How could I count the number of distinct strings given by a GQL Query in Python (in my example, I am specifically trying to see how many people are currently signed up for events)?

I have tried using the old mcount = db.GqlQuery("SELECT name FROM Signatories").count(), however this of course returns the total number of strings in the list, regardless of the uniqueness of each string.

I have also tried using count = len(member), where member = db.GqlQuery("SELECT name FROM Signatories"), but unfortunately, this only returns an error.

Diafotismos
  • 109
  • 1
  • 10

1 Answers1

1

You can't - at least not directly. (By the way you don't have a GQL database).

If you have a small number of items, then fetch them into memory, and use a set operation to produce the unique set and then count

If you have larger numbers of entities that make in memory filtering and counting problematic then your strategy will be to aggregate the count as you create them,

e.g.

create a separate entity each time you create an event that has the pair of strings as the key. This way you will only have one entity the data store representing the specific pair. Then you can do a straight count.

However as you get large numbers of these entities you will need to start performing some additional work to count them as the single query.count() will become too expensive. You then need to start looking at counting strategies using the datastore.

Tim Hoffman
  • 12,976
  • 1
  • 17
  • 29
  • Oh, I used the wrong example on accident, so I may have mislead you. I wouldn't be able to make these aggregated lists (or could I?). Would it be possible to convert what the Query spits out into a list and then use len() with that (or is that not possible)? In any case, thanks for your help. – Diafotismos Feb 13 '16 at 03:02
  • Sure you can, do a project query and only return the two fields your interested in. then use list comprehension to turn the result set in the set of tuples, and then you will have a set of unique values which you can use len() on. – Tim Hoffman Feb 13 '16 at 04:54
  • `len(set(i.name for i in db.GqlQuery("SELECT name FROM Signatories")))` – Tim Hoffman Feb 13 '16 at 04:57
  • I thought you meant distinct pairs of strings. – Tim Hoffman Feb 13 '16 at 04:59
  • But the simplistic approach like this won't scale and then you still have think about how accurate you need your count, and then look at different performant counting strategies. – Tim Hoffman Feb 13 '16 at 05:00
  • Ah, it works! My website will really only be serving a very small audience so your solution works out well for me. Thank you for your help! – Diafotismos Feb 13 '16 at 17:28