0

Let's say we have baskets of fruits. How to filter out those baskets that contains all fruits in a given basket?

In this document https://docs.djangoproject.com/en/dev/ref/models/querysets/#in 'in' method seems will return any basket which contains any of the given fruit. Is there any "set_contains" method to be used for filtering ? Such as Basket.objects.filter(fruit_set_containsAll=fruitSet) ?

Edit: I found that Q() is not working as expected. I'll post the test here:

class Basket(models.Model):
    weight = models.FloatField(default=1)
class Fruitname(models.Model):
    name = models.CharField(max_length=32)
class Fruit(models.Model):
    ofkind = models.ForeignKey(Fruitname, on_delete=models.CASCADE)  
    inbasket = models.ForeignKey(Basket, on_delete=models.CASCADE)  
    weight = models.FloatField(default=1)

the database is set as 'apple' only in 1 basket, 'pear' in 2 baskets, and 'banana' in 3 basket:

print Basket.objects.all()
print Fruitname.objects.all()
[<Basket: id 31 has 4 fruits 
    1. id - 53 apple(28), weight 1.00
    2. id - 54 apple(28), weight 2.00
    3. id - 55 apple(28), weight 3.00
    4. id - 62 banana(30), weight 10.00
>, <Basket: id 32 has 2 fruits 
    1. id - 56 pear(29), weight 4.00
    2. id - 57 banana(30), weight 5.00
>, <Basket: id 33 has 4 fruits 
    1. id - 58 pear(29), weight 6.00
    2. id - 59 banana(30), weight 7.00
    3. id - 60 pear(29), weight 8.00
    4. id - 61 pear(29), weight 9.00
>]
[<Fruitname: apple(28)>, <Fruitname: pear(29)>, <Fruitname: banana(30)>]

If I try to query with 'apple' and 'banana', it gives empty set !!

print Basket.objects.filter(Q(fruit__ofkind__name__in=['apple'])&Q(fruit__ofkind__name__in=['banana'])).distinct()
[]

similarly,

print Basket.objects.filter(Q(fruit__ofkind__name__in=['banana'])&Q(fruit__ofkind__name__in=['pear'])).distinct()
[]

Here's how I use 'in' to filter, it is not what I need which is supposed to be an empty set.

print Basket.objects.filter(Q(fruit__ofkind__name__in=['apple','pear'])).distinct()
[<Basket: id 31 has 4 fruits 
    1. id - 53 apple(28), weight 1.00
    2. id - 54 apple(28), weight 2.00
    3. id - 55 apple(28), weight 3.00
    4. id - 62 banana(30), weight 10.00
>, <Basket: id 32 has 2 fruits 
    1. id - 56 pear(29), weight 4.00
    2. id - 57 banana(30), weight 5.00
>, <Basket: id 33 has 4 fruits 
    1. id - 58 pear(29), weight 6.00
    2. id - 59 banana(30), weight 7.00
    3. id - 60 pear(29), weight 8.00
    4. id - 61 pear(29), weight 9.00
>]

The only way that is working properly is chaining with filter:

Basket.objects.filter(fruit__ofkind__name__in=['apple']).filter(fruit__ofkind__name__in=['banana']).distinct()
[<Basket: id 31 has 4 fruits 
    1. id - 53 apple(28), weight 1.00
    2. id - 54 apple(28), weight 2.00
    3. id - 55 apple(28), weight 3.00
    4. id - 62 banana(30), weight 10.00
>]

These code is tested with Django 1.9.4 Any explaining? I would undo the accepted answer for the moment.

wanyancan
  • 370
  • 3
  • 8
  • by contains all the given fruits, do you refer to the name ? I mean, what do you have, fruit instances or names? – levi Mar 30 '16 at 04:57
  • Just a side note that if you are using postgres db there is an array field that has the operation you are describing called `contains`. Probably won't work for you as you seem to need an array of objects, but for an array of primitive types that would be a good option https://docs.djangoproject.com/en/1.9/ref/contrib/postgres/fields/#contains – serg Mar 30 '16 at 05:42
  • @levi Let's say given a basket, find all the other baskets containing at least the same kind of fruits. So it's like that basket.fruit_set.all() – wanyancan Mar 30 '16 at 10:58
  • @serg It seems that operation is specific to a database implementation ? I guess the performance won't have much difference in how to write the code in Django. It's up to the database then. – wanyancan Mar 30 '16 at 11:00

3 Answers3

1

Not sure if this is the most performant way to go, but at least it should translate to one (big, ugly, nested) sql transaction (once the final baskets queryset is evaluated):

baskets = Basket.objects.all()
for fruit in fruits:
    baskets = baskets.filter(id__in=fruit.basket.all())

A more elegant (and possibly more performant) way could be tried as follows composing a query using Q objects (building on Dave Webb's answer to another question):

queries = [Q(id__in=fruit.basket.all()) for fruit in fruits]

query = Q()

# AND the Q object with the ones in the list
for item in queries:
    query &= item

baskets = Basket.objects.filter(query)
Community
  • 1
  • 1
user2390182
  • 72,016
  • 6
  • 67
  • 89
1

By printing the raw SQL of the Q & operations, I've found the reason why Q is working as this.

from django.db import connection
print connection.queries
u'SELECT DISTINCT "market_basket"."id", "market_basket"."weight" FROM "market_basket" INNER JOIN "market_fruit" ON ("market_basket"."id" = "market_fruit"."inbasket_id") INNER JOIN "market_fruitname" ON ("market_fruit"."ofkind_id" = "market_fruitname"."id") WHERE ("market_fruitname"."name" IN (\'apple\') AND "market_fruitname"."name" IN (\'banana\')) LIMIT 21'

the key problem is WHERE clause will not be satisfied in a single condition when queries is used in a single filter. It is actually looking for a fruitname is both in ['apple'] and ['banana'] which is not possible. What is needed is to find (those fruits that have fruitname of 'apple') or (those fruits that have fruitname of 'banana')

Currently the only viable solution is to chain filters.

wanyancan
  • 370
  • 3
  • 8
0

To get a Basket with all available Fruit instances in it, you can do something like this:

from django.db.models import Count

# first get all PKs of fruits
fruit_pk_list = Fruit.objects.value_list('id', flat=True)

# Then get filter the basket with all fruits using annotate and Count
baskets = Basket.objects.annotate(
    num_fruit=Count('fruit')).filter(num_fruit=len(fruit_pk_list))
v1k45
  • 8,070
  • 2
  • 30
  • 38
  • Not only have same number of fruits, but also has all the specified fruits with given ids. – wanyancan Mar 30 '16 at 10:15
  • It filters baskets with number of fruits equal to all fruits in model. Have to tried running it? – v1k45 Mar 30 '16 at 10:18
  • I think there's some misunderstanding to my question. I mean given a subset of all fruits, find a basket containing these fruits in the subset. I do not mean all the fruits as in Fruit.objects.all() – wanyancan Mar 30 '16 at 10:47
  • add one more filter to `Basket` with `id__in` filter.. something like `.filter(id__in=fruit_pk_list)` this will first find baskets which match the number of fruits then check if the fruits match the supplied ids. PS: `fruit_pk_list` here is list of fruit pk of your choice and filter, not all. – v1k45 Mar 30 '16 at 10:56
  • I am aware of id__in. But it will return all the baskets that containing any given id in the fruit_pk_list. I want baskets containing all the given fruit ids. – wanyancan Mar 30 '16 at 11:02
  • Not with `Count()`. The `num_fruit` ensures that the baskets have exact number of fruits you specified to be. It will only return baskets will __ALL__ fruits in it. – v1k45 Mar 30 '16 at 11:06
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/107726/discussion-between-wanyancan-and-v1k45). – wanyancan Mar 30 '16 at 11:17
  • See my new test result. 'in' is not what I want. – wanyancan Mar 31 '16 at 02:08