1

I'm using Django and am getting an error that says "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'".

My database structure is approximately as follows:

    class ImageTag(models.Model):
        start_image = models.ForeignKey(Image)
        stop_image = models.ForeignKey(Image)
        tag = models.TextField(null=True)

    class Image(models.Model):
        [things we don't really care about right now]

What I'm trying to do is get the unique Image objects (note: not values, I need the actual objects) that occur in the start_image field of all ImageTag objects. How I'm doing this:

    tag_match = [filtering all tags based on a query]
    ids = tag_match.values('start_image').distinct()  #get the distinct image ids
    images = Image.objects.filter(id__in=ids)   #get the actual image objects

This was the easiest and most straightforward way I could see to get the unique Image objects that have an ImageTag object associated with it, but my database backend doesn't support it. I've seen a few suggestions as to how to get around this using raw SQL, but I'd really like to avoid that if possible because the differences between our dev setups and our production setup (why this is a problem in the first place) make it really hard to know what actually will and won't work.

Some kind of Django workaround to this problem would be most appreciated.

solidparallel
  • 81
  • 1
  • 5
  • What version are you using? Also, what version of Python are you using? – jrd1 Feb 11 '14 at 17:59
  • Our server is running MySQL version 5.1.69. I'm significantly less certain about the python version, but am confident that it is at least 2.6. – solidparallel Feb 11 '14 at 18:02
  • OK. What about the _version_ of Django you're using? – jrd1 Feb 11 '14 at 18:03
  • Not knowing anything about any of those other technologies, could you simply construct a VIEW in mysql, and then select from that? – Strawberry Feb 11 '14 at 18:04

2 Answers2

0

Here is another way of getting distinct Image objects. Please try and let know if this works.

class ImageTag(models.Model):
    start_image = models.ForeignKey(Image, related_name="start_image_tags")
    stop_image = models.ForeignKey(Image)
    tag = models.TextField(null=True)

# Fetch only those Image objects which have a related ImageTag
images = Image.objects.exclude(start_image_tags=None).distinct()

# Fetch Image objects for filtered ImageTag query
tag_match = ImageTag.objects.filter(tag="any_filter")
images = Image.objects.filter(start_image_tags__in=tag_match).distinct()
Sunny Nanda
  • 2,362
  • 1
  • 16
  • 10
  • This works to get all of the distinct images, but is there a way to get the distinct images that are in my filtered "tag_match" QuerySet of ImageTag objects that match my search? – solidparallel Feb 11 '14 at 18:29
  • Added another query with filtered `ImageTag`. Please try the answer now. – Sunny Nanda Feb 11 '14 at 18:42
0

The problem comes from MySQL itself, not from Python nor Django - you'd get the same problem in PHP FWIW : MySQL - This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery

The workaround in Django is pretty straightforward: instead of a ValueQuerySet (which will result in a SQL subquery), pass a plain list,ie:

tag_match = [filtering all tags based on a query]
ids = list(tag_match.values_list('start_image', flat=True).distinct())  
images = Image.objects.filter(id__in=ids)  
Community
  • 1
  • 1
bruno desthuilliers
  • 75,974
  • 6
  • 88
  • 118
  • Unfortunately the list() operator makes this too slow to scale for the 500 million images that we have on our server. – solidparallel Feb 13 '14 at 17:57
  • What about using some _real_ SQL database then ? Like, say, PostgreSQL ? – bruno desthuilliers Feb 13 '14 at 19:31
  • I'm a student working in a lab, so I do not have control over what our server is using. I am aware that this would all be much simpler using PostgreSQL, but that is not a viable solution to my problem at this juncture. – solidparallel Feb 18 '14 at 17:33
  • Actually I was wrong in my time evaluation. The list() operator in this case will time scale only with the number of tags, which is considerably smaller than the number of images, which makes this answer work. Sorry for my confusion, and thank you for your answer! – solidparallel Feb 18 '14 at 17:45