I've got a website that shows photos, and new photos are constantly being added. On the home page, which lists the most recently added photos, people are seeing duplicates from one page to the next.

I'm not entirely sure how to approach this problem, but this is basically what's happening:

  • Home page displays latest 20 photos [0:20]
  • User scrolls (meanwhile photos are being added to the db)
  • User loads next page (through ajax)
  • Page displays photos [20:40]
  • User sees duplicate photos because the photos added to the top of the list pushed them down into the next page
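
The steps above can be reproduced without a database. The following is a minimal sketch in plain Python, not the site's actual code: `page_by_offset` is a hypothetical helper that applies the same slice arithmetic to a list of ids, and the photo counts are made up for illustration.

```python
def page_by_offset(ids_newest_first, page_num, per_page=20):
    """Return one page using offset-style slicing, as in [0:20], [20:40]."""
    start = (page_num - 1) * per_page
    return ids_newest_first[start:start + per_page]


# Hypothetical data: 100 existing photos with ids 1..100, newest first.
photos = list(range(100, 0, -1))
first_page = page_by_offset(photos, 1)   # ids 100 down to 81

# Five photos are uploaded while the user is still looking at page 1.
photos = list(range(105, 100, -1)) + photos
second_page = page_by_offset(photos, 2)  # ids 85 down to 66

# The five newest uploads pushed five already-seen photos onto page 2.
duplicates = sorted(set(first_page) & set(second_page))
print(duplicates)  # [81, 82, 83, 84, 85]
```

The number of duplicates equals the number of photos added between the two requests, which is why the problem gets worse on a busy site.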

What is the best way to solve this problem? I think I need to somehow cache the queryset on the user's session, maybe? I don't know much about caches, so a step-by-step explanation would be invaluable.

Here is the function that gets a new page of images:

def get_images_paginated(query, origins, page_num):
    args = None
    queryset = Image.objects.all().exclude(hidden=True).exclude(tags__isnull=True)
    per_page = 20
    page_num = int(page_num)
    if origins:
        origins = [Q(origin=origin) for origin in origins]
        args = reduce(operator.or_, origins)
        queryset = queryset.filter(args)

    if query:
        images = watson.filter(queryset, query)
    else:
        images = watson.filter(queryset, query).order_by('-id')
    amount = images.count()
    images = images.prefetch_related('tags')[(per_page*page_num)-per_page:per_page*page_num]

    return images, amount

The view that uses the function:

def get_images_ajax(request):
    if not request.is_ajax():
        return render(request, 'home.html')

    query = request.POST.get('query')
    origins = request.POST.getlist('origin')
    page_num = request.POST.get('page')

    images, amount = get_images_paginated(query, origins, page_num)
    # Divide by a float so that ceil() rounds up under Python 2 as well.
    pages = int(math.ceil(amount / 20.0))
    last_page = int(page_num) >= pages
    context = {
        'images':images,
        'last_page':last_page,
    }

    return render(request, '_images.html', context)
davegri
  • Your function is not a Django view. It does not take an HttpRequest and does not return an HttpResponse, so how is your AJAX talking to this function? – doniyor Oct 08 '15 at 07:58
  • My mistake, that's not the view but just the main function that gets the images and is used by different views. – davegri Oct 08 '15 at 08:01
  • Then please show that view. – doniyor Oct 08 '15 at 08:02
  • The view is just a wrapper for this function; it's separated out for modularity. – davegri Oct 08 '15 at 08:04
  • OK, no problem, I just wanted to show you the way of caching. – doniyor Oct 08 '15 at 08:08
  • And you should use Django's built-in pagination, which is much cleaner and is the recommended way. No need to reinvent the wheel. – doniyor Oct 08 '15 at 08:15
  • Using Django's pagination will cause prefetch_related('tags') to prefetch tags for the whole query, which can be very slow. This way Django adds a LIMIT to the SQL query. – davegri Oct 08 '15 at 08:23

1 Answer

One approach you could take is to send the oldest ID that the client currently has (i.e., the ID of the last item in the list currently) in the AJAX request, and then make sure you only query older IDs.

So get_images_paginated is modified as follows:

def get_images_paginated(query, origins, page_num, last_id=None):
    args = None
    queryset = Image.objects.all().exclude(hidden=True).exclude(tags__isnull=True)
    if last_id is not None:
        queryset = queryset.filter(id__lt=last_id)
    ...
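
To see why filtering on the last seen ID avoids duplicates, here is a minimal sketch in plain Python rather than the ORM. `page_older_than` is a hypothetical stand-in for the `id__lt` filter followed by a slice, and the data is invented. It assumes the list is ordered by descending id, which in the question's code is only the case when no search query is given, and it assumes each request takes the first `per_page` rows of the filtered set instead of also offsetting by page number (the elided part of the function above may do this differently).

```python
def page_older_than(ids_newest_first, last_id=None, per_page=20):
    """Return the next `per_page` ids strictly older than `last_id`."""
    if last_id is not None:
        # Plays the role of queryset.filter(id__lt=last_id).
        ids_newest_first = [i for i in ids_newest_first if i < last_id]
    # Always take from the start of the filtered list, not from an offset.
    return ids_newest_first[:per_page]


# Hypothetical data: 100 existing photos with ids 1..100, newest first.
photos = list(range(100, 0, -1))
first_page = page_older_than(photos)

# Five photos are uploaded while the user is still looking at page 1.
photos = list(range(105, 100, -1)) + photos
second_page = page_older_than(photos, last_id=first_page[-1])

overlap = set(first_page) & set(second_page)
print(len(overlap))  # 0
```

Because the second request is anchored to an id instead of a position, newly added photos at the top of the list cannot shift older ones into the next page.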

You would need to send the last ID in your AJAX request, and pass this from your view function to get_images_paginated:

def get_images_ajax(request):
    if not request.is_ajax():
        return render(request, 'home.html')

    query = request.POST.get('query')
    origins = request.POST.getlist('origin')
    page_num = request.POST.get('page')
    # Get last ID. Note you probably need to do some type casting here.
    last_id = request.POST.get('last_id', None)

    images, amount = get_images_paginated(query, origins, page_num, last_id)
    ...
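
The type casting mentioned in the comment could look something like the sketch below. `parse_last_id` is a hypothetical helper, not part of Django; it assumes that a missing, empty, non-numeric, or non-positive value should simply be treated as "no last ID", so the first page is returned.

```python
def parse_last_id(raw):
    """Convert a raw POST value to a positive int, or None if absent or invalid."""
    if raw is None:
        return None
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Empty strings, words, and decimals such as "12.5" end up here.
        return None
    return value if value > 0 else None
```

In the view this would be used as `last_id = parse_last_id(request.POST.get('last_id'))`, so that `get_images_paginated` only ever receives an integer or `None`.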

As @doniyor says, you should use Django's built-in pagination in conjunction with this logic.

solarissmoke