My model looks like this:

class ArticleText(models.Model):
    article = models.OneToOneField(Article, on_delete=models.CASCADE, related_name="article_text")
    text = models.TextField()
    indexed_by_es = models.BooleanField(default=False, db_index=True)
    indexed_by_solr = models.BooleanField(default=False, db_index=True)

Article is an existing model, and I use ArticleText to extend it.

The slow query is this:

articles = Article.objects.filter(Q(article_text=None))[0:10]

There are about 10,000 articles in my database. How can I make this query faster?

Dillion Wang
  • How do you check whether it is `None`? If you do this in Django, itself, then each time you will *make* a request. Check `article_id` instead. Anyway, please provide *how* you check it right now (or give some context what you aim to do). – Willem Van Onsem Sep 02 '18 at 13:35
  • Sorry... ... I accidentally posted the question before I finished it... I will repost it later QAQ – Dillion Wang Sep 02 '18 at 13:38
  • What query does this generate? Can you print the result of `str(Article.objects.filter(Q(article_text=None))[0:10].query)`. – Willem Van Onsem Sep 02 '18 at 13:55
  • It generates this: `SELECT main_article.id, main_article.section_id, main_article.title, main_article.publish_time, main_article.image1_url, main_article.image2_url, main_article.image3_url, main_article.content FROM main_article LEFT OUTER JOIN external_data_access_articletext ON (main_article.id = external_data_access_articletext.article_id) WHERE external_data_access_articletext.id IS NULL LIMIT 10`. (`main` and `external_data_access` are both Django apps) – Dillion Wang Sep 02 '18 at 14:30
  • Seems like the outer join is the cause... – Dillion Wang Sep 02 '18 at 14:34

1 Answer

Updated

Thanks to @Sachin Kukreja.

Using articles = Article.objects.filter(article_text=None).values('id')[0:10] keeps the generated SQL from selecting columns that are not needed. It selects only the id, as the raw query in my previous answer does, but in a neater way.
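To illustrate what changes at the SQL level, here is a minimal sketch using Python's built-in sqlite3 module instead of Django. The table names come from the query in the comments; the reduced column set, the sample data, the in-memory SQLite database, and the ORDER BY (added only to make the result deterministic) are assumptions for illustration, not the real schema.

```python
import sqlite3

# In-memory stand-in for the real database (assumption: SQLite, simplified columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_article (
        id INTEGER PRIMARY KEY,
        title TEXT,
        content TEXT
    );
    CREATE TABLE external_data_access_articletext (
        id INTEGER PRIMARY KEY,
        article_id INTEGER UNIQUE REFERENCES main_article(id),
        text TEXT,
        indexed_by_es INTEGER NOT NULL DEFAULT 0
    );
""")

# 20 articles; only the even-numbered ones get an ArticleText row.
conn.executemany(
    "INSERT INTO main_article (id, title, content) VALUES (?, ?, ?)",
    [(i, f"title {i}", "x" * 1000) for i in range(1, 21)],
)
conn.executemany(
    "INSERT INTO external_data_access_articletext (article_id, text) VALUES (?, ?)",
    [(i, f"text {i}") for i in range(2, 21, 2)],
)

# Shared join and filter; ORDER BY is an addition so both queries pick the same rows.
JOIN = """
    FROM main_article AS a
    LEFT OUTER JOIN external_data_access_articletext AS b
        ON a.id = b.article_id
    WHERE b.id IS NULL
    ORDER BY a.id
    LIMIT 10
"""

# Roughly what filter(article_text=None) does: every selected column is fetched.
wide_rows = conn.execute("SELECT a.id, a.title, a.content" + JOIN).fetchall()
# Roughly what adding .values('id') does: only the primary key is fetched.
narrow_rows = conn.execute("SELECT a.id" + JOIN).fetchall()

wide_ids = [row[0] for row in wide_rows]
narrow_ids = [row[0] for row in narrow_rows]
print(wide_ids == narrow_ids)
```

Both statements find the same articles; the narrow one simply transfers less data per row, which matters when a column such as content is large.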


Previous Answer

Thanks to @Willem Van Onsem's comment, I found where the issue comes from: the query generated by Django selects many columns that I do not need. So I wrote a raw query myself:

articles = Article.objects.raw(''' select a.id from main_article as a left outer join external_data_access_articletext as b on a.id = b.article_id where (b.id is null) or (b.indexed_by_es = false) limit 10''')

It is faster. I think this might be a bug in Django.
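Note that the raw query above also matches articles whose ArticleText row exists but has indexed_by_es set to false, so it can return more rows than filter(article_text=None). The sketch below shows the difference on a tiny in-memory SQLite database; the three sample rows, the reduced column set, the ORDER BY, and the use of 0 in place of false are assumptions for illustration only.

```python
import sqlite3

# In-memory stand-in for the real tables (assumption: SQLite, simplified columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_article (
        id INTEGER PRIMARY KEY,
        title TEXT
    );
    CREATE TABLE external_data_access_articletext (
        id INTEGER PRIMARY KEY,
        article_id INTEGER UNIQUE REFERENCES main_article(id),
        text TEXT,
        indexed_by_es INTEGER NOT NULL DEFAULT 0
    );
""")

conn.executemany(
    "INSERT INTO main_article (id, title) VALUES (?, ?)",
    [(1, "no text row"), (2, "text, not indexed"), (3, "text, indexed")],
)
conn.executemany(
    "INSERT INTO external_data_access_articletext "
    "(article_id, text, indexed_by_es) VALUES (?, ?, ?)",
    [(2, "body 2", 0), (3, "body 3", 1)],
)

BASE = """
    SELECT a.id
    FROM main_article AS a
    LEFT OUTER JOIN external_data_access_articletext AS b
        ON a.id = b.article_id
    WHERE {condition}
    ORDER BY a.id
"""


def ids(condition):
    """Return the article ids matched by the given WHERE condition."""
    return [row[0] for row in conn.execute(BASE.format(condition=condition))]


# Condition generated for filter(article_text=None).
missing_only = ids("b.id IS NULL")
# Condition used in the raw query (0 stands in for false here).
missing_or_unindexed = ids("(b.id IS NULL) OR (b.indexed_by_es = 0)")

print(missing_only, missing_or_unindexed)
```

If the extra condition is wanted in the ORM version as well, it could presumably be written as filter(Q(article_text=None) | Q(article_text__indexed_by_es=False)); that has not been checked against this project.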

Dillion Wang