I noticed something strange with the total_entries count for my results.

When indexing my documents, I see that 8027 documents are indexed per language:

using config file 'myapp/config/production.sphinx.conf'...
indexing index 'variant_nl_core'...
collected 8027 docs, 2.0 MB
collected 16124 attr values
sorted 0.0 Mvalues, 100.0% done
sorted 7.4 Mhits, 100.0% done
total 8027 docs, 2007375 bytes
total 15.138 sec, 132600 bytes/sec, 530.23 docs/sec
indexing index 'variant_nl_delta'...
collected 0 docs, 0.0 MB
collected 0 attr values
sorted 0.0 Mvalues, 100.0% done
total 0 docs, 0 bytes
total 0.010 sec, 0 bytes/sec, 0.00 docs/sec
skipping non-plain index 'variant_nl'...
indexing index 'variant_fr_core'...
collected 8027 docs, 2.0 MB
collected 16124 attr values
sorted 0.0 Mvalues, 100.0% done
sorted 6.6 Mhits, 100.0% done
total 8027 docs, 2048826 bytes
total 16.959 sec, 120808 bytes/sec, 473.31 docs/sec
indexing index 'variant_fr_delta'...
collected 0 docs, 0.0 MB
collected 0 attr values
sorted 0.0 Mvalues, 100.0% done
total 0 docs, 0 bytes
total 0.013 sec, 0 bytes/sec, 0.00 docs/sec
skipping non-plain index 'variant_fr'...
total 64311 reads, 0.045 sec, 1.2 kb/call avg, 0.0 msec/call avg
total 209 writes, 0.097 sec, 789.4 kb/call avg, 0.4 msec/call avg

When I search with nil as the query, I'd expect all 8027 documents to match.

r = Variant.search nil

But when I check the number of matching entries with total_entries, I actually get more results:

r.total_entries
 => 15054 

How is this possible? What am I missing?

UPDATE 23/09/2015

As suggested by Eugene, multiple indices are the cause of my issue:

'total_entries' counts the number of documents found in all indices (_core and _delta).

Now I need a way to know how many instances of my model ('Variant') correspond to the Sphinx documents.

LapinLove404
  • Is there any `joins` used in the `.search` method of the Variant model? – MrYoshiji Sep 21 '15 at 13:35
  • Also, which version of Thinking Sphinx are you using? – pat Sep 22 '15 at 06:27
  • .search method is the search method from Thinking Sphinx. http://www.rubydoc.info/github/pat/thinking-sphinx/ThinkingSphinx#search-class_method – LapinLove404 Sep 22 '15 at 11:22
  • I am on thinking-sphinx (2.0.13) – LapinLove404 Sep 22 '15 at 11:23
  • I'm not sure what the reason is, but it's worth noting that 2.0.13 is more than three years old. A *lot* has changed in TS since then - although granted, there's a bit involved to upgrade to v3, which is covered in the docs: http://freelancing-gods.com/thinking-sphinx/upgrading.html – pat Sep 22 '15 at 18:51
  • I do use v3 in recent projects. This project, however, uses v2 and upgrading is not really an option for the moment. – LapinLove404 Sep 23 '15 at 10:49

1 Answer

As I see from your index log, you have 2 indexes, 'variant_nl' and 'variant_fr', and each contains 8027 documents. So in total you have 15054 documents.

Eugene Soldatov
  • I thought of that, but since 8027 * 2 = 16054, I thought there was something more to it. However, I did another test with 'r = Variant.search nil, index: "variant_nl_core"' and then I do get 8027 as 'total_entries'. Multiple indices for the same model are indeed the cause of the total count... Now I need to find a way to count the 8027 'Variant' records among these 15054 search results. – LapinLove404 Sep 23 '15 at 10:58
  • Sorry for my inattention, the sum of the two indexes really isn't 15054 – Eugene Soldatov Sep 23 '15 at 11:00
  • It doesn't matter, multiple indices are the cause. You are right about this. (Not sure why the counter is 1000 less than the actual sum, however.) When entries are modified and thus enter the delta index, the counter goes up as well :-D – LapinLove404 Sep 23 '15 at 11:01
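
The arithmetic discussed in these comments can be checked with a few lines of plain Ruby. This is only a sketch: the per-index counts are copied from the indexer log in the question, the reported total is the value returned by r.total_entries, and the variable names are made up for illustration. It does not call Thinking Sphinx at all, and it does not explain where the gap comes from.

```ruby
# Document counts per plain index, copied from the indexer log in the question.
docs_per_index = {
  "variant_nl_core"  => 8027,
  "variant_nl_delta" => 0,
  "variant_fr_core"  => 8027,
  "variant_fr_delta" => 0
}

# Sum over every index that a search on the Variant model would hit.
expected_total = docs_per_index.values.reduce(0, :+)

# Value returned by r.total_entries in the question.
reported_total = 15_054

puts "sum over all indices: #{expected_total}"
puts "reported total:       #{reported_total}"
puts "unexplained gap:      #{expected_total - reported_total}"
```

The sum over all four indices is 16054, so the reported 15054 is 1000 short, which matches the observation in the last comment. The cause of that gap is not established anywhere in this thread.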