I have a large, complex Sphinx configuration running on a large database. Because the index was taking a long time to rotate, we set it up to build multi-threaded across multiple cores/CPUs. Naturally, this cut the time needed to rotate significantly. However, the question is whether we now need to serve that index in the same manner. If we rotate it on, say, a 32-CPU server, do we then need a 32-CPU server to serve the index, or can we somehow recombine the shards into one when done?

user3649739

1 Answer

Well, there is a merge option to combine two indexes into one: http://sphinxsearch.com/docs/current.html#index-merging

But what do you mean by the separation between 'rotate' and 'serve'?

Both are done by searchd. Rotate is loading in a new index, serve is answering queries. At least to my understanding.

... So you can just query these indexes directly, assuming you were able to rotate them.

If you can provide more details, perhaps I can give a more detailed answer.

barryhunter
  • Well, I hope the following clarifies (I didn't do the actual sharding, just the config): the plan is to spin up a 64-CPU server to index the configuration in a multi-threaded manner. As far as I know it literally splits the DB into discrete chunks, but I could be wrong. This is to take an index that can take a day to rotate down to a reasonable time frame. My hope was to be able to run the site, with its Sphinx queries (SphinxQL), from a smaller server so as not to need the cost of a 64-CPU server 24x7. – user3649739 Feb 23 '17 at 00:41
  • Well, Sphinx isn't limited to one index per core. It could query a 64-shard index quite happily on a single core. It will just be done serially, one shard after another. If you have 2 cores, in theory it would be twice as fast, as it could search two shards at a time. 4 cores, twice as fast again. More cores = more parallelization, so less query wall time. What's more important, cost or performance? :) – barryhunter Feb 23 '17 at 10:02
  • Ah, the Eternal Question :| The goal is to launch with the biggest server we can afford, to get as much performance as possible. Naturally we'd like to grow into as large a server as we can to get the best performance possible. So the question could basically be restated as "until we can afford a server that has as many cores as we have shards, will Sphinx be able to run on fewer cores?" I think you answered that one just fine though, thank you! – user3649739 Feb 23 '17 at 13:55
  • A quick serialization question for you then: to my mind our goal might also be to use the *fewest* shards that will index in a reasonable time. Say, for argument's sake, we run live on an 8-core server. If we've sharded on a 64-core machine (64 shards), we then have 8 pipes to serialize, whereas if we can reasonably index on a 16-core machine with 16 shards we'd only have two pipes to serialize? Is this a fair summary of the issue/trade-off? – user3649739 Feb 23 '17 at 14:00
  • It doesn't matter how many cores the original machine that did the indexing has/had. Again, you could index 64 shards on a single-core machine; a 64-core machine could just do it quicker. The important bit is *how many shards you have* to query (i.e. how many **indexes** are specified in a query). Indexing and querying are completely decoupled as regards the indexes/cores ratio. – barryhunter Feb 23 '17 at 14:23
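
The arithmetic discussed in the comments above can be sketched in a few lines of Python. This is only a rough model, not anything Sphinx itself exposes: the function names `query_rounds` and `estimated_wall_time` are made up for illustration, and it assumes each core searches one shard at a time, every shard takes about the same time to search, and the cost of merging the per-shard results is ignored.

```python
import math


def query_rounds(shards: int, cores: int) -> int:
    """Serial passes needed to search every shard once.

    Hypothetical helper: assumes each core handles one shard at a time.
    """
    if shards < 0:
        raise ValueError("shards must be non-negative")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return math.ceil(shards / cores)


def estimated_wall_time(shards: int, cores: int, seconds_per_shard: float) -> float:
    """Rough query wall time: rounds multiplied by the time to search one shard.

    Assumes all shards take the same time; result merging is not modelled.
    """
    return query_rounds(shards, cores) * seconds_per_shard


if __name__ == "__main__":
    # Serving machine has 8 cores; compare a 64-shard and a 16-shard layout.
    for shards in (64, 16):
        print(shards, "shards on 8 cores:", query_rounds(shards, 8), "rounds")
```

On an 8-core serving machine this gives 8 rounds for 64 shards and 2 rounds for 16 shards, the same figures as in the comment above, and the core count of the machine that built the shards never enters into it. Bear in mind that the per-shard time is an input here: for a fixed dataset, fewer shards means larger shards, so fewer rounds does not by itself guarantee a faster query.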