
I have deployed a 5-shard infrastructure where:

  • shard1 has 3124422 docs
  • shard2 has 920414 docs
  • shard3 has 602772 docs
  • shard4 has 2083492 docs
  • shard5 has 11915639 docs

Total index size: 100 GB

The OS is Linux x86_64 (Fedora release 8) with vMem equal to 7872420, and I run the server using Jetty (from the Solr example download) with:

java -Xmx3024M -Dsolr.solr.home=multicore -jar start.jar

The response time for a single query is around 2-3 seconds. However, if I execute several queries at the same time, performance degrades immediately:

  • 1 simultaneous query: 2516 ms
  • 2 simultaneous queries: 4250, 4469 ms
  • 3 simultaneous queries: 5781, 6219, 6219 ms
  • 4 simultaneous queries: 6484, 7203, 7719, 7781 ms
  • ...
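For what it's worth, timings like the ones above can be reproduced with a small benchmark. This is a rough sketch in Python; the Solr URL is a placeholder for an actual endpoint, and the request function is pluggable so the harness itself can be checked independently:

```python
# Fire n identical requests at once and record each one's wall-clock
# latency, mimicking the "simultaneous queries" measurement above.
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

SOLR_URL = "http://localhost:8983/solr/core0/select?q=*:*"  # placeholder

def query_solr(url=SOLR_URL):
    # We only care about latency here, not the response body.
    urlopen(url).read()

def timed(fn):
    """Return fn's wall-clock duration in milliseconds."""
    start = time.monotonic()
    fn()
    return (time.monotonic() - start) * 1000

def run_concurrent(fn, n):
    """Run fn n times in parallel; return the sorted per-call latencies (ms)."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return sorted(pool.map(lambda _: timed(fn), range(n)))

# Example usage against a live server:
#   for n in (1, 2, 3, 4):
#       print(n, "simultaneous:", [round(ms) for ms in run_concurrent(query_solr, n)])
```

If the per-call latencies scale roughly linearly with n, as in the numbers above, the queries are being serialized somewhere (a single-threaded stage, or contention on one disk).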

Using JConsole to monitor the server's Java process, I checked that heap memory and CPU usage don't reach their upper limits, so the server shouldn't be overloaded. Can anyone suggest how I should tune the instance so that it is not so heavily dependent on the number of simultaneous queries?

Thanks in advance

supersoft
  • It is a bit unclear if you are referring to the startup cost, or if it is a consistent problem you are seeing – Cine Jan 07 '11 at 08:55
  • Is your multicore setup correct? http://stackoverflow.com/questions/2714046/tomcat-solr-multiple-cores-setup – Cine Jan 07 '11 at 09:09
  • That problem is consistent, and it is becoming a real headache because the response times are totally dependent on the number of clients searching at the same time. The setup is correct, considering I've been running batches for indexing and searching data. The problem is the performance of the query results... – supersoft Jan 07 '11 at 09:26
  • The numbers look pretty consistent with a large part of the query running in a single thread. Can you check iotop to see how hard your disks are getting hammered? It might be a cache-thrashing issue (for which the solution would be more memory) – Cine Jan 07 '11 at 09:43
  • PID USER PR NI VIRT RES SHR S %CPU %MEM XXXXX root 18 0 3321m 3.1g 8388 S 65 40.7 – supersoft Jan 07 '11 at 09:51
  • That dump is not from iotop, it is from top... http://freshmeat.net/projects/iotop – Cine Jan 07 '11 at 09:56
  • iostat may be easier to make a dump from :) run "iostat 1" and then take the dump from a couple of seconds of queries – Cine Jan 07 '11 at 10:02
  • Is this a 'bump' (or rather a duplicate) of http://stackoverflow.com/questions/4431620/simultaneous-queries-in-solr ? – Mauricio Scheffer Jan 07 '11 at 10:04
  • @Cine iostat shows this: Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn / sdg 68,96 2344,00 2267,60 6646238426 6429595824. @Mauricio: Yes, I continued with the same problem; the previous question was a bit confusing (this one is better structured) and I created a real account on Stack Overflow to start with this. I would like to delete the previous one but I cannot :S – supersoft Jan 07 '11 at 11:00
  • Definitely something wrong with IO; you are seeing high read AND write and tps (for an HDD). The numbers here are about what you would see for a single HDD, but I don't know what you have in it. It is a bit hard to see what the utilization is, though; try adding -x to iostat. And try increasing the "java -Xmx3024M" to 7 GB. What is the utilization when you are NOT running queries? If it is anything but near zero, get iotop running and figure out what is doing the IO. – Cine Jan 07 '11 at 14:53
  • Too little memory would be my first bet too, because Solr will try to put things into cache (which is good for performance but 'bad' for memory). E.g. I'm running a 5 GB index with 2 GB RAM. Either add RAM or machines (create replicas for that) – Karussell Jan 07 '11 at 23:33

2 Answers


You may want to consider creating slaves for each shard so that you can support more reads (see http://wiki.apache.org/solr/SolrReplication); however, the performance you're getting isn't very reasonable to begin with.
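For reference, the master/slave wiring described on that wiki page boils down to a ReplicationHandler entry in each core's solrconfig.xml. A minimal sketch (host, port, and core names are placeholders; check the SolrReplication wiki for the exact options supported by your Solr version):

```xml
<!-- master solrconfig.xml: publish a new index version after each commit -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

<!-- slave solrconfig.xml: poll the master and pull new index versions -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/core0/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

Queries can then be load-balanced across the slaves while the master handles indexing.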

With the response times you're seeing, it feels like your disk must be the bottleneck. It might be cheaper just to load up each shard with enough memory to hold its full index (20 GB each?). You can look at disk access using the 'sar' utility from the sysstat package. If you're consistently seeing over 30% disk utilization on any platter while searches are running, that's a good sign that you need to add memory and let the OS cache the index.

Has it been a while since you've run an optimize? Perhaps part of the long lookup times is the result of a heavily fragmented index spread all over the platter.

asm

As I stated on the Solr mailing list, where you asked the same question 3 days ago, Solr/Lucene benefits tremendously from SSDs. While sharding onto more machines or adding boatloads of RAM will also address the I/O, the SSD option is comparatively cheap and extremely easy.

Buy an Intel X25 G2 ($409 at Newegg for 160 GB) or one of the new SandForce-based SSDs. Put your existing 100 GB of indexes on it and see what happens. That's half a day's work, tops. If it bombs, scavenge the drive for your workstation. You'll be very happy with the performance boost it gives you.

  • Thanks for that idea. It is interesting, but this system is running in the cloud. However, I will keep it in mind for future projects. – supersoft Jan 13 '11 at 17:03