4

I have been developing a Joomla site with version 2.5.11.Site will be under very high traffic.

My problem is about MySQL query performance. Database includes about 60000 rows in content table, and the query seen below (core com_content articles model query) execution time is about 6 seconds.Very slow.

SELECT a.id,
   a.title,
   a.alias,
   a.title_alias,
   a.introtext,
   a.checked_out,
   a.checked_out_time,
   a.catid,
   a.created,
   a.created_by,
   a.created_by_alias,
   CASE
     WHEN a.modified = 0 THEN a.created
     ELSE a.modified
   END as modified,
   a.modified_by,
   uam.name as modified_by_name,
   CASE
     WHEN a.publish_up = 0 THEN a.created
     ELSE a.publish_up
   END as publish_up,
   a.publish_down,
   a.images,
   a.urls,
   a.attribs,
   a.metadata,
   a.metakey,
   a.metadesc,
   a.access,
   a.hits,
   a.xreference,
   a.featured,
   LENGTH(a.fulltext) AS readmore,
   CASE
     WHEN badcats.id is not null THEN 0
     ELSE a.state
   END AS state,
   c.title AS category_title,
   c.path AS category_route,
   c.access AS category_access,
   c.alias AS category_alias,
   CASE
     WHEN a.created_by_alias > ' ' THEN a.created_by_alias
     ELSE ua.name
   END AS author,
   ua.email AS author_email,
   contact.id as contactid,
   parent.title as parent_title,
   parent.id as parent_id,
   parent.path as parent_route,
   parent.alias as parent_alias,
   ROUND(v.rating_sum / v.rating_count, 0) AS rating,
   v.rating_count as rating_count,
   c.published,
   CASE
     WHEN badcats.id is null THEN c.published
     ELSE 0
   END AS parents_published
  FROM #__content AS a
 LEFT JOIN #__content_frontpage AS fp ON fp.content_id = a.id
 LEFT JOIN #__categories AS c ON c.id = a.catid
 LEFT JOIN #__users AS ua ON ua.id = a.created_by
 LEFT JOIN #__users AS uam ON uam.id = a.modified_by
 LEFT JOIN 
 (SELECT contact.user_id, MAX(contact.id) AS id, contact.language FROM
  #__contact_details AS contact WHERE contact.published = 1 GROUP BY
   contact.user_id, contact.language) AS contact ON contact.user_id =
    a.created_by
 LEFT JOIN #__categories as parent ON parent.id = c.parent_id
 LEFT JOIN #__content_rating AS v ON a.id = v.content_id
 LEFT OUTER JOIN 
 (SELECT cat.id as id FROM #__categories AS cat JOIN #__categories AS
  parent ON cat.lft BETWEEN parent.lft AND parent.rgt WHERE parent.extension
   = 'com_content' AND parent.published != 1 GROUP BY cat.id) AS badcats ON
    badcats.id = c.id
  WHERE CASE
  WHEN badcats.id is null THEN a.state
  ELSE 0
 END = 1 AND
  a.featured = 0 AND
  a.id NOT IN (8921, 33722, 33728, 33729, 34187, 35047, 36784, 36236, 33724,
   19522) AND
  a.catid IN (8, 39, 40, 38, 72, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
   20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 36, 37, 41) AND
  (a.publish_up = '0000-00-00 00:00:00' OR
  a.publish_up <= '2013-06-12 06:44:44') AND
  (a.publish_down = '0000-00-00 00:00:00' OR
  a.publish_down >= '2013-06-12 06:44:44')
    GROUP BY a.id,
     a.title,
     a.alias,
     a.title_alias,
     a.introtext,
     a.checked_out,
     a.checked_out_time,
     a.catid,
     a.created,
     a.created_by,
     a.created_by_alias,
     a.created,
     a.modified,
     a.modified_by,
     uam.name,
     a.publish_up,
     a.attribs,
     a.metadata,
     a.metakey,
     a.metadesc,
     a.access,
     a.hits,
     a.xreference,
     a.featured,
     a.fulltext,
     a.state,
     a.publish_down,
     badcats.id,
     c.title,
     c.path,
     c.access,
     c.alias,
     uam.id,
     ua.name,
     ua.email,
     contact.id,
     parent.title,
     parent.id,
     parent.path,
     parent.alias,
     v.rating_sum,
     v.rating_count,
     c.published,
     c.lft,
     a.ordering,
     parent.lft,
     fp.ordering,
     c.id,
     a.images,
     a.urls
 ORDER BY publish_up DESC
 LIMIT 4,  4

For example when I changed

 (a.publish_up = '0000-00-00 00:00:00' OR
  a.publish_up <= '2013-06-12 06:44:44') AND
  (a.publish_down = '0000-00-00 00:00:00' OR
  a.publish_down >= '2013-06-12 06:44:44')

with

(a.publish_up >= DATE_SUB(NOW(), INTERVAL 2 MONTH))

Query execution time is about ~0.5second

Query explain :

enter image description here

What can I do without hacking the core code,I never want to hack the core code...

MySQL server Config:

skip-external-locking
skip-name-resolve
key_buffer_size = 128M
max_connections=1024
max_allowed_packet = 16M
net_buffer_length = 8K
table_open_cache = 512
table_cache = 2048
join_buffer_size = 128M
read_buffer_size = 128M
sort_buffer_size = 128M
read_rnd_buffer_size=2M
myisam_sort_buffer_size=128M
thread_concurrency=16
query_cache_size = 128M  
query_cache_type=1
thread_cache_size = 1300
query_cache_limit=128M
max_heap_table_size = 1024M
tmp_table_size = 1024M
Umut KIRGÖZ
  • 2,105
  • 3
  • 22
  • 29
  • you could always set a `LIMIT` in your query and use pagination – Lodder Jun 12 '13 at 12:47
  • @Lodder, limit is over there, query only returns 4 rows – Umut KIRGÖZ Jun 12 '13 at 13:44
  • I'd go for forking the component. There's no such event like onBeforeQuery or a way to overload the buildQuery method in core models. Having said that, might be a better idea checking what's been done in 3.1 as they could've optimized it. Another option is a pull request to Joomla team – WooDzu Jun 12 '13 at 19:32
  • 1
    You may also want to try adding an index on the `publish_up` and `publish_down` columns in the database. I don't believe those exist by default and should help speed up the where statement. That may make the original query's speed more acceptable. – David Fritsch Jun 12 '13 at 19:34
  • I tried to add those indexes but it doesnot affect the result too much – Umut KIRGÖZ Jun 13 '13 at 07:41

3 Answers3

4

The best way I've found to improve that very slow query it to remove the references to archive (which defeat the entire purpose of the archive status which is to remove those items from list results). Just take items that are published (or published + unpublished) so eiher state=1 or state IN(0,1) You could look at possibly using http://extensions.joomla.org/extensions/style-a-design/templating/15611

Switching to InnoDB will possibly help also but having the most recent possible version of MySQL is the biggest way to get improvements.

About "what can you do now" I would love to see a pull request from you to add date limitations to that query. We have recently added data math to JDatabaseQuery so i is relatively simple to do and would be a really nice improvement. There are also some other pending issues that have big performance impacts and if you test and comment on them they will move forward into the code base for the next release. For example: https://github.com/joomla/joomla-cms/pull/1274

Elin
  • 6,507
  • 3
  • 25
  • 47
0

Do you really have to do a GROUP BY on all those columns? Would a simple GROUP BY a.id suffice?

If that doesn't have an effect, try increasing the values of the variables tmp_table_size and max_heap_table_size. This might be why you're getting a "Using Temporary" in the "Extra column". This message is because MySQL ran out of physical memory while doing the sort and needs to write the data to disk first, which is slower.

Find out what these values are:
SELECT @@tmp_table_size, @@max_heap_table_size;

And then set them to something higher, until the "Using Temporary" disappears. (hopefully you have enough memory to work with):
SET tmp_table_size = ###, @@max_heap_table_size = ###;

There's more information on this page: http://dev.mysql.com/doc/refman/5.1/en/internal-temporary-tables.html

If the above doesn't work, you can try throwing each of your derived queries into a temporary table with indexes on each of the columns you're joining on, and join on each of the new temp tables. But this can make it slower, especially if you have to do it during each request. Otherwise if the data isn't updated, then it could work.

P.S. I just realized your comment "What can I do without hacking the core code". Hopefully the session variables work.

Carlos
  • 675
  • 5
  • 7
  • The GROUP BY is for multi-database compatibility in the code. The Joomla CMS supports MySQL, PostgreSQL, and SQL Server, so there's some places that query optimization has suffered some for that. This is one of those queries. – Michael Jun 12 '13 at 23:30
  • tHanks for explaination @Michael, I was trying to understand the reason of those group by – Umut KIRGÖZ Jun 13 '13 at 07:46
  • I worked with SQL server once about 10 years ago and remember that too. It was an annoyance for me to list out every field. I thought SQL Server didn't support LIMIT, or has this changed in recent versions? Or did you add it? – Carlos Jun 13 '13 at 11:50
  • If I remember correctly, it doesn't support LIMIT the same as MySQL does, but there is a way to implement it. I'd have to look at our code again to see how it is done. – Michael Jun 13 '13 at 13:04
0

Since you'd have to hack the core to make changes to the query, your best bet would be to add a new model/view to com_content. The addition would not be overwritten by updates to the core.

Brent Friar
  • 10,588
  • 2
  • 20
  • 31