8

I have a number of sites with PHP and MySQL, especially running MediaWiki, and I need to enhance the performance. However, I have only a limited percentage of CPU that I'm allowed to use.

The best thing I can think of to improve performance is to enable caching. However, I'm confused: does that really enhance performance overall, or does it just enhance speed?

My thinking is this: if the cache uses files, it takes processing to read the contents of those files. If it uses SQL tables, it takes processing to query those tables as well. Perhaps the response time will be shorter, but the CPU usage will be higher.

Is that correct or not? Does caching consume more CPU to give speedier results, or does it improve performance overall?

Nemo
Tamer Shlash
  • Well what did your measurements show? – arkascha Sep 27 '12 at 09:21
  • "especially MediaWiki" implies yes to your Q, but only with the right kind of caching. For example, MW uses InnoDB by default, so MyISAM caching doesn't help one jot here. Read the MW caching pages. You can configure some file-based caches which make a BIG difference in MW for guest (i.e. most) visitors. – TerryE Sep 27 '12 at 20:20
  • You should probably go through https://www.mediawiki.org/wiki/Manual:Performance_tuning With MediaWiki, your main concern is to avoid wikitext parsing, which is slow and requires a lot of CPU. – Nemo Apr 06 '15 at 17:02

5 Answers

4

At the most basic level, caching should be used to store the results of CPU-intensive processes. For example, if you have a server-side image handler that creates an image on the fly (say a thumbnail and a larger preview), you don't want this operation to occur on every request. You'd want to run the process once and store the results; every other request then gets the saved result.
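As a rough illustration of "run once, store the result", here is a minimal sketch in Python (the same pattern applies in PHP). `make_thumbnail` is a hypothetical stand-in for the expensive image handler, and the counter exists only to show how often the real work runs.

```python
from functools import lru_cache

render_count = 0  # how many times the expensive work actually ran


@lru_cache(maxsize=256)
def make_thumbnail(image_id: str, width: int) -> bytes:
    """Hypothetical stand-in for a CPU-intensive image resize."""
    global render_count
    render_count += 1
    # Real code would decode, scale and re-encode the image here.
    return f"{image_id}@{width}px".encode()


# The first request does the work; the repeat is served from the cache.
first = make_thumbnail("logo.png", 120)
second = make_thumbnail("logo.png", 120)
# A different size is a different cache key, so it is computed once as well.
preview = make_thumbnail("logo.png", 640)
```

Three requests arrive, but the resize only runs twice: once per distinct (image, size) pair.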

This is obviously a hugely over-simplified description of basic caching, and an image works well as an example because you rarely have to worry about stale data: how often will the actual image change? Databases are hugely different. If you cache data, how can you guarantee that there won't be an instant mismatch between your real data and your cached data? Querying a database is also not always a CPU-intensive task. Granted, you have to consider how the database is designed in terms of indexing, table size, etc., but in most cases querying a well-designed database is far more intensive on disk I/O than on CPU cycles.

First you need to look at your database design, and second at your queries. For example: are you normalizing your database correctly? Are your queries trawling through huge amounts of data that you could archive? Are you joining tables on non-indexed fields? Are your WHERE clauses querying fields that could be indexed (IN is particularly bad in these cases)?

I recommend you get hold of a query analyzer and spend some time optimizing your table structure and queries to find that bottleneck before looking into more drastic changes.
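To show what a query analyzer tells you, here is a small sketch that uses Python's built-in SQLite as a stand-in for MySQL (an assumption made so the example is self-contained; in MySQL you would prefix the query with `EXPLAIN` instead). The `page` table and index name are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO page (title, body) VALUES (?, ?)",
    [(f"Page {i}", "text") for i in range(1000)],
)

QUERY = "SELECT body FROM page WHERE title = ?"


def plan(sql: str) -> str:
    """Return the planner's description of how it will run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("Page 42",)).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(QUERY)  # no index on title: the whole table is scanned
conn.execute("CREATE INDEX idx_page_title ON page (title)")
plan_after = plan(QUERY)   # the planner can now look the title up in the index
```

Comparing `plan_before` with `plan_after` shows the query moving from a full table scan to an index lookup, which is the kind of change worth checking for before reaching for a cache.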

Paul Aldred-Bann
1

Reference : http://msdn.microsoft.com/en-us/library/ee817646.aspx

Performance : Caching techniques are commonly used to improve application performance by storing relevant data as close as possible to the data consumer, thus avoiding repetitive data creation, processing, and transportation. For example, storing data that does not change, such as a list of countries, in a cache can improve performance by minimizing data access operations and eliminating the need to recreate the same data for each request.

Scalability : The same data, business functionality, and user interface fragments are often required by many users and processes in an application. If this information is processed for each request, valuable resources are wasted recreating the same output. Instead, you can store the results in a cache and reuse them for each request. This improves the scalability of your application because as the user base increases, the demand for server resources for these tasks remains constant. For example, in a Web application the Web server is required to render the user interface for each user request. You can cache the rendered page in the ASP.NET output cache to be used for future requests, freeing resources to be used for other purposes.

Caching data can also help scale the resources of your database server. By storing frequently used data in a cache, fewer database requests are made, meaning that more users can be served.

Availability : Occasionally the services that provide information to your application may be unavailable. By storing that data in another place, your application may be able to survive system failures such as network latency, Web service problems, or hardware failures. For example, each time a user requests information from your data store, you can return the information and also cache the results, updating the cache on each request. If the data store then becomes unavailable, you can still service requests using the cached data until the data store comes back online.

  • With all due respect, isn't this just a copy-and-paste answer from here: http://books.google.co.uk/books?id=MEOmjpKLmqYC&pg=PA414&lpg=PA414&dq=%22Performance+:+Caching+techniques+are+commonly+used%22&source=bl&ots=nqFchRBGQH&sig=jdQfh6sIm17he94PhxlattcXeeM&hl=en&sa=X&ei=yR9kUO-jGefW0QWekoH4DQ&ved=0CB4Q6AEwAA#v=onepage&q=%22Performance%20%3A%20Caching%20techniques%20are%20commonly%20used%22&f=false - if you're going to do that, at least reference your source to give credit. – Paul Aldred-Bann Sep 27 '12 at 09:44
  • I want to provide the best solution, so I search the net and post the best solution here. Doesn't that make sense? What do you think? – Amrish Prajapati Sep 27 '12 at 09:46
  • Yes, around 35% of the paste is relevant to answering the original question. – user989056 Sep 27 '12 at 09:47
  • Oh I agree on the relevance, my issue was the lack of source for this content. – Paul Aldred-Bann Sep 27 '12 at 09:49
1

"Enhance performance" sounds like some of the email I get...

There are two interrelated things happening here. One is "how long does it take to serve a given request?", and the other is "how many requests can I serve concurrently given my limited resources?". People tend to mean either or both of those when they talk about performance.

Caching can help with both those things.

The most effective caching strategy uses resources outside your machines to cache your stuff - the most obvious examples are the user's browser, or a CDN. I'll assume you can't use a CDN, but by spending a bit of effort on setting the HTTP cache headers, you can reduce the number of requests to your server for static or sluggish resources quite dramatically.
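As a rough sketch of what setting the HTTP cache headers involves, the following Python function (hypothetical, not tied to any framework) builds a response with `Cache-Control` and `ETag`, and answers a revalidation request with `304 Not Modified` and no body. The one-hour `max_age` is an arbitrary assumption.

```python
import hashlib
from typing import Dict, Optional, Tuple


def respond(body: bytes, if_none_match: Optional[str],
            max_age: int = 3600) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a cacheable resource."""
    # Derive the validator from the content, so it changes when the content does.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {
        "Cache-Control": f"public, max-age={max_age}",
        "ETag": etag,
    }
    if if_none_match == etag:
        # The client's copy is still current: send no body at all.
        return 304, headers, b""
    return 200, headers, body


css = b"body { color: black }"
status1, headers1, body1 = respond(css, None)             # first visit
status2, headers2, body2 = respond(css, headers1["ETag"])  # later revalidation
```

Within `max-age` the browser does not contact the server at all; after that it revalidates with `If-None-Match` and, if nothing changed, gets the cheap 304.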

For dynamic content - usually the web page you generate by querying your database - the next most effective caching strategy is to cache the HTML generated by (parts of) your page. For instance, if you have a "most popular items" box on your homepage, this will usually run a couple of moderately complex database queries, and then some "turn data to HTML" back-end code. If you can cache the HTML, you save both the database queries and the CPU effort of turning the data into HTML.
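A minimal sketch of caching a rendered fragment, again in Python for brevity. `FragmentCache`, `render_popular_box` and the five-minute lifetime are all assumptions for illustration; a real site would more likely keep the fragments in something like memcached or APC.

```python
import time
from typing import Callable, Dict, Tuple


class FragmentCache:
    """Keeps rendered HTML fragments in memory for ttl seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str, render: Callable[[], str]) -> str:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]          # fresh copy: skip queries and rendering
        html = render()            # miss or expired: do the real work
        self._store[key] = (now, html)
        return html


queries_run = 0


def render_popular_box() -> str:
    """Hypothetical: would run the database queries and build the HTML."""
    global queries_run
    queries_run += 1
    return "<ul><li>Item A</li><li>Item B</li></ul>"


fake_now = [0.0]                   # controllable clock for the demo
cache = FragmentCache(ttl=300, clock=lambda: fake_now[0])
cache.get("popular", render_popular_box)
cache.get("popular", render_popular_box)   # served from the cache
fake_now[0] = 301.0
cache.get("popular", render_popular_box)   # expired, rendered again
```

Three page views cost two renders here; on a busy page the ratio is far more favourable, since every view inside the five-minute window is a cache hit.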

If that's not possible, you may be able to cache the results of some database queries. That reduces the database load, and usually also reduces the load on your web server: the code required to run the database query and deal with the results is usually more onerous than retrieving the item from a cache. Because it's faster, your request is handled more quickly, which frees up resources sooner. This reduces the load on your servers for an individual request, and thus allows you to serve more concurrent requests.
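One way to sketch query-result caching, using Python's built-in SQLite as a stand-in for MySQL. The `item` table, `cached_query` and `record_view` are hypothetical, and clearing the whole cache on every write is the crudest possible invalidation policy, chosen only to keep the example short.

```python
import sqlite3
from typing import Dict, List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT, views INTEGER)")
conn.executemany("INSERT INTO item VALUES (?, ?)", [("a", 5), ("b", 9), ("c", 1)])

_query_cache: Dict[str, List[Tuple]] = {}
db_hits = 0  # how many times a read actually reached the database


def cached_query(sql: str) -> List[Tuple]:
    """Run a read-only query, reusing the stored rows when available."""
    global db_hits
    if sql not in _query_cache:
        db_hits += 1
        _query_cache[sql] = conn.execute(sql).fetchall()
    return _query_cache[sql]


def record_view(name: str) -> None:
    """Write path: update the row, then drop cached results that may be stale."""
    conn.execute("UPDATE item SET views = views + 1 WHERE name = ?", (name,))
    _query_cache.clear()


TOP = "SELECT name FROM item ORDER BY views DESC LIMIT 1"
top_first = cached_query(TOP)   # hits the database
top_again = cached_query(TOP)   # served from the cache
for _ in range(10):
    record_view("c")            # each write invalidates the cache
top_after = cached_query(TOP)   # hits the database again and sees the new data
```

Invalidating on write keeps cached rows from going stale, at the price of a cache that empties often on write-heavy tables; a time-based expiry is the usual alternative when slightly old data is acceptable.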

Neville Kuyt
0

You need to profile your system and find out where the bottleneck is. The best type of page load is a cached one that doesn't hit the server at all. You can build a very simple caching system that only regenerates the content every 15 minutes: if the page was cached in the last 15 minutes, visitors get a pre-rendered page. The first time the page loads, it creates a temp file; every 15 minutes a new one is created (if someone loads that page).

Caching only stores a file that the server has already done the work for. The work to create the file is already done and you're simply storing it.
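The 15-minute file cache described above could look roughly like this. It is a sketch in Python rather than PHP, and `cached_page`, `render_home` and the cache path are made-up names; the idea is simply "serve the file if it is younger than 15 minutes, otherwise rebuild it".

```python
import os
import tempfile
import time
from typing import Callable

CACHE_TTL = 15 * 60  # seconds


def cached_page(path: str, render: Callable[[], str], ttl: int = CACHE_TTL) -> str:
    """Serve the cache file if it is younger than ttl, else rebuild it."""
    try:
        if time.time() - os.path.getmtime(path) < ttl:
            with open(path, encoding="utf-8") as fh:
                return fh.read()
    except OSError:
        pass  # no cache file yet
    html = render()
    # Write to a temporary file first so readers never see a half-written page.
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        fh.write(html)
    os.replace(tmp, path)
    return html


renders = 0


def render_home() -> str:
    """Hypothetical page renderer."""
    global renders
    renders += 1
    return f"<h1>Home</h1><!-- render {renders} -->"


cache_file = os.path.join(tempfile.mkdtemp(), "home.html")
page1 = cached_page(cache_file, render_home)
page2 = cached_page(cache_file, render_home)   # fresh: read from disk
os.utime(cache_file, (0, 0))                   # pretend the file is very old
page3 = cached_page(cache_file, render_home)   # stale: rendered again
```

Nothing runs on a timer here: the page is only regenerated when someone requests it after the file has aged out, which matches the "if someone loads that page" behaviour.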

Case
0

You use the terms 'performance' and 'speed'. I'll assume 'performance' relates to CPU cycles on your web server and that 'speed' relates to the time it takes to serve the page to the user. You want to maximize web server 'performance' (by lowering the total number of CPU cycles needed to serve pages) whilst maximizing 'speed' (lowering the time it takes to serve a web page).

The good news is that caching can improve both of these metrics at the same time. By caching content, you create an output page that is stored in the cache and can be served repeatedly to users directly, without re-executing the PHP code that originally created it (thus lowering CPU cycles). Fetching a page from the cache consumes fewer CPU cycles than re-executing the PHP code.

Caching is particularly good for web pages that are generally the same for all users who request them (for example, in a wiki), and for pages that do not change all too often (again, a wiki).

user989056