0

I have added advertisements to my website which have quite some conditions to meet before delivering to a browsing user. Here's a detailed explanation:

These are the fields that require explaining:

start is by default '0000-00-00' and it indicates whether the ad has been yet paid or not. When an ad payment is accepted start is set to the day after, or any date the customer choses.

impresssions is respectively the remaining impressions of the advertisement

impressions_total and impressions_perday are self explanatory

and the other fields used in the query are just fields that validate whether the user falls into the specifications of the advertisement's auditory

An advertisement has to be paid to start displaying in the first place, however it can be set to start on a future date so the start value will be set but the ad shouldn't show up before it is time to. Then since customers can limit impressions per day I need to pick up only advertisements that have enough impressions for the day in progress. For example if an advertisement is started in 30/08/2013 with 10,000 impressions and 2,000 impressions per day then it shouldn't be able to show up today (31/08/2013) if it has less than 6,000 impressions because it's the second day of the campaign. As well as if the term period is say 5 days, and 5 days have passed, the advertisement has to be shown regardless of remaining impressions. Then there are those other comparisons to validate that the user is fit for this ad to display and the whole thing gets so complicated.

I am not quite good with mysql, although I have managed to construct a working query I am very concerned about optimizing it. I am most certain that the methods I have used are highly inefficient but I couldn't find a better way online. That's why I'm asking this question here, if anyone can help me improve the performance of this query?

SELECT `fields`,
FROM `ads` 
WHERE (`impressions`>0 && `start`!='0000-00-00')
AND `start`<CURDATE() AND 
    (
        `impressions`>(`impressions_total`-(CONVERT(CURDATE()-date(`start`), UNSIGNED)*`impressions_perday`)) 
        OR (`impressions_total`/`impressions_perday` < CURDATE()-date(`start`))

        -- this is the part where I validate the impressions for the day
        -- and am most concerned that I haven't built correctly
    )
AND
(
    (
        (YEAR(NOW())-YEAR("user's birthday") BETWEEN `ageMIN` AND `ageMax`) 
        AND (`sex`=2 OR `sex`="user's gender")
        AND (`country`='' OR `country`="user's country")
    ) OR `applyToUnregistered` = 1
)
ORDER BY $random_order -- Generate random order pattern

Schema:

CREATE TABLE `ads` (  
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `headline` varchar(25) NOT NULL,
  `text` varchar(90) NOT NULL,
  `url` varchar(50) NOT NULL,
  `country` varchar(2) DEFAULT '0',
  `ageMIN` tinyint(2) unsigned NOT NULL,
  `ageMax` tinyint(2) unsigned NOT NULL,
  `sex` tinyint(1) unsigned NOT NULL DEFAULT '2',
  `applyToUnregistered` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `creator` int(10) unsigned NOT NULL,
  `created` int(10) unsigned NOT NULL,
  `start` date NOT NULL,
  `impressions_total` int(10) unsigned NOT NULL,
  `impressions_perday` mediumint(8) unsigned NOT NULL,
  `impressions` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=27 DEFAULT CHARSET=utf8
php_nub_qq
  • 15,199
  • 21
  • 74
  • 144
  • 2
    **Don't quote/backtick your column names.** They only add visual clutter and are just one more way for you to make syntax errors. The only reason you need them is if you have a column name that is a reserved word, and using column names that are reserved words is a terrible idea, so that's two bad habits you can avoid at once. – Andy Lester Aug 31 '13 at 14:31
  • Optimization is difficult without the schema. Please post the create table syntax. – zevra0 Aug 31 '13 at 14:51

1 Answers1

1

You have a very complicated query from an optimization perspective.

The only indexes that can be used on the where clause are on ads(impressions) or ads(start). Because you use inequalities, you cannot combine them.

Can you modify the table structure to have an ImpressionsFlag? This would be 1 if there are any impressions and 0 otherwise. If so, then you can try an index on ads(ImpressionsFlag, Start).

If that helps with performance, the next step would be to break up the query into separate subqueries and bring them together using union all. The purpose is to design indexes to optimize the underlying queries.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Why can't I add an index to `impressions` as it is instead of adding `impressionsFlag`? – php_nub_qq Aug 31 '13 at 16:09
  • @php_nub_qq . . . You can. However, the index `ads(Impressions, Start)` probably won't be used for the two conditions, because the first is an inequality. The MySQL documentation actually does a good job explaining this (http://dev.mysql.com/doc/refman/5.5/en/mysql-indexes.html). – Gordon Linoff Aug 31 '13 at 16:21
  • Okay so of all you said 2 things are unclear to me, what are 'inequalities' and how can I have an index on 2 columns? I mean like you say `ads(Impressions, Start)` ? – php_nub_qq Aug 31 '13 at 16:58
  • 1
    It's quite funny how now, 1 year later, I'm faced with the same task. Of course now my knowledge is light years ahead from what it was back then when I asked the question ( or so I think ). What is actually funny is that now I understand what you meant and it is what I actually ended up doing on mu own. And why I hadn't accepted and upvoted the answer - that I do not know. Anyway one thank you seems to be missing, so thank you! `:)` – php_nub_qq Sep 25 '14 at 21:26