0

I have a table whose structure is -

CREATE TABLE `cdr` (`id` bigint(20) NOT NULL AUTO_INCREMENT,
                `dataPacketDownLink` bigint(20) DEFAULT NULL,
                `dataPacketUpLink` bigint(20) DEFAULT NULL,
                `dataPlanEndTime` datetime DEFAULT NULL,
                `dataPlanStartTime` datetime DEFAULT NULL,
                `dataVolumeDownLink` bigint(20) DEFAULT NULL,
                `dataVolumeUpLink` bigint(20) DEFAULT NULL,  
                `dataplan` varchar(255) DEFAULT NULL,  
                `dataplanType` varchar(255) DEFAULT NULL,  
                `createdOn` datetime DEFAULT NULL,  
                `deviceName` varchar(500) DEFAULT NULL,  
                `duration` int(11) NOT NULL,  
                `effectiveDuration` int(11) NOT NULL,  
                `hour` int(11) DEFAULT NULL,  
                `eventDate` datetime DEFAULT NULL,  
                `msisdn` bigint(20) DEFAULT NULL,  
                `quarter` int(11) DEFAULT NULL,  
                `validDays` int(11) DEFAULT NULL,  
                `dataLeft` bigint(20) DEFAULT NULL,  
                `completedOn` datetime DEFAULT NULL,   
            PRIMARY KEY (`id`),   
            KEY `msisdn_index` (`msisdn`),   
            KEY `eventdate_index` (`eventDate`)   
        ) ENGINE=MyISAM AUTO_INCREMENT=55925171 DEFAULT CHARSET=latin1

Now when I am trying to write this query its taking more than 1 minutes from 20 million records -

select c.msisdn,sum(c.dataVolumeDownLink+c.dataVolumeUpLink) as datasum from cdr c where c.eventDate>=<date> group by c.msisdn order by datasum desc;

In actuall I have 40+ millions of record.

Explain plan-

id  select_type  table  type  possible_keys    key  key_len  ref  rows      Extra                                         
1   SIMPLE       c      ALL   eventdate_index                     20000420  Using where; Using temporary; Using filesort  

I can't make partition, so tell how i can optimize this query. Thank you.

Aamir
  • 738
  • 2
  • 17
  • 41

1 Answers1

1

Optimization is a bit of an art.

Start by simply putting an index on your eventDate. That will probably get you very close, except for when you are trying to look for a very wide date range where you are trying to get almost all of the records at one time.

Other possibilities might include creating a combined index on msisdn and eventDate. Order in the index does matter, so indexing on msisdn then eventDate is different from indexing on eventDate then msisdn.

Then keep using the analyzer to see what is working and what isn't.

Brian Hoover
  • 7,861
  • 2
  • 28
  • 41
  • thank you brian... you are right actually i am looking for a very wide date range that why optimizer not choosing index and reading whole table(explain plan), but is it good to use composite index? because everytime i think about changing structure it took lot of time bcoz of 20M records. So plz give me more insight if its i can make combined index. And which analyzer do you use. Actually i dont have any so let me know . And thank you once again for ur time. :) – Aamir Nov 28 '13 at 14:54
  • I just use explain plan, the same as you. For troubleshooting, you might want to create a smaller version of the table and see what works before applying it to the 20M table, which will take a while to index. Since you are looking through all the data, you might be able to do something cool with a composite index of msisdn, eventDate, dataVolumeDownLink, dataVolumeUpLink which would be close to a materialized view. The other thing you want to think about is using this table as a transaction table and then creating a reporting table that mirrors this table with the data/information you need – Brian Hoover Nov 28 '13 at 14:59
  • ok, now all I can do is testing so I can sort out this problem by combined index.Thank you Brian for the help. FB like for ur answer(Y). Right now I am not marking your answer correct, bcoz I want to see more reply on this topic. – Aamir Nov 28 '13 at 15:16