0

MY Problem is that the join queries take around 7-15 seconds for 1,5 million datasets and i want to reduce it significantly but don't really know how.

I have a Database Designed as Followed:

Tbl1: ID - Time1 - Time 2 - Time2

Tbl2: ID - Topic_ID

Tbl3: ID - Location_ID

Tbl4: ID - Status_ID

The first ID is always the same and there are allways keys on both Columns. It is easily possible to join 2 tables but as soon as i try to join a 3rd on like for example:

Select t1.Time1,t2.Topic_ID,t3_Location_id 
from
(select ID,Time1 from tbl1 where time > 400 and < 500 ) as t1
inner join tbl2 as t2 on t2.Id = t1.ID
inner join tbl3 as t3 on t2.id = t3.ID
where t2.topic_ID = 2

The Time Grows massively from 0.02 secs to 7-15 secs.

MY Problem now is that there is a 1:1 Relation for all tables execpt for table 2 where it is possible that one Id has more than one Topic.

I Hope I explained my Problem decently.

Thank you very much for you Help

user2359459
  • 212
  • 2
  • 11
  • And all ID columns are indexed? What about t2.topic? – jarlh Jul 03 '15 at 08:10
  • Yes Indexes everywhere possible – user2359459 Jul 03 '15 at 08:16
  • If it was Oracle I'd be recommending checking that the statistics for the tables are up to date. Can you do the equivalent for mysql? http://dev.mysql.com/doc/innodb/1.1/en/innodb-other-changes-statistics-estimation.html What are the data volumes like? – kayakpim Jul 03 '15 at 08:21
  • the time table is about 250 mb the rest is about 80 mb each – user2359459 Jul 03 '15 at 08:25
  • I should add that my ID is a variableChar of 16 characters – user2359459 Jul 03 '15 at 08:31
  • can you please add the output for an explain on your query? – Olli Jul 03 '15 at 08:59
  • I haven't got real life data concerning speed but I believe `int` as primary key would be faster and altogether better choice. See for example: http://stackoverflow.com/questions/7497421/using-varchar-as-the-primary-key-bad-idea-or-ok, http://stackoverflow.com/questions/19299874/can-i-use-varchar-as-the-primary-key – ZZ-bb Jul 03 '15 at 10:07

1 Answers1

1

Can you try the below query and check once. I just added a group by clause in your derived table. What I feel is that group by will give u the unique combination of ID and TIME and thus will reduce the record set generated in ur derived table, in turn reducing performance.

            Select t1.Time1,t2.Topic_ID,t3_Location_id 
        from
        (select ID,Time1 from tbl1 where time > 400 and < 500 group by        ID,Time1) as t1
        inner join tbl2 as t2 on t2.Id = t1.ID
        inner join tbl3 as t3 on t2.id = t3.ID
        where t2.topic_ID = 2
Aritra Bhattacharya
  • 720
  • 1
  • 7
  • 18