0

JOIN query on below table which contains large amount of data. This executes very slow with java jdbc and jpa.

Use case UI has an option to select list of traders,category and date range.If i select 1 week date range and all traders from dropdown, will the below query joins records from 1 week of stock table with all the records from stock_trader?

            STOCK
------------------------------------------
stock_id long   | PRIMARY KEY AUTO_INCREMENT
stock_code  | varchar(10)
stock_name  |varchar(10)
created_date    |DATETIME
category    |varchar(10)

    STOCK_TRADER
-----------------------------------------
stock_id | long
trader_name | FULLTEXT 

SQL query

  Select * from STOCK st join stock_trader trad
     on st.stock_id=trad.stock_id where st.created_date<CURRENT_TIMESTAMP()
     and trad.trader_name in(:traderlist) 
group by st.stock_id

traderlist: list of traders selected from UI

Please comment on the performance gain if I use the data model below which adds an extra column in stock_trader and joining with multiple columns. Can i expect faster response of the below query which maps only 1 week of records from stock table with 1 week of traders from stock_trader.

  • Is there any way to minimize the processing time of query?And what would be the average execution time for 1 million data.
  • what if the below join is replaced with a view and applying where clauses on the view from java jdbc/jpa. I found view makes the execution faster but am worried that it will join the whole dataset as where clauses are appllied dynamically from UI. If so performance would be much slower right?

                  STOCK_TRADER
    -----------------------------------------
    stock_id    |   long
    trader_name     |   FULLTEXT 
    created_date    |   DATETIME
    

    SQL QUERY

    Select * from STOCK st join stock_trader trad 
        on st.stock_id=trad.stock_id and 
        st.created_date =trad.created_date where st.created_date<CURRENT_TIMESTAMP()
        and trad.trader_name in(:traderlist) 
    group by st.stock_id
    
user5303259
  • 27
  • 3
  • 9
  • 1
    Do you really need `created_date < CURRENT_TIMESTAMP()`? Can you create a trade in the future? If you need it, make sure you have an index on `created_date`. Also make sure you have an index on `stock_trader.stock_id`. – Barmar Apr 29 '17 at 14:30
  • 1
    Adding an extra column to the joining condition cannot possibly improve performance. A properly normalized schema should not duplicate information like that. – Barmar Apr 29 '17 at 14:32
  • your cost is probably more on the `group by st.stock_id` than on the join if you have correct index. Please provide us your `EXPLAIN` 'ed query plan. BTW, do you really need to query your whole table ? don't you have any where to restrict a bit the output ? – Blag Apr 29 '17 at 15:01
  • i have added the complete scenario. please let me know if u need more details – user5303259 Apr 30 '17 at 11:41
  • Double check the generated SQL -- you need `IN(1,2,3)`, not `IN('1,2,3')`. – Rick James Apr 30 '17 at 18:30
  • It is _wrong_ to `SELECT * ... GROUP BY stock_id`. If there are multiple rows for each `stock_id`, you get random values for the rest of `*`. – Rick James Apr 30 '17 at 18:32
  • 1
    `VIEWs` are syntactic sugar; they do not help performance. – Rick James Apr 30 '17 at 18:32
  • 1
    Yes we "need more details": `SHOW CREATE TABLE` and `EXPLAIN SELECT ...` – Rick James Apr 30 '17 at 18:34

0 Answers0