Questions tagged [query-optimization]

For questions about making database queries run faster. PLEASE INCLUDE QUERY PLANS, AND TABLE AND INDEX DEFINITIONS in your question. Please use additional tags to specify your query language and server maker. Some of these questions may belong on https://DBA.stackexchange.com, especially if they involve reindexing, query plans, etc.

Please include table and index definitions for all relevant tables in your question. Please also include the query plan: the EXPLAIN output. Read on for more explanation.

If you use this is a good guide to pulling together the right information for your question.

SQL is a declarative language that can have many ways to find the same data. As a SQL programmer, you declare what you want from it, and it figures out how to get your result. Because of how databases are structured, and how they retrieve data, the same results can be retrieved in various ways that the query interpreter needs to choose between.

Using sargable predicates, using indexes, and optimizing how the query works can make it run more quickly. Restructuring the query, removing unnecessary portions of the query, or using different features of different Database Engines can all be helpful.

If the question is not (just) about how to write the query, but involves how to create new indexes, or how to optimize the server, you may get good answers on https://DBA.stackexchange.com.

Your database's query planner, also known as its query optimizer module attempts to determine the most efficient way to execute a given query by considering possible query plans.

You'll hear a lot about query planners in this tag. Each DBMS software product has its own query planner. Here's a description of one query planner.

How to ask a good query-optimization question

If you're asking "Why is my SQL query so slow?" or something like that, you need to do a little extra homework to get good answers from the Stack Overflow community. Please include this information.

What database and version?

Put this tag on your question along with the tag for your database: , . , or whatever database you use. Don't use more than one database tag, please. Optimization works somewhat differently from database to database. And, please mention the version of your database server.

How many rows and how fast?

Please figure out approximately how many rows you expect in your result set, and approximately how many rows are in each table involved with the query. Mention those numbers in your question. Please mention how long you hope your query needs to take, and how long it's taking. (That is, tell us what you mean by slow).

Don't oversimplify

Don't try to simplify your question by replacing your actual SELECT column lists with *. Using * in SELECT clauses can be harmful to query performance compared to specifying the list of columns you need.

Show your table definitions, with indexes

In your question include your table definitions and index definitions for each table and view involved in your query. Make sure your question shows all indexes in all relevant tables. When you're asking a performance question, please don't oversimplify your question by omitting indexes or columns you think might be irrelevant. The strangest things can be relevant to performance!

Very important: the names and data types of columns included in each index, with their order.

Show your database's explanation of how it satisfied your query

In your question please show the EXPLAIN PLAN output (it's just EXPLAIN or SHOWPLAN in some table server makes and models) for your query.

Things to know

  • Creating an appropriate index for a slow query is often, not always, a good solution to a problem. Answers on will often contain a suggestion for an index.

  • Avoid creating indexes on lots of single columns. Single-column indexes are often not very useful to speed up complex queries.

  • As your database grows you will, most likely, discover new slow queries. Operations can be trivial on hundreds of rows, but much more time-consuming on tens of thousands of rows. In a growing database, plan to review performance issues routinely.

  • If you use SSMS -- Microsoft's SQL Server Management Studio -- this tip is for you: Right-click in a query window, then select Show Actual Execution Plan, then run the query. The execution plan display sometimes recommends a new index to create.

Things you might do when optimizing queries

  1. Join ordering
  2. Query planning for nested SQL queries
  3. Cost estimation
  4. Parametric Query Optimization
  5. Multi-Objective Query Optimization
  6. Multi-Objective Parametric Query Optimization

Reference

Please be aware that query performance optimization is an inexact science. This is a good reference. http://use-the-index-luke.com/

7749 questions
10
votes
2 answers

Optimal query to fetch a cumulative sum in MySQL

What is 'correct' query to fetch a cumulative sum in MySQL? I've a table where I keep information about files, one column list contains the size of the files in bytes. (the actual files are kept on disk somewhere) I would like to get the cumulative…
Jacco
  • 23,534
  • 17
  • 88
  • 105
10
votes
1 answer

RethinkDB - Find documents with missing field

I'm trying to write the most optimal query to find all of the documents that do not have a specific field. Is there any better way to do this than the examples I have listed below? // Get the ids of all documents missing…
vimfluencer
  • 3,106
  • 3
  • 17
  • 25
10
votes
2 answers

Does MySQL eliminate common subexpressions between SELECT and HAVING/GROUP BY clause

I often see people answer MySQL questions with queries like this: SELECT DAY(date), other columns FROM table GROUP BY DAY(date); SELECT somecolumn, COUNT(*) FROM table HAVING COUNT(*) > 1; I always like to give the column an alias and refer to…
Barmar
  • 741,623
  • 53
  • 500
  • 612
10
votes
1 answer

MySQL queries are very slow even using INNER JOIN instead of IN

I have three tables:product (10k records), product_attribute (4k records), and product_value (2m records). product and product_attribute are connected via product_value. I want to retrieve a specific product (e.g., product_id = 001) and its…
user3138073
  • 225
  • 1
  • 2
  • 7
10
votes
5 answers

SQLite3 query optimization join vs subselect

I am trying to figure out the very best way, (probably doesn't matter in this case) to find the rows of one table, based on the existence of a flag, and an relational id in a row in another table. here are the schemas: CREATE TABLE files ( id…
Grady Player
  • 14,399
  • 2
  • 48
  • 76
10
votes
1 answer

Why cardinality value in mysql indexes don't equal distinct count for column values

Not a long ago I started to optimize queries for mysql db. I created some indexes and decided to see their params, using show index from syntax. And saw that cardinality doesn't equal distinct count for some columns. So, I have the table CREATE…
Andrej Soroj
  • 1,103
  • 1
  • 9
  • 10
10
votes
1 answer

Setting enable_seqscan = off in a single SELECT query

After several tests on one of my queries (Postgres) I realized that by setting enable_seqscan = off, the query takes 1/3 of its original time (done using the psql console and EXPLAIN ANALYZE) Since it is not recommendable to change this setting for…
juakonn
  • 155
  • 1
  • 1
  • 9
10
votes
3 answers

Mysql optimization for REGEXP

This query (with different name instead of "jack") happens many times in my slow query log. Why? The Users table has many fields (more than these three I've selected) and about 40.000 rows. select name,username,id from Users where ( name…
Pons
  • 1,747
  • 1
  • 13
  • 19
9
votes
0 answers

Building a 1,000M row MySQL table

reposted on serverfault Questions Question 1: as the size of the database table gets larger how can I tune MySQL to increase the speed of the LOAD DATA INFILE call? Question 2: would using a cluster of computers to load different csv files, improve…
Ben
  • 1,030
  • 10
  • 23
9
votes
1 answer

Select 10 times slower when reversing order

Why would this select, call it A, (0.02406s): select * from char_kills where rid_first <= 110 and rid_last >= 110 order by kills desc limit 500; be 10 times slower than when sort order is reversed, call it B, (0.00229s): select * from…
Qtax
  • 33,241
  • 9
  • 83
  • 121
9
votes
2 answers

Same query - different execution plans

SQL 2008. I have a test table: create table Sale ( SaleId int identity(1, 1) constraint PK_Sale primary key, Test1 varchar(10) null, RowVersion rowversion not null constraint UQ_Sale_RowVersion unique ) I populate it…
Alex Aza
  • 76,499
  • 26
  • 155
  • 134
9
votes
1 answer

Hive Map-Join configuration mystery

Could someone clearly explain what is the difference between hive.auto.convert.join and hive.auto.convert.join.noconditionaltask configuration parameters? Also these corresponding size parameters: hive.mapjoin.smalltable.filesize and…
leftjoin
  • 36,950
  • 8
  • 57
  • 116
9
votes
4 answers

60 million entries, select entries from a certain month. How to optimize database?

I have a database with 60 million entries. Every entry contains: ID DataSourceID Some Data DateTime I need to select entries from certain month. Each month contains approximately 2 million entries. select * from Entries where time…
JBeurer
  • 1,707
  • 3
  • 19
  • 38
9
votes
1 answer

Is query cost the best metric for MySQL query optimization?

I am in the process of optimizing the queries in my MySQL database. While using Visual Explain and looking at various query costs, I'm repeatedly finding counter-intuitive values. Operations which use more efficient lookups (e.g. key lookup) seem…
Haydentech
  • 1,096
  • 1
  • 11
  • 27
9
votes
5 answers

Optimize Postgres deletion of orphaned records

Take the following two tables: Table "public.contacts" Column | Type | Modifiers | Storage | Stats target |…
user1032752
  • 751
  • 1
  • 11
  • 28