Questions tagged [query-optimization]

For questions about making database queries run faster. PLEASE INCLUDE QUERY PLANS, AND TABLE AND INDEX DEFINITIONS in your question. Please use additional tags to specify your query language and server maker. Some of these questions may belong on https://DBA.stackexchange.com, especially if they involve reindexing, query plans, etc.

Please include table and index definitions for all relevant tables in your question. Please also include the query plan: the EXPLAIN output. Read on for more explanation.

If you use this is a good guide to pulling together the right information for your question.

SQL is a declarative language that can have many ways to find the same data. As a SQL programmer, you declare what you want from it, and it figures out how to get your result. Because of how databases are structured, and how they retrieve data, the same results can be retrieved in various ways that the query interpreter needs to choose between.

Using sargable predicates, using indexes, and optimizing how the query works can make it run more quickly. Restructuring the query, removing unnecessary portions of the query, or using different features of different Database Engines can all be helpful.

If the question is not (just) about how to write the query, but involves how to create new indexes, or how to optimize the server, you may get good answers on https://DBA.stackexchange.com.

Your database's query planner, also known as its query optimizer module attempts to determine the most efficient way to execute a given query by considering possible query plans.

You'll hear a lot about query planners in this tag. Each DBMS software product has its own query planner. Here's a description of one query planner.

How to ask a good query-optimization question

If you're asking "Why is my SQL query so slow?" or something like that, you need to do a little extra homework to get good answers from the Stack Overflow community. Please include this information.

What database and version?

Put this tag on your question along with the tag for your database: , . , or whatever database you use. Don't use more than one database tag, please. Optimization works somewhat differently from database to database. And, please mention the version of your database server.

How many rows and how fast?

Please figure out approximately how many rows you expect in your result set, and approximately how many rows are in each table involved with the query. Mention those numbers in your question. Please mention how long you hope your query needs to take, and how long it's taking. (That is, tell us what you mean by slow).

Don't oversimplify

Don't try to simplify your question by replacing your actual SELECT column lists with *. Using * in SELECT clauses can be harmful to query performance compared to specifying the list of columns you need.

Show your table definitions, with indexes

In your question include your table definitions and index definitions for each table and view involved in your query. Make sure your question shows all indexes in all relevant tables. When you're asking a performance question, please don't oversimplify your question by omitting indexes or columns you think might be irrelevant. The strangest things can be relevant to performance!

Very important: the names and data types of columns included in each index, with their order.

Show your database's explanation of how it satisfied your query

In your question please show the EXPLAIN PLAN output (it's just EXPLAIN or SHOWPLAN in some table server makes and models) for your query.

Things to know

  • Creating an appropriate index for a slow query is often, not always, a good solution to a problem. Answers on will often contain a suggestion for an index.

  • Avoid creating indexes on lots of single columns. Single-column indexes are often not very useful to speed up complex queries.

  • As your database grows you will, most likely, discover new slow queries. Operations can be trivial on hundreds of rows, but much more time-consuming on tens of thousands of rows. In a growing database, plan to review performance issues routinely.

  • If you use SSMS -- Microsoft's SQL Server Management Studio -- this tip is for you: Right-click in a query window, then select Show Actual Execution Plan, then run the query. The execution plan display sometimes recommends a new index to create.

Things you might do when optimizing queries

  1. Join ordering
  2. Query planning for nested SQL queries
  3. Cost estimation
  4. Parametric Query Optimization
  5. Multi-Objective Query Optimization
  6. Multi-Objective Parametric Query Optimization

Reference

Please be aware that query performance optimization is an inexact science. This is a good reference. http://use-the-index-luke.com/

7749 questions
17
votes
4 answers

Optimizing Oracle CONNECT BY when used with WHERE clause

Oracle START WITH ... CONNECT BY clause is applied before applying WHERE condition in the same query. Thus, WHERE constraints won't help optimize CONNECT BY. For example, the following query will likely perform full table scan (ignoring selectivity…
topchef
  • 19,091
  • 9
  • 63
  • 102
16
votes
2 answers

bson.D vs bson.M for find queries

This specifc question is in relation to using mongodb with the golang package mongo-driver, but I would assume this applies across most interfaces with mongodb. When using Find to query some data from a collection, we can use both the bson.M- and…
abcalphabet
  • 1,158
  • 3
  • 16
  • 31
16
votes
6 answers

Mysql Improve Search Performance with wildcards (%%)

Below is a query I use for searching a person by email SELECT * FROM phppos_customers JOIN phppos_people ON phppos_customers.person_id = phppos_people.person_id WHERE deleted = 0 AND email LIKE '%f%' ORDER BY email ASC Will adding…
Chris Muench
  • 17,444
  • 70
  • 209
  • 362
16
votes
2 answers

Left join or select from multiple table using comma (,)

I'm curious as to why we need to use LEFT JOIN since we can use commas to select multiple tables. What are the differences between LEFT JOIN and using commas to select multiple tables. Which one is faster? Here is my code: SELECT mw.*, …
bbnn
  • 3,505
  • 10
  • 50
  • 68
16
votes
10 answers

optimize mysql count query

Is there a way to optimize this further or should I just be satisfied that it takes 9 seconds to count 11M rows ? devuser@xcmst > mysql --user=user --password=pass -D marctoxctransformation -e "desc record_updates" …
andersonbd1
  • 5,266
  • 14
  • 44
  • 62
16
votes
1 answer

Why doesn't SQL Server use the index on the computed column?

Given the following in a SQL Server 2014 DB: create table t ( c1 int primary key, c2 datetime2(7), c3 nvarchar(20), c4 as cast(dbo.toTimeZone(c2, c3, 'UTC') as date) persisted ); create index i on t (c4); declare @i int =…
16
votes
3 answers

Persistence Database(MySQL/MongoDB/Cassandra/BigTable/BigData) Vs Non-Persistence Array (PHP/PYTHON)

How beneficial will it be to use Python/PHP Nonpersistent array for storing 6GB+ data with 800+ million rows in RAM, rather than using MySQL/MongoDB/Cassandra/BigTable/BigData(Persistence Database) database when it comes to speed/latency in simple…
16
votes
2 answers

Spark SQL: how to cache sql query result without using rdd.cache()

Is there any way to cache a cache sql query result without using rdd.cache()? for examples: output = sqlContext.sql("SELECT * From people") We can use output.cache() to cache the result, but then we cannot use sql query to deal with it. So I want…
lwwwzh
  • 225
  • 1
  • 2
  • 9
16
votes
5 answers

T-SQL Where Clause Case Statement Optimization (optional parameters to StoredProc)

I've been battling this one for a while now. I have a stored proc that takes in 3 parameters that are used to filter. If a specific value is passed in, I want to filter on that. If -1 is passed in, give me all. I've tried it the following two…
IronicMuffin
  • 4,182
  • 12
  • 47
  • 90
16
votes
4 answers

SQL Wildcard Search - Efficiency?

There has been a debate at work recently at the most efficient way to search a MS SQL database using LIKE and wildcards. We are comparing using %abc%, %abc, and abc%. One person has said that you should always have the wildcard at the end of the…
15
votes
2 answers

How is ColumnarToRow an efficient operation in Spark

In my understanding columnar format is better for MapReduce tasks. Even for something like selection of some columns, columnar works well as we don't have to load other columns into memory. But in Spark 3.0 I'm seeing this ColumnarToRow operation…
kar09
  • 411
  • 3
  • 11
15
votes
2 answers

Index scan backward vs index scan

While troubleshooting a server with very high I/O wait, I notice there is a lot of I/O coming from queries which do SELECT max(x) FROM t WHERE y = ?. My index is btree (x, y). I notice that the query plan does Index Scan Backward to get the max. Is…
ibz
  • 44,461
  • 24
  • 70
  • 86
15
votes
2 answers

Weighted average calculation in MySQL?

I am currently using the following query to get some numbers: SELECT gid, count(gid), (SELECT cou FROM size WHERE gid = infor.gid) FROM infor WHERE id==4325 GROUP BY gid; The output I am getting at my current stage is the…
Legend
  • 113,822
  • 119
  • 272
  • 400
15
votes
4 answers

What is the best way to implement a substring search in SQL?

We have a simple SQL problem here. In a varchar column, we wanted to search for a string anywhere in the field. What is the best way to implement this for performance? Obviously an index is not going to help here, any other tricks? We are using…
erotsppa
  • 14,248
  • 33
  • 123
  • 181
15
votes
3 answers

Why does this sql query do a key lookup?

I have a table User with a bunch of indexes. One of them is a unique index on the AccountIdentifier column. Since this is a unique index, why is a key lookup required in addition to the index seek? The index seek tooltip reports that only one record…
BC.
  • 24,298
  • 12
  • 47
  • 62