MySQL Query Optimization Techniques
A slow database query can bring an entire application to its knees. You might have perfectly written application code, but if your MySQL queries are inefficient, users will suffer with sluggish load times and you'll face escalating server costs. The good news: most query performance problems are fixable, and fixing them often yields dramatic improvements — not 10% faster, but 10x to 100x faster. This guide walks you through the most impactful MySQL optimization techniques every developer should know.
In most applications, 20% of queries are responsible for 80% of database load. Before optimizing anything, identify your slow queries using the MySQL slow query log or a profiling tool. Target the highest-impact queries first rather than optimizing everything blindly.
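If the slow query log isn't already enabled, you can usually switch it on at runtime without a restart. A minimal sketch (the variables are standard MySQL; the log file path is an example, not a required location):

```sql
-- Enable the slow query log at runtime (lasts until restart;
-- add the same settings to my.cnf to make them permanent)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- log anything slower than 1 second
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';  -- example path
```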
Understanding Query Execution with EXPLAIN
Before you can optimize a query, you need to understand how MySQL is actually executing it. The EXPLAIN statement is your most powerful diagnostic tool — it reveals the execution plan MySQL's query optimizer has chosen, including which indexes are used, how tables are joined, and how many rows MySQL estimates it needs to examine.
```sql
-- Basic EXPLAIN
EXPLAIN SELECT * FROM orders
WHERE customer_id = 42
  AND status = 'pending'
ORDER BY created_at DESC;

-- EXPLAIN FORMAT=JSON gives more detail
EXPLAIN FORMAT=JSON SELECT * FROM orders
WHERE customer_id = 42;

-- EXPLAIN ANALYZE actually runs the query and shows real statistics
-- (MySQL 8.0.18+)
EXPLAIN ANALYZE SELECT * FROM orders
WHERE customer_id = 42;
```
Reading EXPLAIN Output
The key columns to focus on in EXPLAIN output are:
| Column | What It Tells You | What to Watch For |
|---|---|---|
| `type` | Join/access type: how MySQL scans the table | Avoid `ALL` (full table scan); aim for `ref`, `eq_ref`, or `const` |
| `key` | Which index MySQL chose to use | `NULL` means no index used; likely needs one |
| `rows` | Estimated number of rows MySQL will examine | High row count relative to actual result set = inefficient query |
| `Extra` | Additional info about the operation | `Using filesort`, `Using temporary` indicate costly operations |
| `possible_keys` | Indexes MySQL considered | If `NULL`, no indexes are applicable; add one |
The access type hierarchy from best to worst: system → const → eq_ref → ref → range → index → ALL. A type: ALL means MySQL is reading every row in the table — almost always something you want to fix.
*MySQL's query execution pipeline: EXPLAIN reveals the optimizer's decisions*
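As a rough before-and-after illustration of the hierarchy (the table and row counts here are hypothetical; the exact plan depends on your data and statistics), adding an index typically moves the access type from ALL to ref:

```sql
-- Assume orders has no index on customer_id yet
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- typical plan: type: ALL, key: NULL, rows in the millions (full table scan)

CREATE INDEX idx_customer_id ON orders(customer_id);

EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- typical plan: type: ref, key: idx_customer_id, rows roughly equal
-- to that customer's order count (index lookup)
```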
Indexing: The Foundation of Query Performance
Indexes are the single most impactful optimization you can apply to MySQL. An index is a data structure that allows MySQL to find rows without scanning every row in the table — similar to a book index vs. reading every page. Without indexes, MySQL performs full table scans which become catastrophically slow as data grows.
Index Types and When to Use Them
```sql
-- Single-column index: good for simple equality lookups
CREATE INDEX idx_customer_id ON orders(customer_id);

-- Composite index: covers multiple WHERE conditions
-- Column order matters! Put most selective column first,
-- or put equality columns before range columns
CREATE INDEX idx_customer_status ON orders(customer_id, status);

-- Covering index: includes all columns needed by the query
-- MySQL can satisfy the query entirely from the index (no table lookup)
CREATE INDEX idx_covering ON orders(customer_id, status, created_at, total);

-- Prefix index for large VARCHAR columns
CREATE INDEX idx_email_prefix ON users(email(20));

-- Unique index: prevents duplicates AND speeds up lookups
CREATE UNIQUE INDEX idx_unique_email ON users(email);

-- Check existing indexes
SHOW INDEX FROM orders;

-- Check if a query uses an index efficiently
EXPLAIN SELECT * FROM orders WHERE customer_id = 42 AND status = 'pending';
```
Composite Index Column Ordering
The order of columns in a composite index critically affects performance. MySQL can only use a composite index from the leftmost column forward — this is the "leftmost prefix rule." A composite index on (customer_id, status, created_at) can be used for:
- Queries filtering on `customer_id` alone
- Queries filtering on `customer_id` + `status`
- Queries filtering on all three columns
- But NOT for queries filtering on `status` alone or `created_at` alone
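You can verify the leftmost prefix rule directly with EXPLAIN. A sketch against the orders table used throughout this guide (check the key column of each plan):

```sql
CREATE INDEX idx_cust_status_created ON orders(customer_id, status, created_at);

-- Can use the index (leftmost column present)
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
EXPLAIN SELECT * FROM orders WHERE customer_id = 42 AND status = 'pending';

-- Cannot use the index for filtering (leftmost column missing):
-- expect key: NULL and a full scan unless another index applies
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
EXPLAIN SELECT * FROM orders WHERE created_at >= '2026-01-01';
```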
Indexes slow down INSERT, UPDATE, and DELETE operations because MySQL must maintain them. Don't index every column — focus on columns used in WHERE, JOIN ON, and ORDER BY clauses. Also, wrapping a column in a function (e.g., WHERE YEAR(created_at) = 2026) prevents index usage — rewrite as a range condition instead.
Rewriting Functions That Block Index Usage
```sql
-- BAD: Function call prevents index usage
SELECT * FROM orders WHERE YEAR(created_at) = 2026;
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
SELECT * FROM products WHERE price + 10 > 100;

-- GOOD: Rewrite to allow index usage
SELECT * FROM orders
WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01';

-- For case-insensitive email: use a case-insensitive collation on the column
-- ALTER TABLE users MODIFY email VARCHAR(255) COLLATE utf8mb4_unicode_ci;
-- Then a simple equality works with the index:
SELECT * FROM users WHERE email = 'user@example.com';
-- (MySQL 8.0.13+ also supports functional indexes, which let the
--  original LOWER(email) predicate use an index:
--  CREATE INDEX idx_lower_email ON users((LOWER(email))); )

SELECT * FROM products WHERE price > 90; -- equivalent to price + 10 > 100, uses index
```
JOIN Optimization
JOINs are powerful but can be expensive if not written carefully. MySQL uses nested-loop joins — for each row in the driving table, it searches the joined table. If the joined table has no index on the join column, this becomes an O(n×m) full scan.
Always Index Join Columns
```sql
-- Ensure foreign key columns are indexed
-- The orders.customer_id column should have an index
-- (InnoDB creates one automatically for declared FOREIGN KEYs)
CREATE INDEX idx_orders_customer ON orders(customer_id);
CREATE INDEX idx_order_items_order ON order_items(order_id);

-- GOOD JOIN: both join columns are indexed
SELECT
    c.name,
    COUNT(o.id) AS order_count,
    SUM(o.total) AS total_spent
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id  -- indexed on both sides
WHERE o.status = 'completed'
  AND o.created_at >= '2026-01-01'
GROUP BY c.id, c.name
ORDER BY total_spent DESC
LIMIT 10;

-- Avoid joining on expressions or functions
-- BAD:
SELECT * FROM orders o
JOIN order_archive oa ON DATE(o.created_at) = DATE(oa.created_at);

-- GOOD: Compare raw column values
SELECT * FROM orders o
JOIN order_archive oa ON o.created_at = oa.created_at;
```
Subquery vs JOIN: Choosing Wisely
In MySQL 5.x, correlated subqueries were notoriously slow. In MySQL 8.0 the optimizer is smarter, but JOINs are often still faster for large datasets because they give the optimizer more flexibility in choosing the execution order.
```sql
-- SLOW: Correlated subquery runs once per row in customers
SELECT c.name
FROM customers c
WHERE (SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.id) > 5;

-- FAST: JOIN + GROUP BY runs once
SELECT c.name
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name
HAVING COUNT(o.id) > 5;

-- For NOT IN / NOT EXISTS, use LEFT JOIN ... IS NULL
-- (often faster than NOT IN with large sets; NOT IN also returns
--  zero rows if the subquery produces any NULLs, a common surprise)
-- SLOW:
SELECT * FROM customers WHERE id NOT IN (SELECT customer_id FROM orders);

-- FAST:
SELECT c.*
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE o.id IS NULL;
```
Writing Efficient WHERE Clauses
How you write your WHERE conditions has a massive impact on whether MySQL can use indexes and how many rows it must examine.
Use equality before range conditions in composite indexes
In a composite index (status, created_at), filter on status = 'active' first, then use a range on created_at. Putting the range column first wastes index efficiency.
Avoid leading wildcards in LIKE
WHERE name LIKE 'John%' can use an index. WHERE name LIKE '%John%' cannot — it forces a full scan. For full-text search needs, use MySQL's FULLTEXT indexes instead.
Use IN instead of OR for multiple values
WHERE status IN ('pending', 'processing') is typically more efficient than WHERE status = 'pending' OR status = 'processing' and is cleaner to read.
Avoid implicit type conversions
If user_id is an INT, querying WHERE user_id = '42' (string) forces MySQL to cast every row, preventing index use. Always match the data type of your literals to the column type.
```sql
-- Good: Uses index on (status, created_at)
SELECT * FROM orders
WHERE status = 'pending'            -- equality first
  AND created_at >= '2026-01-01'    -- range second
  AND created_at < '2026-04-01';

-- Good: Prefix LIKE uses index
SELECT * FROM users WHERE email LIKE 'mayur%';

-- Bad: Leading wildcard can't use index
-- SELECT * FROM users WHERE email LIKE '%@gmail.com';

-- Good: IN clause
SELECT * FROM products WHERE category_id IN (1, 3, 7, 12);

-- Bad: implicit type conversion (user_id is INT)
-- SELECT * FROM users WHERE user_id = '42';

-- Good: matched types
SELECT * FROM users WHERE user_id = 42;

-- Good: EXISTS is often faster than IN with large subqueries
SELECT * FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.customer_id = c.id
      AND o.status = 'completed'
);
```
ORDER BY, GROUP BY, and LIMIT Optimization
Sorting and grouping operations can be expensive because MySQL may need to build a temporary table and sort it — the dreaded Using filesort in EXPLAIN output. With the right indexes, MySQL can return sorted data directly from the index without any extra work.
Eliminating Filesort with Indexes
```sql
-- With this index, ORDER BY can be served from the index
-- (descending index keys require MySQL 8.0+)
CREATE INDEX idx_customer_created ON orders(customer_id, created_at DESC);

-- MySQL now reads rows already in the requested order
-- (no filesort, no temporary table)
SELECT id, total, created_at
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 20;

-- For pagination: use keyset pagination instead of OFFSET for large pages
-- BAD: OFFSET 10000 makes MySQL scan and discard 10000 rows
SELECT * FROM orders ORDER BY id DESC LIMIT 20 OFFSET 10000;

-- GOOD: Keyset pagination - cursor-based, fast regardless of page depth
SELECT * FROM orders
WHERE id < 98765  -- last seen id from previous page
ORDER BY id DESC
LIMIT 20;
```
Optimizing GROUP BY Aggregations
```sql
-- Add index that covers the GROUP BY column and aggregate target
CREATE INDEX idx_status_total ON orders(status, total);

-- This aggregation can now be answered from the index alone
SELECT status, COUNT(*) AS cnt, SUM(total) AS revenue
FROM orders
GROUP BY status;

-- Use WHERE to filter rows before grouping, HAVING to filter groups after
SELECT customer_id, COUNT(*) AS order_count
FROM orders
WHERE created_at >= '2026-01-01'  -- filter rows first (uses index)
GROUP BY customer_id
HAVING COUNT(*) >= 3;             -- filter groups after aggregation

-- For dashboard summary stats, consider summary tables
-- updated periodically instead of running heavy aggregations on demand
CREATE TABLE order_daily_summary (
    date DATE PRIMARY KEY,
    order_count INT NOT NULL DEFAULT 0,
    revenue DECIMAL(12,2) NOT NULL DEFAULT 0.00
);
```
A covering index contains all the columns a query needs, so MySQL can answer the query by reading only the index, never touching the base table rows (in InnoDB, the clustered index). In EXPLAIN, you'll see Using index in the Extra column. For read-heavy queries that run millions of times, a well-designed covering index can reduce query time by 5–20x.
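To confirm a query is covered, check Extra for Using index. A sketch using the idx_covering index created earlier in this guide:

```sql
-- Every selected column lives in
-- idx_covering(customer_id, status, created_at, total),
-- so MySQL never touches the base table
EXPLAIN SELECT customer_id, status, created_at, total
FROM orders
WHERE customer_id = 42 AND status = 'pending';
-- expect Extra: Using index

-- SELECT * breaks coverage: columns outside the index force row lookups
EXPLAIN SELECT * FROM orders
WHERE customer_id = 42 AND status = 'pending';
```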
MySQL Configuration Tuning
Even perfectly written queries can be slow if MySQL's server configuration is under-tuned. These server variables have the most impact on query performance for typical web applications.
| Variable | Default | Recommended Starting Point | Purpose |
|---|---|---|---|
| `innodb_buffer_pool_size` | 128M | 70–80% of total RAM | Caches data and indexes in memory; the most important setting |
| `innodb_io_capacity` | 200 | 1000–2000 (SSD) | Tells InnoDB how fast your disk is |
| `query_cache_type` | 0 (off) | 0 (keep off) | MySQL 8.0 removed the query cache; use application-level caching |
| `max_connections` | 151 | Match your connection pool size | Max simultaneous connections |
| `slow_query_log` | OFF | ON (in production) | Logs queries slower than `long_query_time` |
| `long_query_time` | 10 | 1 or 0.5 | Threshold (seconds) for slow query logging |
```ini
[mysqld]
# Set to 70-80% of server RAM (example: 8GB server)
innodb_buffer_pool_size = 6G

# For SSD storage
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000

# Enable slow query logging to find problems
slow_query_log = ON
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
log_queries_not_using_indexes = ON

# Avoid too many temp tables being written to disk
tmp_table_size = 64M
max_heap_table_size = 64M

# Connection settings
max_connections = 200
wait_timeout = 300
interactive_timeout = 300
```
Finding Slow Queries
```sql
-- Find the top slow queries from Performance Schema
-- (timer columns are in picoseconds; dividing by 1e9 yields milliseconds)
SELECT
    digest_text,
    count_star AS executions,
    ROUND(avg_timer_wait / 1e9, 2) AS avg_ms,
    ROUND(sum_timer_wait / 1e9, 2) AS total_ms,
    sum_rows_examined AS rows_examined,
    sum_rows_sent AS rows_sent
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;

-- Check tables with no primary key (very slow for joins/updates)
SELECT table_schema, table_name
FROM information_schema.tables t
WHERE table_schema NOT IN ('mysql','information_schema','performance_schema','sys')
  AND t.table_type = 'BASE TABLE'  -- exclude views
  AND NOT EXISTS (
      SELECT 1 FROM information_schema.statistics s
      WHERE s.table_schema = t.table_schema
        AND s.table_name = t.table_name
        AND s.index_name = 'PRIMARY'
  );

-- Use pt-query-digest (Percona Toolkit) to analyze the slow query log
-- pt-query-digest /var/log/mysql/slow.log | head -100
```
Advanced Techniques
Partitioning for Large Tables
Table partitioning splits a large table into physical sections based on a partition key. When your query includes the partition key in its WHERE clause, MySQL only scans the relevant partitions (partition pruning), dramatically reducing I/O for time-series and archival data.
```sql
-- Partition a large events table by year
CREATE TABLE events (
    id BIGINT AUTO_INCREMENT,
    user_id INT NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    created_at DATETIME NOT NULL,
    payload JSON,
    PRIMARY KEY (id, created_at)  -- partition column must be part of every unique key
)
PARTITION BY RANGE (YEAR(created_at)) (
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION p2025 VALUES LESS THAN (2026),
    PARTITION p2026 VALUES LESS THAN (2027),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);

-- Query with partition key: MySQL only scans p2026
SELECT * FROM events
WHERE created_at >= '2026-01-01'
  AND created_at < '2027-01-01'
  AND event_type = 'purchase';

-- Check partition pruning: in MySQL 8.0, plain EXPLAIN always shows
-- a partitions column (the EXPLAIN PARTITIONS syntax was removed in 8.0)
EXPLAIN SELECT * FROM events
WHERE created_at >= '2026-01-01';

-- Drop old partition (instant, no row-by-row DELETE)
ALTER TABLE events DROP PARTITION p2023;
```
Using Generated Columns for Computed Indexes
```sql
-- Problem: You need to filter by year of created_at frequently
-- but YEAR(created_at) in WHERE prevents index use

-- Solution: Virtual generated column + index
ALTER TABLE orders
    ADD COLUMN created_year SMALLINT AS (YEAR(created_at)) VIRTUAL,
    ADD INDEX idx_created_year (created_year);

-- Now this query uses the index!
SELECT * FROM orders WHERE created_year = 2026;

-- JSON field indexing with generated columns
-- (the ->> operator already unquotes, so no JSON_UNQUOTE wrapper is needed)
ALTER TABLE products
    ADD COLUMN brand VARCHAR(100) AS (metadata->>'$.brand') STORED,
    ADD INDEX idx_brand (brand);

SELECT * FROM products WHERE brand = 'Sony';
```
MySQL Optimization Checklist
| Area | Checklist Item | Priority |
|---|---|---|
| Indexing | All foreign keys are indexed | Critical |
| Indexing | Composite indexes ordered with equality before range columns | High |
| Queries | No functions wrapping indexed columns in WHERE clauses | High |
| Queries | LIKE patterns don't start with a wildcard | High |
| Pagination | Large-offset pagination replaced with keyset/cursor pagination | Medium |
| Config | innodb_buffer_pool_size set to 70–80% of RAM | Critical |
| Monitoring | Slow query log enabled and regularly reviewed | High |
| Schema | All tables have a primary key | Critical |
| Schema | Column types are the smallest that fit the data | Medium |
| Joins | JOIN columns are indexed on both tables | High |
Putting It All Together
MySQL query optimization is not a one-time task — it's an ongoing process as your data grows and usage patterns change. The key is to measure first, then optimize. Use EXPLAIN to understand how queries execute, enable the slow query log to find problems in production, and profile before and after each change to verify the improvement.
Key Takeaways
- EXPLAIN is your best friend: Run it before changing anything. Look for `type: ALL` and `Using filesort` as top targets.
- Indexes drive 80% of query improvement: Make sure foreign keys, WHERE columns, and JOIN columns are indexed. Composite indexes beat multiple single-column indexes for multi-condition queries.
- Column order in composite indexes matters: Put equality-filtered columns before range-filtered ones; follow the leftmost prefix rule.
- Avoid defeating indexes: Don't wrap indexed columns in functions, use leading wildcards, or create implicit type mismatches.
- Use keyset pagination: Replace `OFFSET` with cursor-based pagination for deep pages; it's orders of magnitude faster.
- Tune innodb_buffer_pool_size: This single setting has the biggest impact on InnoDB read performance. Set it to 70–80% of available RAM.
- Partition large tables: For time-series data exceeding 50–100M rows, partitioning lets MySQL skip irrelevant partitions entirely.
> "The best query is the one that reads the fewest rows to get the right answer. Indexes make that possible."
Start with the slow query log, pick your worst offenders, run EXPLAIN, add the right indexes, and measure. Repeat. Most applications see their database bottlenecks resolve dramatically with just a handful of well-placed indexes and a few query rewrites. The techniques in this guide give you everything you need to make that happen systematically.
