MySQL: High-Performance Relational Databases
A deep technical guide to MySQL performance at scale. Covers schema design, normalization, indexing strategies, query optimization with EXPLAIN, connection pooling, replication, backup, partitioning, stored procedures, transactions, isolation levels, TypeORM and Prisma ORM integration, JSON columns, window functions, and CTEs.
Table of Contents
- 1. Schema Design and Normalization
- 2. Indexing Strategies
- 3. Query Optimization
- 4. Connection Pooling
- 5. Replication
- 6. Backup and Recovery
- 7. Partitioning
- 8. Transactions and Isolation Levels
- 9. Stored Procedures
- 10. JSON Columns, Window Functions, and CTEs
- 11. TypeORM Integration with Node.js
- 12. Prisma ORM
- 13. Real-World Experience
- 14. MySQL Version Strategy and Modern Features
1. Schema Design and Normalization
Good schema design is the foundation of database performance. A poorly designed schema cannot be fixed by adding indexes or tuning queries. Start with proper normalization and denormalize only when you have measured evidence that it is needed.
- 1NF: Every column holds atomic values. No repeating groups. Every row is unique (has a primary key)
- 2NF: No partial dependencies. Every non-key column depends on the entire primary key, not just part of it
- 3NF: No transitive dependencies. Non-key columns depend only on the primary key, not on other non-key columns
- Data types matter: Use the smallest type that fits. INT (4 bytes) vs BIGINT (8 bytes). VARCHAR(255) vs VARCHAR(50). TIMESTAMP (4 bytes, UTC-converted) vs DATETIME (5 bytes in 5.6+, no timezone conversion)
- UUIDs vs auto-increment: Auto-increment INTs are faster for indexing (sequential inserts). UUIDs cause random B-tree page splits. If you need UUIDs, use ordered UUIDs (uuid_to_bin with swap flag)
- Soft deletes: Add deleted_at DATETIME NULL instead of hard deletes. Always include WHERE deleted_at IS NULL in queries. Consider partial indexes
-- Well-designed schema example
CREATE TABLE users (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(255) NOT NULL,
name VARCHAR(100) NOT NULL,
status ENUM('active','inactive','suspended') NOT NULL DEFAULT 'active',
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
deleted_at DATETIME NULL,
UNIQUE KEY uk_email (email),
KEY idx_status_created (status, created_at)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE bookings (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
user_id INT UNSIGNED NOT NULL,
class_id INT UNSIGNED NOT NULL,
booked_at DATETIME NOT NULL,
status ENUM('confirmed','cancelled','completed') NOT NULL DEFAULT 'confirmed',
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
KEY idx_user_status (user_id, status),
KEY idx_class_booked (class_id, booked_at),
CONSTRAINT fk_bookings_user FOREIGN KEY (user_id) REFERENCES users(id),
CONSTRAINT fk_bookings_class FOREIGN KEY (class_id) REFERENCES classes(id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
2. Indexing Strategies
B-tree Indexes
B-tree is MySQL's default index type. It maintains sorted order and supports range queries, ordering, and grouping. Understanding how B-tree indexes work is critical for writing fast queries.
- Leftmost prefix rule: A composite index on (a, b, c) can be used for queries filtering on (a), (a, b), or (a, b, c) -- but NOT (b), (c), or (b, c)
- Column order matters: Put the most selective column first (highest cardinality). Exception: if one column is always used in equality and another in range, put equality first
- Covering indexes: If an index contains all columns needed by a query, MySQL reads only the index (no table lookup). EXPLAIN shows "Using index" in Extra column
- Index selectivity: ratio of distinct values to total rows. A boolean column has only two distinct values, so its selectivity is near zero on a large table -- a poor index candidate alone. Combine with other columns
- Index overhead: Every index slows down INSERT, UPDATE, DELETE operations. Write-heavy tables should have minimal indexes. Read-heavy tables benefit from more indexes
-- Composite index for a common query pattern
-- Query: SELECT * FROM bookings WHERE user_id = ? AND status = 'confirmed' ORDER BY booked_at DESC
-- Optimal index:
CREATE INDEX idx_user_status_booked ON bookings (user_id, status, booked_at);
-- The index covers: WHERE (equality on user_id, status) + ORDER BY (booked_at)
-- MySQL can satisfy the entire query from this single index scan
-- Covering index example
-- Query: SELECT id, email, name FROM users WHERE status = 'active' ORDER BY created_at DESC LIMIT 20
CREATE INDEX idx_status_created_covering ON users (status, created_at, id, email, name);
-- All selected columns are in the index = no table lookup needed
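The selectivity ratio above can be sketched in plain JavaScript (a hypothetical helper, not a MySQL API); in practice you would measure it with SELECT COUNT(DISTINCT col) / COUNT(*) FROM t:

```javascript
// Rough index-selectivity estimate: distinct values / total rows.
function selectivity(values) {
  if (values.length === 0) return 0;
  return new Set(values).size / values.length;
}

// A boolean-like column over 1000 rows: only 2 distinct values.
const flags = Array.from({ length: 1000 }, (_, i) => i % 2);
console.log(selectivity(flags)); // 0.002 -- poor candidate on its own

// A near-unique column such as email.
const emails = Array.from({ length: 1000 }, (_, i) => `user${i}@example.com`);
console.log(selectivity(emails)); // 1 -- excellent candidate
```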
Hash Indexes
Hash indexes use a hash function for O(1) equality lookups. In InnoDB, the adaptive hash index (AHI) is built automatically on frequently accessed B-tree pages. MEMORY tables support explicit hash indexes.
- Equality only: Hash indexes support = and IN() but not range queries (<, >, BETWEEN), ordering, or partial matches
- Adaptive Hash Index (AHI): InnoDB automatically builds hash indexes on hot B-tree leaf pages. Monitor with SHOW ENGINE INNODB STATUS. Disable with innodb_adaptive_hash_index=OFF if contention is high
- MEMORY engine: Explicit hash indexes with USING HASH. Useful for lookup tables that fit entirely in memory. Data is lost on restart
-- MEMORY table with hash index (useful for session/cache lookup)
CREATE TABLE session_cache (
session_id VARCHAR(64) NOT NULL,
user_id INT UNSIGNED NOT NULL,
data TEXT,
PRIMARY KEY (session_id) USING HASH
) ENGINE=MEMORY;
-- Check adaptive hash index status
SHOW ENGINE INNODB STATUS\G
-- Look for "Hash table size", "hash searches/s", "non-hash searches/s"
Partial and Prefix Indexes
Partial indexes (functional indexes in MySQL 8.0+) index only a subset of rows or a computed expression. Prefix indexes index only the first N characters of a string column, reducing index size.
- Functional indexes (8.0+): Index an expression like ((CAST(metadata->>'$.type' AS CHAR(20)))). Useful for JSON fields or computed values
- Prefix indexes: CREATE INDEX idx_email ON users (email(20)) indexes only the first 20 characters. Smaller index, but it cannot serve as a covering index and may have lower selectivity
- Simulated partial indexes: MySQL does not natively support a WHERE clause in CREATE INDEX. Simulate with a generated column: ALTER TABLE orders ADD is_pending TINYINT GENERATED ALWAYS AS (IF(status='pending',1,NULL)) STORED, ADD INDEX idx_pending (is_pending)
-- Prefix index for long string columns
CREATE INDEX idx_url_prefix ON pages (url(50));
-- Functional index on JSON field (MySQL 8.0+)
ALTER TABLE events ADD INDEX idx_event_type ((CAST(metadata->>'$.event_type' AS CHAR(30))));
-- Simulated partial index for soft deletes
ALTER TABLE users
ADD active_flag TINYINT GENERATED ALWAYS AS (IF(deleted_at IS NULL, 1, NULL)) STORED,
ADD INDEX idx_active_users (active_flag, created_at);
Common Indexing Pitfalls
- Functions on indexed columns: WHERE YEAR(created_at) = 2026 cannot use an index on created_at. Rewrite as: WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01'
- Implicit type casting: WHERE phone = 12345 when phone is VARCHAR forces a full table scan. Always match types
- LIKE with leading wildcard: WHERE name LIKE '%smith' cannot use an index. Only LIKE 'smith%' uses the index
- OR conditions: WHERE a = 1 OR b = 2 often cannot use a composite index on (a, b). Use UNION instead, or create separate indexes
- Too many indexes: Each index is a separate B-tree that must be maintained on every write. More than 5-6 indexes on a table is a code smell
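The first pitfall above can be handled application-side by building the half-open range once. A minimal sketch (buildYearRange is a hypothetical helper; the ? placeholder style assumes a mysql2-like client):

```javascript
// Rewrite WHERE YEAR(created_at) = ? into an index-friendly half-open range.
function buildYearRange(year) {
  return {
    sql: 'created_at >= ? AND created_at < ?',
    params: [`${year}-01-01`, `${year + 1}-01-01`]
  };
}

const { sql, params } = buildYearRange(2026);
console.log(sql);    // created_at >= ? AND created_at < ?
console.log(params); // [ '2026-01-01', '2027-01-01' ]
```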
3. Query Optimization
EXPLAIN and Slow Query Log
EXPLAIN is the most important tool for MySQL performance. It shows how MySQL will execute a query: which indexes it uses, how many rows it examines, and the join strategy.
- type column: const (PK lookup) > eq_ref (unique join) > ref (non-unique index) > range (index range) > index (full index scan) > ALL (full table scan). Anything worse than "range" is suspect
- rows column: Estimated number of rows MySQL will examine. Compare with actual result set. Large ratio = inefficient query
- Extra column: "Using index" = covering index (good). "Using filesort" = sorting without index (often bad). "Using temporary" = temp table created (bad for large sets)
- EXPLAIN ANALYZE (8.0+): Actually executes the query and shows real timing per step. Best for comparing query plans
- Slow query log: Enable with
slow_query_log=1, setlong_query_time=0.5(500ms). Usept-query-digest(Percona Toolkit) to analyze patterns
-- EXPLAIN example
EXPLAIN SELECT b.id, b.booked_at, u.name, u.email
FROM bookings b
JOIN users u ON u.id = b.user_id
WHERE b.class_id = 42
AND b.status = 'confirmed'
AND b.booked_at >= '2026-03-01'
ORDER BY b.booked_at DESC
LIMIT 20;
-- Ideal EXPLAIN output:
-- +----+-------+------+--------------------------+------+----------+-------------+
-- | id | table | type | key | rows | filtered | Extra |
-- +----+-------+------+--------------------------+------+----------+-------------+
-- | 1 | b | ref | idx_class_status_booked | 85 | 100.00 | Using where |
-- | 1 | u | eq_ref | PRIMARY | 1 | 100.00 | NULL |
-- +----+-------+------+--------------------------+------+----------+-------------+
-- For this query, the optimal index is:
CREATE INDEX idx_class_status_booked ON bookings (class_id, status, booked_at);
Optimization Patterns
- Pagination: Avoid
OFFSET 10000(MySQL reads and discards 10000 rows). Use keyset pagination:WHERE id > last_seen_id ORDER BY id LIMIT 20 - COUNT optimization:
SELECT COUNT(*) FROM large_table WHERE status = 'active'is slow without covering index. Add index on (status) or maintain counters in a summary table - JOIN optimization: Ensure joined columns are indexed. Smaller table should be the driving table. Avoid joining more than 3-4 tables in one query
- Subquery vs JOIN: In MySQL 8.0+, the optimizer handles most subqueries well. But correlated subqueries (referencing outer query) can still be slow -- rewrite as JOINs
- SELECT only needed columns:
SELECT *fetches all columns, preventing covering index usage. List only the columns you need - Batch operations: Instead of 1000 individual INSERTs, use multi-row INSERT:
INSERT INTO t VALUES (1,'a'), (2,'b'), .... 10-50x faster
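Two of the patterns above (keyset pagination and multi-row INSERT) are easy to get wrong by hand. Here is a minimal SQL-building sketch, assuming a mysql2-style client that takes (sql, params); helper names are illustrative, and table/column names must come from trusted code, never user input:

```javascript
// Keyset pagination: seek past the last id instead of using OFFSET.
function keysetPage(lastSeenId, limit) {
  return {
    sql: 'SELECT id, booked_at FROM bookings WHERE id > ? ORDER BY id LIMIT ?',
    params: [lastSeenId, limit]
  };
}

// Multi-row INSERT: one statement for N rows instead of N round trips.
function batchInsert(table, columns, rows) {
  const tuple = `(${columns.map(() => '?').join(', ')})`;
  return {
    sql: `INSERT INTO ${table} (${columns.join(', ')}) VALUES ${rows.map(() => tuple).join(', ')}`,
    params: rows.flat()
  };
}

console.log(batchInsert('t', ['a', 'b'], [[1, 'x'], [2, 'y']]).sql);
// INSERT INTO t (a, b) VALUES (?, ?), (?, ?)
```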
4. Connection Pooling
Each MySQL connection consumes ~10MB of memory on the server. Without connection pooling, a microservice architecture with 20 services x 10 instances = 200 connections can exhaust the database. Connection pool exhaustion is one of the most common production incidents.
- Pool size formula: Total connections = number_of_services x instances_per_service x pool_size_per_instance. Must be less than MySQL's max_connections (default: 151)
- Connection lifecycle: Idle connections are kept alive in the pool. Set waitForConnections: true so requests queue instead of failing when the pool is full
- Connection timeout: Set connectTimeout (time to establish a connection) and acquireTimeout (time to get a connection from the pool). Both should be short (5-10 seconds)
- Idle timeout: Close connections idle longer than wait_timeout (MySQL default: 28800s / 8 hours). Reduce to 300-600s to reclaim resources
- Connection validation: Enable a SELECT 1 ping before checkout to detect stale connections. Small overhead, but it prevents "connection lost" errors
- Read/write splitting: Route reads to the replica endpoint, writes to the primary endpoint. Doubles effective connection capacity
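The pool-size formula above is worth wiring into a startup sanity check. A sketch (the numbers are illustrative; 151 is MySQL's default max_connections, which you should confirm with SHOW VARIABLES):

```javascript
// Total connections the whole fleet can open against the database.
function totalConnections(services, instancesPerService, poolSizePerInstance) {
  return services * instancesPerService * poolSizePerInstance;
}

const MAX_CONNECTIONS = 151; // MySQL default; check SHOW VARIABLES LIKE 'max_connections'
const total = totalConnections(20, 10, 10);
console.log(total); // 2000 -- far beyond the default limit
console.log(total < MAX_CONNECTIONS ? 'OK' : 'would exhaust max_connections');
```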
// Node.js connection pool configuration (mysql2)
import mysql from 'mysql2/promise';
const writePool = mysql.createPool({
host: process.env.DB_WRITER_HOST, // Aurora cluster endpoint
port: 3306,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
database: 'myapp',
connectionLimit: 10, // Max connections in this pool
waitForConnections: true, // Queue when pool is full
queueLimit: 50, // Max queued requests (0 = unlimited)
connectTimeout: 5000, // 5s to establish connection
enableKeepAlive: true,
keepAliveInitialDelay: 30000 // TCP keepalive every 30s
});
const readPool = mysql.createPool({
host: process.env.DB_READER_HOST, // Aurora reader endpoint
port: 3306,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
database: 'myapp',
connectionLimit: 20, // More connections for reads
waitForConnections: true,
connectTimeout: 5000,
enableKeepAlive: true
});
// Monitor pool health. Note: the underscore-prefixed fields below are
// mysql2 internals, not public API, and may change between versions.
setInterval(() => {
const stats = writePool.pool;
console.log({
active: stats._allConnections.length,
idle: stats._freeConnections.length,
queued: stats._connectionQueue.length
});
}, 30000);
}, 30000);
5. Replication
Replication copies data from a primary (master) to one or more replicas (slaves). It enables read scaling, high availability, and geographic distribution.
- Asynchronous replication: Default MySQL mode. Primary does not wait for replicas to acknowledge. Risk: data loss if primary fails before replica catches up
- Semi-synchronous replication: Primary waits for at least one replica to acknowledge before committing. Reduces data loss risk. Adds ~1ms latency per commit
- GTID replication (MySQL 5.6+): Global Transaction Identifiers make failover simpler. Each transaction has a unique ID. Replicas can auto-position without specifying binlog file/position
- Row-based vs statement-based: Row-based (ROW) replicates actual data changes -- safer and more predictable. Statement-based (STATEMENT) replicates SQL statements -- less bandwidth but can produce different results on replicas
- Replication lag: Monitor with SHOW SLAVE STATUS\G (SHOW REPLICA STATUS in MySQL 8.0.22+) and the Seconds_Behind_Master column. Read-after-write consistency requires reading from the primary if lag is unacceptable
- Aurora replication: Shared storage layer eliminates replication lag issues. All replicas read from the same storage volume. Failover in <30 seconds without data loss
- Group replication (MySQL 5.7.17+): Multi-primary or single-primary mode. All members replicate via a group communication protocol (Paxos-based). Automatic conflict detection and resolution. Foundation for MySQL InnoDB Cluster
- Aurora Global Database: Cross-region replication with <1 second lag. Dedicated infrastructure for disaster recovery. Managed failover promotes a secondary region to primary in under 1 minute
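The read-after-write caveat above is commonly handled by pinning a user's reads to the primary for a short window after any write. A minimal in-memory sketch (the helper names and the 1-second window are assumptions; real systems usually keep the last-write timestamp in the session or a shared cache):

```javascript
const STICKY_MS = 1000; // assumed upper bound on acceptable replica lag
const lastWriteAt = new Map(); // userId -> epoch ms of that user's last write

function recordWrite(userId, now = Date.now()) {
  lastWriteAt.set(userId, now);
}

// Decide which endpoint a read for this user should go to.
function poolFor(userId, now = Date.now()) {
  const t = lastWriteAt.get(userId);
  return t !== undefined && now - t < STICKY_MS ? 'primary' : 'replica';
}

recordWrite(7, 1000);
console.log(poolFor(7, 1500)); // primary (within the sticky window)
console.log(poolFor(7, 3000)); // replica (window has passed)
console.log(poolFor(8, 1500)); // replica (this user never wrote)
```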
6. Backup and Recovery
- Logical backups (mysqldump): Exports SQL statements. Portable across versions. Slow for large databases (>10GB). Use --single-transaction for InnoDB (consistent snapshot without locking)
- Physical backups (xtrabackup): Copies data files directly. Fast for large databases. Use Percona XtraBackup for hot backups without locking. Supports incremental backups
- Aurora backups: Continuous backup to S3. Automated with 1-35 day retention. Point-in-time recovery (PITR) to any second within retention window. Backtrack for instant rewind without restore
- Backup validation: Regularly restore backups to a test environment. An untested backup is not a backup. Automate this with a weekly cron job
- Binary log retention: Keep binlogs for at least 7 days for point-in-time recovery. Set binlog_expire_logs_seconds=604800 (7 days in seconds)
- Cross-region backups: For disaster recovery, copy snapshots to another region. Aurora supports cross-region snapshot copy and Global Database
# mysqldump with best practices
# --single-transaction : consistent snapshot (InnoDB only)
# --routines/--triggers/--events : include stored procedures, triggers, scheduled events
# --set-gtid-purged=OFF : avoid GTID issues on import
# (Comments must not follow a line-continuation backslash, so they live up here.)
mysqldump \
  --single-transaction \
  --routines \
  --triggers \
  --events \
  --set-gtid-purged=OFF \
  --max-allowed-packet=256M \
  --databases myapp \
  | gzip > myapp_$(date +%Y%m%d_%H%M%S).sql.gz
# Percona XtraBackup (full + incremental)
xtrabackup --backup --target-dir=/backup/full --user=backup --password=$PW
xtrabackup --backup --target-dir=/backup/inc1 --incremental-basedir=/backup/full
7. Partitioning
Partitioning splits a large table into smaller physical pieces while presenting it as a single logical table. It improves query performance on very large tables by enabling partition pruning -- MySQL only scans the relevant partitions.
- RANGE partitioning: Rows are assigned to partitions based on a column value falling within a range. Ideal for time-series data: partition by month or year. Drop old partitions instantly instead of DELETE
- LIST partitioning: Rows are assigned based on column values matching a list. Good for categorical data like country or status codes
- HASH partitioning: Distributes rows evenly across N partitions using a modulus function. Useful when there is no natural range or list boundary
- KEY partitioning: Like HASH but MySQL manages the hash function internally. Works with any column type including strings
- Partition pruning: Queries that include the partition key in the WHERE clause only scan matching partitions. Always include the partition key in queries for maximum benefit
- Limitations: Foreign keys are not supported on partitioned tables. All unique keys (including primary) must include the partition key. Maximum 8192 partitions per table
-- RANGE partitioning by month (ideal for booking/event data)
CREATE TABLE booking_logs (
id BIGINT UNSIGNED AUTO_INCREMENT,
user_id INT UNSIGNED NOT NULL,
action VARCHAR(50) NOT NULL,
created_at DATETIME NOT NULL,
PRIMARY KEY (id, created_at),
KEY idx_user (user_id, created_at)
) PARTITION BY RANGE (YEAR(created_at) * 100 + MONTH(created_at)) (
PARTITION p202601 VALUES LESS THAN (202602),
PARTITION p202602 VALUES LESS THAN (202603),
PARTITION p202603 VALUES LESS THAN (202604),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
-- Drop old partition (instant, no row-by-row DELETE)
ALTER TABLE booking_logs DROP PARTITION p202601;
-- Add new partition for next month
ALTER TABLE booking_logs REORGANIZE PARTITION p_future INTO (
PARTITION p202604 VALUES LESS THAN (202605),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
-- Verify partition pruning with EXPLAIN
EXPLAIN SELECT * FROM booking_logs
WHERE created_at >= '2026-03-01' AND created_at < '2026-04-01';
-- Should show "partitions: p202603" (only one partition scanned)
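Maintenance jobs that pre-create next month's partition need the same YEAR*100 + MONTH value the RANGE expression uses. A sketch of that arithmetic (helper names are hypothetical):

```javascript
// Same value as the partition expression: YEAR(created_at)*100 + MONTH(created_at).
function partitionValue(date) {
  return date.getUTCFullYear() * 100 + (date.getUTCMonth() + 1);
}

// Boundary for VALUES LESS THAN: the first value of the following month.
function nextBoundary(v) {
  const y = Math.floor(v / 100);
  const m = v % 100;
  return m === 12 ? (y + 1) * 100 + 1 : v + 1; // handle the December rollover
}

console.log(partitionValue(new Date(Date.UTC(2026, 2, 15)))); // 202603 (March)
console.log(nextBoundary(202603)); // 202604
console.log(nextBoundary(202612)); // 202701 (year rollover)
```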
8. Transactions and Isolation Levels
InnoDB supports full ACID transactions. The isolation level determines how concurrent transactions see each other's changes, with direct impact on performance and data consistency.
- READ UNCOMMITTED: Transactions can see uncommitted changes from other transactions (dirty reads). Almost never appropriate in production. Highest concurrency, lowest consistency
- READ COMMITTED: Each SELECT sees only committed data as of the moment the SELECT executes. No dirty reads, but non-repeatable reads are possible. The default in PostgreSQL and Oracle
- REPEATABLE READ (MySQL default): All SELECTs within a transaction see the snapshot taken at the first read. No dirty reads, no non-repeatable reads. Phantom reads are prevented by the MVCC snapshot for plain SELECTs and by next-key (gap) locks for locking reads
- SERIALIZABLE: Fully serialized execution. All SELECTs are implicitly SELECT ... LOCK IN SHARE MODE. Highest consistency but lowest concurrency. Frequent deadlocks under load
- Gap locks (InnoDB): In REPEATABLE READ, InnoDB locks gaps between index records to prevent phantom reads. This can cause unexpected deadlocks on range queries. Switch to READ COMMITTED if gap lock contention is high
- Deadlock handling: InnoDB automatically detects deadlocks and rolls back the smaller transaction. Always handle deadlock errors (error 1213) in application code with a retry loop (3 attempts with backoff)
-- Check current isolation level
SELECT @@transaction_isolation;
-- Set session isolation level
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- Locking reads for inventory/booking systems
START TRANSACTION;
-- Lock the row to prevent concurrent overbooking
SELECT capacity, current_bookings
FROM classes WHERE id = 42 FOR UPDATE;
-- Application checks: if current_bookings < capacity, proceed
UPDATE classes SET current_bookings = current_bookings + 1 WHERE id = 42;
INSERT INTO bookings (user_id, class_id, status) VALUES (7, 42, 'confirmed');
COMMIT;
-- Deadlock retry pattern (application-side pseudocode)
-- for attempt in range(3):
-- try: execute_transaction(); break
-- catch DeadlockError: sleep(0.1 * 2^attempt); continue
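The retry pseudocode above, made concrete in Node.js. It assumes a mysql2-style error object exposing errno (1213 is ER_LOCK_DEADLOCK); withDeadlockRetry is a hypothetical helper name:

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry a transactional function on deadlock, with exponential backoff.
async function withDeadlockRetry(fn, attempts = 3) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      const isDeadlock = err && err.errno === 1213; // ER_LOCK_DEADLOCK
      if (!isDeadlock || i === attempts - 1) throw err;
      await sleep(100 * 2 ** i); // 100ms, then 200ms
    }
  }
}

// Usage: wrap the whole transaction, not individual statements, since
// InnoDB rolls back the losing transaction entirely:
// await withDeadlockRetry(() => runBookingTransaction(pool, userId, classId));
```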
9. Stored Procedures
Stored procedures execute SQL logic on the database server, reducing network round trips. They are useful for complex multi-step operations, data migrations, and enforcing business rules at the database level.
- When to use: Multi-step operations that benefit from reduced round trips (batch processing, data cleanup). Enforcing invariants that must hold regardless of which application writes data
- When to avoid: Business logic that changes frequently (harder to version/deploy than application code). Complex computations better suited for application-layer processing
- Error handling: Use DECLARE HANDLER for exception handling. Always handle SQLEXCEPTION to roll back partial work. Use SIGNAL to raise custom errors
- Security: DEFINER vs INVOKER security context. DEFINER runs with the privileges of the user who created the procedure. Use INVOKER when callers should use their own privileges
- Performance: Stored procedures are compiled on first execution, and the compiled form is cached per connection (not in the query cache, which was removed in MySQL 8.0). First execution may be slower; subsequent calls on the same connection reuse the compiled routine
DELIMITER //
-- Stored procedure for batch archiving old bookings
CREATE PROCEDURE archive_old_bookings(IN cutoff_date DATE, OUT archived_count INT)
SQL SECURITY DEFINER
BEGIN
DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN
ROLLBACK;
RESIGNAL;
END;
START TRANSACTION;
-- Move old completed bookings to archive table
INSERT INTO bookings_archive
SELECT * FROM bookings
WHERE status = 'completed' AND booked_at < cutoff_date;
SET archived_count = ROW_COUNT();
DELETE FROM bookings
WHERE status = 'completed' AND booked_at < cutoff_date;
COMMIT;
END //
DELIMITER ;
-- Call the procedure
CALL archive_old_bookings('2025-01-01', @count);
SELECT @count AS archived_rows;
10. JSON Columns, Window Functions, and CTEs
JSON Columns
MySQL 5.7+ supports native JSON columns with validation, indexing, and a rich set of functions. JSON columns are ideal for semi-structured data that varies per row -- metadata, preferences, feature flags -- without requiring schema changes.
- JSON vs TEXT: JSON columns are validated on INSERT/UPDATE (rejects malformed JSON). Stored in an optimized binary format for faster access. TEXT has no validation and requires parsing on every read
- Accessing values: -> returns JSON, ->> returns unquoted text. Example: metadata->>'$.plan' returns the plan value as a string
- Indexing JSON: Create a generated column and index it, or use functional indexes (8.0+): ALTER TABLE t ADD INDEX ((CAST(data->>'$.type' AS CHAR(30))))
- Useful functions: JSON_EXTRACT, JSON_SET, JSON_REMOVE, JSON_ARRAYAGG, JSON_OBJECTAGG, JSON_TABLE (8.0+ -- converts a JSON array to rows)
-- Table with JSON column for flexible metadata
CREATE TABLE users (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(255) NOT NULL UNIQUE,
metadata JSON DEFAULT NULL,
KEY idx_plan ((CAST(metadata->>'$.plan' AS CHAR(20))))
);
-- Insert with JSON
INSERT INTO users (email, metadata) VALUES
('user@example.com', '{"plan": "pro", "features": ["analytics", "exports"], "trial_ends": "2026-04-01"}');
-- Query JSON fields
SELECT email, metadata->>'$.plan' AS plan
FROM users WHERE metadata->>'$.plan' = 'pro';
-- Update nested JSON value
UPDATE users SET metadata = JSON_SET(metadata, '$.plan', 'enterprise')
WHERE id = 1;
-- JSON_TABLE: expand JSON array into rows (8.0+)
SELECT u.email, f.feature
FROM users u,
JSON_TABLE(u.metadata, '$.features[*]' COLUMNS (feature VARCHAR(50) PATH '$')) AS f;
Window Functions (MySQL 8.0+)
Window functions perform calculations across a set of rows related to the current row, without collapsing them into a single output row like GROUP BY. Essential for rankings, running totals, and comparative analytics.
- ROW_NUMBER(): Assigns a unique sequential number. Useful for pagination, deduplication, and top-N per group queries
- RANK() / DENSE_RANK(): Ranking with ties. RANK skips numbers after ties; DENSE_RANK does not
- LAG() / LEAD(): Access values from previous or next rows. Useful for period-over-period comparisons
- SUM() OVER / AVG() OVER: Running totals and moving averages without self-joins
-- Top 3 most active users per gym (by booking count)
SELECT * FROM (
SELECT
u.name,
g.gym_name,
COUNT(b.id) AS booking_count,
ROW_NUMBER() OVER (PARTITION BY g.id ORDER BY COUNT(b.id) DESC) AS rn
FROM bookings b
JOIN users u ON u.id = b.user_id
JOIN classes c ON c.id = b.class_id
JOIN gyms g ON g.id = c.gym_id
WHERE b.booked_at >= '2026-01-01'
GROUP BY u.id, g.id
) ranked WHERE rn <= 3;
-- Month-over-month booking growth using LAG
SELECT
DATE_FORMAT(booked_at, '%Y-%m') AS month,
COUNT(*) AS bookings,
LAG(COUNT(*)) OVER (ORDER BY DATE_FORMAT(booked_at, '%Y-%m')) AS prev_month,
ROUND((COUNT(*) - LAG(COUNT(*)) OVER (ORDER BY DATE_FORMAT(booked_at, '%Y-%m')))
/ LAG(COUNT(*)) OVER (ORDER BY DATE_FORMAT(booked_at, '%Y-%m')) * 100, 1) AS growth_pct
FROM bookings
GROUP BY DATE_FORMAT(booked_at, '%Y-%m')
ORDER BY month;
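To make the LAG() semantics concrete: each row sees a value from the previous row in the window order, with no self-join. A plain JavaScript replay of the month-over-month calculation (the helper name is illustrative):

```javascript
// Attach the previous row's value, like LAG(col) OVER (ORDER BY ...).
function withLag(rows, key) {
  return rows.map((row, i) => ({ ...row, prev: i > 0 ? rows[i - 1][key] : null }));
}

const monthly = [
  { month: '2026-01', bookings: 100 },
  { month: '2026-02', bookings: 130 }
];
const out = withLag(monthly, 'bookings');
const growth = ((out[1].bookings - out[1].prev) / out[1].prev * 100).toFixed(1);
console.log(out[1].prev, growth); // 100 '30.0'
```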
Common Table Expressions (CTEs)
CTEs (WITH clause, MySQL 8.0+) define temporary named result sets that exist for the duration of a single query. They improve readability, enable recursive queries, and can replace complex subqueries and temporary tables.
- Non-recursive CTEs: Named subqueries that make complex queries readable. MySQL may materialize or merge them depending on the optimizer
- Recursive CTEs: Self-referencing CTEs for hierarchical data (org charts, category trees, graph traversal). Use WITH RECURSIVE
- Performance: In MySQL 8.0, CTEs referenced multiple times may be materialized (computed once). Use EXPLAIN to verify. For single-use CTEs, the optimizer typically merges them into the main query
-- CTE for readable multi-step reporting
WITH monthly_bookings AS (
SELECT
DATE_FORMAT(booked_at, '%Y-%m') AS month,
class_id,
COUNT(*) AS total
FROM bookings
WHERE status = 'completed'
GROUP BY month, class_id
),
class_rankings AS (
SELECT
month, class_id, total,
RANK() OVER (PARTITION BY month ORDER BY total DESC) AS rnk
FROM monthly_bookings
)
SELECT month, class_id, total
FROM class_rankings WHERE rnk <= 5;
-- Recursive CTE: category tree traversal
WITH RECURSIVE category_tree AS (
SELECT id, name, parent_id, 0 AS depth
FROM categories WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.name, c.parent_id, ct.depth + 1
FROM categories c
JOIN category_tree ct ON c.parent_id = ct.id
)
SELECT CONCAT(REPEAT(' ', depth), name) AS tree_view
FROM category_tree ORDER BY depth, name;
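The recursive CTE above behaves like a worklist traversal: seed with the roots, then repeatedly join children onto the working set. A JavaScript sketch of the same logic:

```javascript
// Expand a flat category list into (row, depth) pairs, roots first,
// like WITH RECURSIVE: the seed is parent_id IS NULL; the recursive
// step joins children of rows already in the result.
function categoryTree(categories) {
  const result = categories
    .filter((c) => c.parent_id === null)
    .map((c) => ({ ...c, depth: 0 }));
  for (let i = 0; i < result.length; i++) {
    for (const c of categories) {
      if (c.parent_id === result[i].id) {
        result.push({ ...c, depth: result[i].depth + 1 });
      }
    }
  }
  return result;
}

const tree = categoryTree([
  { id: 1, name: 'Fitness', parent_id: null },
  { id: 2, name: 'Yoga', parent_id: 1 },
  { id: 3, name: 'Hot Yoga', parent_id: 2 }
]);
console.log(tree.map((n) => '  '.repeat(n.depth) + n.name).join('\n'));
```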
11. TypeORM Integration with Node.js
Configuration and Entity Design
TypeORM is the most popular TypeScript ORM for Node.js. It supports Active Record and Data Mapper patterns, migrations, and query builder. Proper configuration is essential for production performance.
// TypeORM data source configuration for Aurora MySQL
import { DataSource } from 'typeorm';
export const AppDataSource = new DataSource({
type: 'mysql',
replication: {
master: {
host: process.env.DB_WRITER_HOST,
port: 3306,
username: process.env.DB_USER,
password: process.env.DB_PASSWORD,
database: 'myapp'
},
slaves: [{
host: process.env.DB_READER_HOST,
port: 3306,
username: process.env.DB_USER,
password: process.env.DB_PASSWORD,
database: 'myapp'
}]
},
entities: ['dist/entities/**/*.js'],
migrations: ['dist/migrations/**/*.js'],
logging: ['error', 'warn', 'migration'], // Never log 'query' in production
maxQueryExecutionTime: 1000, // Log queries taking >1s
extra: {
connectionLimit: 10,
waitForConnections: true,
connectTimeout: 5000
}
});
// Entity example with proper decorators
@Entity('bookings')
@Index('idx_user_status', ['userId', 'status'])
export class Booking {
@PrimaryGeneratedColumn('increment')
id: number;
@Column({ type: 'int', unsigned: true })
userId: number;
@Column({ type: 'enum', enum: BookingStatus, default: BookingStatus.CONFIRMED })
status: BookingStatus;
@Column({ type: 'datetime' })
bookedAt: Date;
@CreateDateColumn()
createdAt: Date;
@ManyToOne(() => User, user => user.bookings)
@JoinColumn({ name: 'user_id' })
user: User;
}
Production Patterns
- Query builder over find: Use QueryBuilder for complex queries. The find() API generates inefficient SQL for joins and conditions
- Transactions: Always use a queryRunner for transactions. Release the runner in a finally block to prevent connection leaks
- Migrations: Never use synchronize: true in production. Generate migrations with typeorm migration:generate and review the SQL before running
- N+1 queries: Use leftJoinAndSelect or the relations option to eager-load. Never access lazy relations in a loop
- Raw queries: For complex reporting or bulk operations, drop down to raw SQL via the query() method. TypeORM's abstractions are not always optimal
// Transaction pattern with proper cleanup
async function createBooking(userId: number, classId: number): Promise<Booking> {
const queryRunner = AppDataSource.createQueryRunner();
await queryRunner.connect();
await queryRunner.startTransaction();
try {
// Check class capacity (SELECT ... FOR UPDATE to lock the row)
const classEntity = await queryRunner.manager
.createQueryBuilder(GymClass, 'c')
.setLock('pessimistic_write')
.where('c.id = :classId', { classId })
.getOne();
if (!classEntity || classEntity.currentBookings >= classEntity.capacity) {
throw new Error('Class is full');
}
// Create booking
const booking = queryRunner.manager.create(Booking, {
userId, classId: classEntity.id, bookedAt: new Date(), status: BookingStatus.CONFIRMED
});
await queryRunner.manager.save(booking);
// Increment counter
await queryRunner.manager.increment(GymClass, { id: classId }, 'currentBookings', 1);
await queryRunner.commitTransaction();
return booking;
} catch (err) {
await queryRunner.rollbackTransaction();
throw err;
} finally {
await queryRunner.release(); // ALWAYS release the connection
}
}
12. Prisma ORM
Prisma is a modern TypeScript ORM with a declarative schema, auto-generated type-safe client, and built-in migration system. It takes a schema-first approach that differs fundamentally from TypeORM's decorator-based model.
- Prisma schema: A single schema.prisma file defines models, relations, and the datasource. The generated client provides full type safety -- impossible to query a column that does not exist
- Prisma Migrate: Schema-driven migrations. Change the schema file, run prisma migrate dev, and Prisma generates the SQL migration. Review before applying to production with prisma migrate deploy
- Connection pooling: Prisma uses its own connection pool. Configure connection_limit in the datasource URL. For serverless environments, use Prisma Accelerate or PgBouncer-style external pooling
- Transactions: Interactive transactions with prisma.$transaction(async (tx) => { ... }). Supports timeout and isolation level configuration. Automatic rollback on error
- Raw queries: prisma.$queryRaw for typed raw SQL and prisma.$executeRaw for mutations. Use tagged template literals for safe parameterization
- TypeORM vs Prisma: TypeORM offers more flexibility (query builder, Active Record). Prisma offers better type safety and developer experience. Choose Prisma for new projects; TypeORM when you need complex dynamic queries
// schema.prisma
datasource db {
  provider = "mysql"
  url      = env("DATABASE_URL") // mysql://user:pass@host:3306/myapp?connection_limit=10
}

generator client {
  provider = "prisma-client-js"
}

model User {
  id        Int        @id @default(autoincrement())
  email     String     @unique @db.VarChar(255)
  name      String     @db.VarChar(100)
  status    UserStatus @default(ACTIVE)
  bookings  Booking[]
  createdAt DateTime   @default(now()) @map("created_at")
  deletedAt DateTime?  @map("deleted_at")

  @@index([status, createdAt])
  @@map("users")
}

model Booking {
  id       Int           @id @default(autoincrement())
  userId   Int           @map("user_id")
  classId  Int           @map("class_id")
  bookedAt DateTime      @map("booked_at")
  status   BookingStatus @default(CONFIRMED)
  user     User          @relation(fields: [userId], references: [id])

  @@index([userId, status])
  @@index([classId, bookedAt])
  @@map("bookings")
}

// Prisma schema syntax requires each enum value on its own line
enum UserStatus {
  ACTIVE
  INACTIVE
  SUSPENDED
}

enum BookingStatus {
  CONFIRMED
  CANCELLED
  COMPLETED
}
// Prisma Client usage with type-safe queries
import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

// Type-safe query -- TypeScript errors if a column doesn't exist
const activeUsers = await prisma.user.findMany({
  where: { status: 'ACTIVE', deletedAt: null },
  include: { bookings: { where: { status: 'CONFIRMED' } } },
  orderBy: { createdAt: 'desc' },
  take: 20
});

// Interactive transaction with timeout
// (assumes classId/userId are in scope and a GymClass model exists in the schema)
const booking = await prisma.$transaction(async (tx) => {
  const gymClass = await tx.gymClass.findUniqueOrThrow({
    where: { id: classId }
  });
  if (gymClass.currentBookings >= gymClass.capacity) {
    throw new Error('Class is full');
  }
  const newBooking = await tx.booking.create({
    data: { userId, classId, bookedAt: new Date(), status: 'CONFIRMED' }
  });
  await tx.gymClass.update({
    where: { id: classId },
    data: { currentBookings: { increment: 1 } }
  });
  return newBooking;
}, { timeout: 5000, isolationLevel: 'Serializable' });
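The bullet list above recommends tagged template literals for `$queryRaw`. A toy `sql` tag (not part of Prisma; purely illustrative) shows why the mechanism is injection-safe: JavaScript delivers the template's literal chunks and its interpolated values as separate arguments, so values can be sent as bound parameters instead of being spliced into the SQL string.

```typescript
// Toy tag function illustrating the mechanism behind prisma.$queryRaw`...`.
// The template's literal chunks and interpolated values arrive separately,
// so user input never touches the SQL text itself.
function sql(strings: TemplateStringsArray, ...values: unknown[]) {
  // Join literal chunks with `?` placeholders; ship values as parameters
  const text = strings.reduce(
    (acc, chunk, i) => acc + chunk + (i < values.length ? '?' : ''),
    ''
  );
  return { text, values };
}

const email = "alice'; DROP TABLE users; --"; // hostile input stays inert
const query = sql`SELECT id, name FROM users WHERE email = ${email} AND deleted_at IS NULL`;

console.log(query.text);   // SELECT id, name FROM users WHERE email = ? AND deleted_at IS NULL
console.log(query.values); // [ "alice'; DROP TABLE users; --" ]
```

Prisma's real implementation is richer (it tracks types and dialect-specific placeholders), but the separation of SQL text from values is the same idea, and it is why string concatenation into `$queryRawUnsafe` is the dangerous variant.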
13. Real-World Experience
In production, I managed Aurora RDS MySQL as the primary database for all services, handling millions of bookings, payments, and user records across multiple Latin American countries.
- Aurora RDS MySQL: Single Aurora cluster with writer + reader endpoints. Serverless v2 for cost-efficient scaling during off-peak hours. 14-day automated backup retention with point-in-time recovery
- Connection pool debugging: Identified and fixed pool exhaustion caused by unreleased TypeORM queryRunners in error paths. Reduced connection-related errors by 95% by adding proper finally blocks and pool monitoring
- Query optimization: Used Performance Insights and slow query log to identify top 20 slowest queries. Added composite indexes and rewrote N+1 patterns, reducing P95 API latency from 800ms to 120ms
- TypeORM migrations: Established migration discipline: generated migrations, code review of SQL, staged rollout (dev > staging > production). Zero data-loss incidents from schema changes
- Read/write splitting: Configured TypeORM replication with writer and reader endpoints. Moved all report queries to reader endpoint, reducing primary CPU by ~30%
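The pool-exhaustion fix described above boils down to one pattern: release the queryRunner in a `finally` block so no error path can leak a connection. A minimal sketch with a stub runner (the real object comes from TypeORM's `dataSource.createQueryRunner()`; the names and interface here are illustrative):

```typescript
// Stub standing in for a TypeORM QueryRunner -- just enough to show
// that `finally` releases the connection even when the work throws.
interface RunnerLike {
  released: boolean;
  query(sql: string): Promise<void>;
  release(): Promise<void>;
}

function makeStubRunner(failOnQuery: boolean): RunnerLike {
  return {
    released: false,
    async query() {
      if (failOnQuery) throw new Error('deadlock');
    },
    async release() {
      this.released = true;
    },
  };
}

// The fix: every exit path -- success, thrown error, early return --
// funnels through `finally`, so the connection always returns to the pool.
async function withRunner(runner: RunnerLike): Promise<boolean> {
  try {
    await runner.query('UPDATE bookings SET status = ?');
    return true;
  } catch {
    return false; // error handled, but release still happens below
  } finally {
    await runner.release();
  }
}

const ok = makeStubRunner(false);
const bad = makeStubRunner(true);
await withRunner(ok);
await withRunner(bad);
console.log(ok.released, bad.released); // true true
```

Pairing this pattern with pool-utilization metrics (active vs. idle connections over time) is what makes leaks visible before they become outages.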
14. MySQL Version Strategy and Modern Features
MySQL Version Strategy
MySQL now follows a dual-track release model. MySQL 8.4 is the Long-Term Support (LTS) release with extended support and stability guarantees. MySQL 9.x is the Innovation track with a 3-month release cadence, introducing new features that eventually land in the next LTS. MySQL 8.0 reaches End of Life in April 2026 — plan your migration now.
- MySQL 8.4 LTS: Recommended for production workloads. Receives bug fixes and security patches for the full support lifecycle. The upgrade path from 8.0 is a straightforward in-place upgrade (since 8.0.16 the server runs the upgrade steps itself; the separate `mysql_upgrade` tool is gone)
- MySQL 9.x Innovation: 3-month release cadence. New features like the VECTOR type, JavaScript stored programs, and optimizer enhancements. Requires upgrading to each successive Innovation release. Best for development, testing, and non-critical workloads
- MySQL 8.0 EOL (April 2026): No bug fixes or security patches after April 2026. Remaining on 8.0 after EOL exposes your database to unpatched vulnerabilities. Migrate to 8.4 LTS for stability or 9.x for the latest features
Before upgrading, run `mysqlcheck --check-upgrade`, review deprecated features you depend on, and validate your ORM compatibility. Aurora MySQL 8.0-compatible clusters will also require migration to Aurora MySQL 8.4-compatible.
VECTOR Data Type (MySQL 9.0+)
MySQL 9.0 introduces a native VECTOR data type for storing AI/ML embeddings. Vectors are stored as arrays of 4-byte floats, optimized for similarity search operations. This is essential for building semantic search, recommendation engines, and Retrieval-Augmented Generation (RAG) pipelines directly in MySQL without requiring a separate vector database.
-- VECTOR data type (MySQL 9.0+)
CREATE TABLE documents (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(500) NOT NULL,
content TEXT NOT NULL,
embedding VECTOR(1536) NOT NULL, -- OpenAI ada-002 dimension
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
INDEX idx_created (created_at)
);
-- Insert with vector literal
INSERT INTO documents (title, content, embedding)
VALUES ('Node.js Guide', 'Event loop internals...', TO_VECTOR('[0.0123, -0.0456, ...]'));
-- Cosine similarity search for RAG
SELECT id, title,
1 - DISTANCE(embedding, TO_VECTOR(?), 'COSINE') AS similarity
FROM documents
ORDER BY DISTANCE(embedding, TO_VECTOR(?), 'COSINE')
LIMIT 10;
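To build intuition for what the `'COSINE'` metric above computes, here is a plain TypeScript version of cosine distance over float arrays. This is an illustrative reimplementation of the standard formula (1 − cos θ), not MySQL's internal code:

```typescript
// Cosine distance between two embeddings: 1 - (a.b / (|a| * |b|)).
// 0 = same direction, 1 = orthogonal, 2 = opposite direction.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Ranking documents by ascending distance, as the ORDER BY above does
const queryVec = [0.2, 0.1, 0.9];
const docs = [
  { id: 1, embedding: [0.2, 0.1, 0.9] },    // same direction -> distance ~0
  { id: 2, embedding: [-0.2, -0.1, -0.9] }, // opposite -> distance ~2
];
docs.sort(
  (x, y) => cosineDistance(x.embedding, queryVec) - cosineDistance(y.embedding, queryVec)
);
console.log(docs.map(d => d.id)); // [ 1, 2 ]
```

Note that cosine distance ignores vector magnitude and compares only direction, which is why it is the default choice for text embeddings.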
JavaScript Stored Programs (MySQL 9.0+, Enterprise)
MySQL 9.0 Enterprise introduces the ability to write stored procedures and functions in JavaScript using a GraalVM-based ES2023 runtime. This allows developers to use familiar JavaScript syntax, modern language features, and npm-style module patterns for server-side database logic, while maintaining the performance benefits of executing code close to the data.
-- JavaScript stored function (MySQL 9.0+ Enterprise)
CREATE FUNCTION calculate_bmi(weight_kg DOUBLE, height_m DOUBLE)
RETURNS DOUBLE
LANGUAGE JAVASCRIPT AS $$
if (height_m <= 0) throw new Error('Height must be positive');
const bmi = weight_kg / (height_m * height_m);
return Math.round(bmi * 10) / 10;
$$;
-- Use like any SQL function
SELECT name, calculate_bmi(weight_kg, height_m) AS bmi
FROM users
WHERE calculate_bmi(weight_kg, height_m) > 25;
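Because the stored function body above is ordinary JavaScript, the same logic can be unit-tested in Node before deploying it to the server; the standalone wrapper below is illustrative:

```typescript
// Same body as the calculate_bmi stored function above, runnable in Node.
// Testing the logic locally first avoids debugging inside the database.
function calculateBmi(weightKg: number, heightM: number): number {
  if (heightM <= 0) throw new Error('Height must be positive');
  const bmi = weightKg / (heightM * heightM);
  return Math.round(bmi * 10) / 10; // round to one decimal place
}

console.log(calculateBmi(80, 1.8)); // 24.7
```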
MySQL 9.7.0 LTS and HeatWave AI (April 2026)
MySQL 9.7.0 LTS was released on April 21, 2026, making it the new Long-Term Support release on the 9.x track. Key additions include time-based audit log rotation (rotate logs by interval rather than file size), the replica_allow_higher_version_source variable that enables replicas to accept replication from higher-version sources during rolling upgrades, and expanded window function optimizations. MySQL 9.6.0 (January 2026) introduced the modular Audit Log component that splits the monolithic audit log into smaller, more manageable pieces. MySQL 8.0 has now reached End of Life as of April 2026 — all 8.0 users must migrate to 8.4 LTS or the 9.x track.
HeatWave AI integration in MySQL 9.4+ adds Natural Language to Machine Learning (NL2ML), enabling an intuitive natural-language interface to ML workflows. Enhanced AutoML memory management automatically checks usage and releases resources based on severity levels. AutoML now also records hyperparameters of trained models, improving transparency and reproducibility. These features are available in HeatWave on OCI, including the Always Free tier.