Database performance is often the bottleneck in web applications. Over the past year, we’ve transformed our database layer from a source of constant performance issues into a highly optimized system that consistently delivers low, predictable response times.
The Performance Crisis
Our journey began with a crisis. Our main application database was experiencing:
- Query timeouts: 15% of queries taking over 30 seconds
- Connection pool exhaustion: Regular 503 errors during peak traffic
- Cascading failures: Slow queries blocking other operations
- Poor user experience: Page load times exceeding 5 seconds
The root cause analysis revealed multiple issues across our database architecture.
Systematic Performance Analysis
1. Query Analysis and Profiling
The first step was understanding where time was being spent:
```sql
-- Enable query logging for analysis (ALTER SYSTEM changes need a reload to take effect)
ALTER SYSTEM SET log_statement = 'all';
ALTER SYSTEM SET log_duration = on;
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries > 1s
SELECT pg_reload_conf();

-- Analyze slow queries (requires the pg_stat_statements extension,
-- which must also be listed in shared_preload_libraries)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT
    query,
    calls,
    total_time,  -- total_exec_time on PostgreSQL 13+
    mean_time,   -- mean_exec_time on PostgreSQL 13+
    rows
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 10;
```
This revealed our top performance killers:
```sql
-- Problematic query #1: N+1 query pattern
SELECT * FROM users WHERE id = $1; -- Called 1000+ times per request

-- Problematic query #2: Missing index
SELECT * FROM orders
WHERE created_at BETWEEN $1 AND $2
  AND status = 'pending'
ORDER BY created_at DESC; -- 45 second execution time
```
2. Index Strategy Overhaul
Strategic indexing provided immediate improvements:
```sql
-- Composite index for filtering and sorting
CREATE INDEX CONCURRENTLY idx_orders_status_created
ON orders (status, created_at DESC)
WHERE status IN ('pending', 'processing');

-- Partial index for active users
CREATE INDEX CONCURRENTLY idx_users_active_email
ON users (email)
WHERE deleted_at IS NULL;

-- Expression index for case-insensitive searches
CREATE INDEX CONCURRENTLY idx_users_lower_email
ON users (LOWER(email));
```
Results:
- Query time reduced from 45s to 12ms (99.97% improvement)
- Index size kept minimal with partial indexes
- No measurable impact on write performance
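To confirm that the planner is actually using the new composite index (rather than the improvement coming from elsewhere), a quick plan check helps. Here is a sketch with EXPLAIN ANALYZE, using illustrative parameter values:

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE created_at BETWEEN '2024-01-01' AND '2024-02-01'
  AND status = 'pending'
ORDER BY created_at DESC;
-- The plan should show an index scan on idx_orders_status_created
-- instead of a sequential scan followed by an explicit sort.
```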
3. Query Optimization Techniques
Eliminating N+1 Queries
```go
// Before: N+1 queries
func GetUsersWithOrders(userIDs []string) ([]*User, error) {
	users := make([]*User, 0, len(userIDs))

	for _, id := range userIDs {
		user, err := db.GetUser(id) // 1 query per user
		if err != nil {
			return nil, err
		}

		orders, err := db.GetOrdersByUserID(id) // 1 query per user
		if err != nil {
			return nil, err
		}

		user.Orders = orders
		users = append(users, user)
	}

	return users, nil
}

// After: 2 queries total
func GetUsersWithOrders(userIDs []string) ([]*User, error) {
	// Single query to get all users
	users, err := db.GetUsersByIDs(userIDs)
	if err != nil {
		return nil, err
	}

	// Single query to get all orders
	orders, err := db.GetOrdersByUserIDs(userIDs)
	if err != nil {
		return nil, err
	}

	// Group orders by user ID in memory
	orderMap := groupOrdersByUserID(orders)

	for _, user := range users {
		user.Orders = orderMap[user.ID]
	}

	return users, nil
}
```
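The GetUsersByIDs and GetOrdersByUserIDs helpers aren't shown above; conceptually each collapses into a single set-based query. A sketch of the underlying SQL, assuming PostgreSQL array parameters and an orders.user_id foreign key:

```sql
-- All requested users in one round trip
SELECT * FROM users WHERE id = ANY($1);

-- All of their orders in one round trip, grouped in application code afterwards
SELECT * FROM orders WHERE user_id = ANY($1);
```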
Query Restructuring
```sql
-- Before: Inefficient subquery
SELECT * FROM products p
WHERE p.category_id IN (
    SELECT c.id FROM categories c
    WHERE c.name LIKE '%electronics%'
);

-- After: Efficient join
SELECT p.*
FROM products p
INNER JOIN categories c ON p.category_id = c.id
WHERE c.name LIKE '%electronics%';
```
Caching Architecture
Multi-Level Caching Strategy
```go
type CacheHierarchy struct {
	l1 *LocalCache // Application-level cache
	l2 *RedisCache // Distributed cache
	l3 *Database   // Source of truth
}

func (ch *CacheHierarchy) GetUser(id string) (*User, error) {
	// L1: Check local cache
	if user, found := ch.l1.Get("user:" + id); found {
		return user.(*User), nil
	}

	// L2: Check Redis; fall through to the database on a corrupt entry
	if userData, err := ch.l2.Get("user:" + id); err == nil {
		user := &User{}
		if err := json.Unmarshal(userData, user); err == nil {
			// Populate L1 cache
			ch.l1.Set("user:"+id, user, time.Minute*5)
			return user, nil
		}
	}

	// L3: Query database
	user, err := ch.l3.GetUser(id)
	if err != nil {
		return nil, err
	}

	// Populate both cache levels
	if userData, err := json.Marshal(user); err == nil {
		ch.l2.Set("user:"+id, userData, time.Hour)
	}
	ch.l1.Set("user:"+id, user, time.Minute*5)

	return user, nil
}
```
Cache Invalidation Strategy
```go
type CacheInvalidator struct {
	redis    *redis.Client
	local    cache.Cache
	patterns map[string][]string
}

func (ci *CacheInvalidator) InvalidateUser(userID string) error {
	keys := []string{
		"user:" + userID,
		"user_orders:" + userID,
		"user_preferences:" + userID,
	}

	// Invalidate Redis cache
	for _, key := range keys {
		if err := ci.redis.Del(key).Err(); err != nil {
			return err
		}
	}

	// Invalidate local cache
	for _, key := range keys {
		ci.local.Delete(key)
	}

	return nil
}
```
Connection Pool Optimization
Database Connection Management
```go
type DBConfig struct {
	MaxOpenConns    int
	MaxIdleConns    int
	ConnMaxLifetime time.Duration
	ConnMaxIdleTime time.Duration
}

func optimizeConnectionPool(db *sql.DB) *sql.DB {
	config := &DBConfig{
		MaxOpenConns:    50, // Based on server capacity
		MaxIdleConns:    25, // Half of max open
		ConnMaxLifetime: 30 * time.Minute,
		ConnMaxIdleTime: 10 * time.Minute,
	}

	db.SetMaxOpenConns(config.MaxOpenConns)
	db.SetMaxIdleConns(config.MaxIdleConns)
	db.SetConnMaxLifetime(config.ConnMaxLifetime)
	db.SetConnMaxIdleTime(config.ConnMaxIdleTime)

	return db
}
```
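When choosing MaxOpenConns, it's worth checking the server-side limits first: the sum of open connections across every application instance has to stay below what the server allows. On PostgreSQL, a quick check:

```sql
-- Server-wide connection limit and the slots reserved for superusers
SHOW max_connections;
SHOW superuser_reserved_connections;
```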
Connection Pool Monitoring
```go
func monitorConnectionPool(db *sql.DB) {
	ticker := time.NewTicker(time.Minute)
	defer ticker.Stop()

	var lastWaitCount int64

	for range ticker.C {
		stats := db.Stats()

		metrics.Gauge("db.connections.open").Set(float64(stats.OpenConnections))
		metrics.Gauge("db.connections.idle").Set(float64(stats.Idle))
		metrics.Gauge("db.connections.in_use").Set(float64(stats.InUse))

		// WaitCount is cumulative, so report only the delta since the last tick
		metrics.Counter("db.connections.wait_count").Add(float64(stats.WaitCount - lastWaitCount))
		lastWaitCount = stats.WaitCount

		if stats.WaitCount > 0 {
			log.Warn("Database connection pool under pressure",
				"wait_count", stats.WaitCount,
				"wait_duration", stats.WaitDuration)
		}
	}
}
```
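The pool statistics above describe the client side; it's also useful to cross-check what the server sees. On PostgreSQL, pg_stat_activity breaks connections down by state, for example:

```sql
-- Server-side view of connections by state (active, idle, idle in transaction, ...)
SELECT state, count(*)
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY state
ORDER BY count(*) DESC;
```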
Read Replica Strategy
Load Balancing Reads and Writes
```go
type DatabaseCluster struct {
	master   *sql.DB
	replicas []*sql.DB
	selector ReplicaSelector
}

type QueryContext struct {
	Type     QueryType
	UserID   string
	Priority Priority
}

func (dc *DatabaseCluster) Query(ctx QueryContext, query string, args ...interface{}) (*sql.Rows, error) {
	switch ctx.Type {
	case QueryTypeWrite, QueryTypeTransactional:
		return dc.master.Query(query, args...)

	case QueryTypeRead:
		replica := dc.selector.SelectReplica(ctx)
		return replica.Query(query, args...)

	default:
		return nil, errors.New("unknown query type")
	}
}

// Weighted round-robin replica selection
type WeightedRoundRobin struct {
	replicas []ReplicaWithWeight
	current  int
	mutex    sync.Mutex
}

func (wrr *WeightedRoundRobin) SelectReplica(ctx QueryContext) *sql.DB {
	wrr.mutex.Lock()
	defer wrr.mutex.Unlock()

	// High priority queries go to the least loaded replica
	if ctx.Priority == PriorityHigh {
		return wrr.selectLeastLoaded()
	}

	// Standard round-robin for normal queries
	// (weight handling is elided here; each replica gets an equal share)
	replica := wrr.replicas[wrr.current%len(wrr.replicas)]
	wrr.current++
	return replica.DB
}
```
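Routing reads to replicas also means accepting some replication lag, so lag is worth monitoring alongside the selector. On a PostgreSQL streaming replica, a rough check looks like this (the value reads high when the primary has been idle):

```sql
-- Run on a replica: how far behind the primary is this node?
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
```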
Query Result Optimization
Pagination Best Practices
```sql
-- Inefficient: OFFSET becomes slow with large offsets
SELECT * FROM posts
ORDER BY created_at DESC
LIMIT 20 OFFSET 10000;

-- Efficient: Cursor-based pagination
SELECT * FROM posts
WHERE created_at < $1 -- cursor from previous page
ORDER BY created_at DESC
LIMIT 20;
```
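One caveat with a created_at-only cursor: if several posts share a timestamp, rows can be skipped or duplicated at page boundaries. A common refinement is a composite cursor over (created_at, id), sketched here:

```sql
-- The cursor carries both the timestamp and the id of the last row on the previous page
SELECT * FROM posts
WHERE (created_at, id) < ($1, $2)
ORDER BY created_at DESC, id DESC
LIMIT 20;
```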
```go
type CursorPagination struct {
	Cursor string `json:"cursor"`
	Limit  int    `json:"limit"`
}

func (p *PostService) GetPosts(pagination CursorPagination) (*PostsResponse, error) {
	cursor := time.Now()

	if pagination.Cursor != "" {
		parsed, err := time.Parse(time.RFC3339, pagination.Cursor)
		if err != nil {
			return nil, err
		}
		cursor = parsed
	}

	query := `
        SELECT id, title, content, created_at
        FROM posts
        WHERE created_at < $1
        ORDER BY created_at DESC
        LIMIT $2`

	// Fetch one extra row to detect whether another page exists
	rows, err := p.db.Query(query, cursor, pagination.Limit+1)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	posts := make([]*Post, 0, pagination.Limit)
	hasMore := false

	for rows.Next() {
		post := &Post{}
		if err := rows.Scan(&post.ID, &post.Title, &post.Content, &post.CreatedAt); err != nil {
			return nil, err
		}

		if len(posts) == pagination.Limit {
			hasMore = true
			break
		}

		posts = append(posts, post)
	}
	if err := rows.Err(); err != nil {
		return nil, err
	}

	// The next cursor is the created_at of the last post actually returned,
	// so the following page continues immediately after it
	var nextCursor string
	if hasMore && len(posts) > 0 {
		nextCursor = posts[len(posts)-1].CreatedAt.Format(time.RFC3339)
	}

	return &PostsResponse{
		Posts:      posts,
		NextCursor: nextCursor,
	}, nil
}
```
Selective Field Loading
```go
// Don't load unnecessary data
type UserSummary struct {
	ID    string `json:"id"`
	Name  string `json:"name"`
	Email string `json:"email"`
}

func (us *UserService) GetUserSummaries(ids []string) ([]*UserSummary, error) {
	query := `
        SELECT id, name, email
        FROM users
        WHERE id = ANY($1)
        AND deleted_at IS NULL`

	rows, err := us.db.Query(query, pq.Array(ids))
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	summaries := make([]*UserSummary, 0, len(ids))
	for rows.Next() {
		summary := &UserSummary{}
		if err := rows.Scan(&summary.ID, &summary.Name, &summary.Email); err != nil {
			return nil, err
		}
		summaries = append(summaries, summary)
	}

	return summaries, rows.Err()
}
```
Monitoring and Alerting
Key Metrics Dashboard
```go
type DatabaseMetrics struct {
	QueryLatencyP50       time.Duration
	QueryLatencyP95       time.Duration
	QueryLatencyP99       time.Duration
	QueriesPerSecond      float64
	ErrorRate             float64
	CacheHitRate          float64
	ConnectionUtilization float64
}

func collectMetrics(db *sql.DB, cache cache.Cache) {
	ticker := time.NewTicker(time.Second * 10)
	defer ticker.Stop()

	for range ticker.C {
		stats := db.Stats()

		// Connection metrics
		metrics.Gauge("db.connections.utilization").Set(
			float64(stats.InUse) / float64(stats.MaxOpenConnections))

		// Query performance metrics
		queryStats := getQueryStats()
		metrics.Histogram("db.query.latency").Observe(queryStats.MeanLatency)
		metrics.Counter("db.queries.total").Add(queryStats.Count)

		// Cache metrics (guard against division by zero on an empty cache)
		cacheStats := cache.Stats()
		if total := cacheStats.Hits + cacheStats.Misses; total > 0 {
			metrics.Gauge("cache.hit_rate").Set(float64(cacheStats.Hits) / float64(total))
		}
	}
}
```
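Alongside the application cache hit rate, the database's own buffer cache hit ratio is a useful dashboard metric; on PostgreSQL it can be derived from pg_stat_database, for example:

```sql
-- Share of block reads served from shared buffers for the current database
SELECT round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 4) AS buffer_cache_hit_ratio
FROM pg_stat_database
WHERE datname = current_database();
```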
Automated Performance Alerting
```yaml
# Prometheus alerting rules
groups:
  - name: database_performance
    rules:
      - alert: HighQueryLatency
        expr: histogram_quantile(0.95, sum(rate(db_query_duration_seconds_bucket[5m])) by (le)) > 1
        for: 2m
        annotations:
          summary: "Database query latency is high"

      - alert: LowCacheHitRate
        expr: cache_hit_rate < 0.8
        for: 5m
        annotations:
          summary: "Cache hit rate is below optimal threshold"

      - alert: DatabaseConnectionsHigh
        expr: db_connections_utilization > 0.8
        for: 1m
        annotations:
          summary: "Database connection pool utilization is high"
```
Results and Impact
Our systematic approach to database optimization yielded significant improvements:
Performance Improvements
- Query latency: 95th percentile reduced from 15s to 50ms
- Cache hit rate: Increased from 60% to 95%
- Connection pool efficiency: Eliminated connection timeouts
- Error rate: Reduced database errors by 99.8%
Business Impact
- Page load times: Improved from 5s to under 500ms
- User satisfaction: 40% increase in user engagement metrics
- Infrastructure costs: 30% reduction in database server requirements
- Developer productivity: Faster development cycles with reliable performance
Key Takeaways
- Measure first: Use profiling tools to identify actual bottlenecks
- Index strategically: Create indexes that support your most common query patterns
- Cache intelligently: Implement multi-level caching with proper invalidation
- Optimize connections: Right-size connection pools and monitor utilization
- Read replicas: Distribute read load to improve overall performance
- Monitor continuously: Set up comprehensive metrics and alerting
Database performance optimization is an iterative process. Start with the biggest impact changes (usually indexing and caching), then progressively optimize based on monitoring data and changing usage patterns.
The investment in database performance pays dividends not just in user experience, but also in system reliability, development velocity, and infrastructure costs.