
Database Design Patterns for Scalable Applications

March 27, 2026 · 10 min read

The database is usually the first bottleneck you hit as an application scales. CPU and memory are easy to add; restructuring a database schema under load is not. The design decisions you make early (normalisation level, indexing strategy, read/write separation) determine how far your application can grow before requiring fundamental rearchitecture.

This article covers the database design patterns that we apply at Pepla across our client projects, from startups handling their first thousand users to enterprise systems processing millions of transactions daily.

Normalisation vs Denormalisation

Database normalisation is the process of organising data to reduce redundancy and improve integrity. The normal forms (1NF through 5NF) provide a systematic framework for this. In practice, most applications target Third Normal Form (3NF), which eliminates transitive dependencies while remaining practical to work with.

Normalise for transactional workloads, denormalise for read-heavy analytics -- the right default depends on access patterns.


When to Normalise

Normalisation is the right default for transactional (OLTP) workloads. When data is updated frequently, redundancy creates consistency problems. If a customer's address is stored in both the customers table and the orders table, updating it requires touching both locations. Miss one and you have inconsistent data.

Normalised schemas are also more flexible. Adding a new relationship or changing a business rule typically touches fewer tables than it would in a denormalised design, where the same fact may be duplicated in several places.
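A minimal sketch of this in SQLite (table and column names are illustrative): the customer's address lives only in the customers table, so an update touches one row and every order sees the change through the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        address TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'Ada', '1 Old Street')")
conn.execute("INSERT INTO orders VALUES (10, 1, 99.50)")

# Updating the address touches exactly one row in one table...
conn.execute("UPDATE customers SET address = '2 New Street' WHERE id = 1")

# ...and every order immediately reflects it via the join.
address = conn.execute("""
    SELECT c.address
    FROM orders o JOIN customers c ON c.id = o.customer_id
    WHERE o.id = 10
""").fetchone()[0]
```

Had the address also been copied into orders, the update would have needed to find and change every copy.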

When to Denormalise

Denormalisation trades storage space and write complexity for read performance. When a query requires joining six tables to produce a result, a denormalised view or materialised table that pre-computes the join can reduce query time from seconds to milliseconds.

Common denormalisation patterns:

- Materialised views that pre-compute expensive joins or aggregations
- Summary (rollup) tables maintained alongside the transactional tables
- Duplicated columns that avoid a join on a hot read path
- Pre-computed counts (a counter column instead of a COUNT(*) on every read)

Normalise your transactional data. Denormalise your read paths. This is not a contradiction; it is a recognition that reads and writes have fundamentally different performance characteristics.
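The trade can be sketched with a summary table in SQLite (names are illustrative): the write path maintains the rollup in the same transaction as the order insert, so the read path becomes a single primary-key lookup instead of an aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        total       REAL NOT NULL
    );
    -- Denormalised read path: per-customer totals, pre-aggregated
    CREATE TABLE customer_order_summary (
        customer_id INTEGER PRIMARY KEY,
        order_count INTEGER NOT NULL,
        total_spent REAL NOT NULL
    );
""")

def place_order(order_id, customer_id, total):
    # Write path: insert the normalised row AND maintain the summary
    # in one transaction, trading write complexity for read speed.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?, ?)",
                     (order_id, customer_id, total))
        conn.execute("""
            INSERT INTO customer_order_summary VALUES (?, 1, ?)
            ON CONFLICT(customer_id) DO UPDATE SET
                order_count = order_count + 1,
                total_spent = total_spent + excluded.total_spent
        """, (customer_id, total))

place_order(1, 7, 40.0)
place_order(2, 7, 60.0)

# Read path: one primary-key lookup, no join, no aggregation.
count, spent = conn.execute("""
    SELECT order_count, total_spent
    FROM customer_order_summary WHERE customer_id = 7
""").fetchone()
```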

Indexing Strategies

Indexes are the single most impactful performance optimisation in relational databases. A missing index can turn a millisecond query into a minute-long table scan. But indexes are not free: they consume storage, slow down writes, and require maintenance.

Primary Key Indexes

Every table needs a primary key, and that key should be indexed (most databases do this automatically). For surrogate keys, use auto-incrementing integers or UUIDs. The choice matters: auto-incrementing integers are smaller and produce sequential inserts that are friendly to B-tree indexes. UUIDs (specifically UUIDv7, which is time-ordered) are better for distributed systems where a central sequence generator is impractical.
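For illustration, a UUIDv7 can be built by hand from the RFC 9562 layout: a 48-bit Unix millisecond timestamp followed by version, variant, and random bits (newer Python versions ship `uuid.uuid7`; this sketch is for older ones).

```python
import os
import time
import uuid

def uuid7() -> uuid.UUID:
    # Sketch of a time-ordered UUIDv7: 48-bit Unix-millisecond
    # timestamp, then version/variant bits, then random bits.
    ms = time.time_ns() // 1_000_000
    b = bytearray(ms.to_bytes(6, "big") + os.urandom(10))
    b[6] = (b[6] & 0x0F) | 0x70   # version 7
    b[8] = (b[8] & 0x3F) | 0x80   # RFC variant
    return uuid.UUID(bytes=bytes(b))

first = uuid7()
time.sleep(0.002)
second = uuid7()

# IDs generated later sort later, so B-tree inserts land near the
# right-hand edge of the index rather than at random pages.
ordered = first < second
```

Random UUIDv4 keys, by contrast, scatter inserts across the whole index, causing page splits and cache misses at scale.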

Composite Indexes

Indexes on multiple columns serve queries that filter or sort on those columns. Column order matters: the index on (status, created_at) efficiently serves queries filtering by status and then sorting by date, but not queries filtering only by date. Design composite indexes to match your most common query patterns: place columns used in equality filters first, followed by the column used for range filters or sorting.
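The (status, created_at) example can be demonstrated with SQLite's EXPLAIN QUERY PLAN (a sketch; table and index names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, status TEXT, created_at TEXT
    )
""")
conn.execute(
    "CREATE INDEX idx_orders_status_created ON orders(status, created_at)")

def plan(sql):
    # Each plan row's fourth field is the human-readable detail string.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Filter on the leading column + sort on the second: the index serves both.
served = plan(
    "SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at")

# Filter only on the trailing column: the index cannot be used to search.
not_served = plan("SELECT * FROM orders WHERE created_at > '2026-01-01'")
```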

Covering Indexes

A covering index includes all columns needed by a query, allowing the database to satisfy the query entirely from the index without reading the table data. This is particularly powerful for frequently executed queries that read only a few columns from wide tables.
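SQLite makes the effect visible in its query plan, which reports "COVERING INDEX" when the table is never touched (a sketch; names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY, email TEXT, name TEXT, bio TEXT
    )
""")
# The index contains every column the query below needs (email, name),
# so the query is answered from the index alone.
conn.execute("CREATE INDEX idx_users_email_name ON users(email, name)")

detail = " ".join(row[3] for row in conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT name FROM users WHERE email = 'a@example.com'"))
```

PostgreSQL offers the same idea explicitly via `CREATE INDEX ... INCLUDE (...)`, which adds non-key columns to the index leaf pages.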

Partial Indexes

Index only the rows that matter. An index on orders WHERE status = 'pending' is smaller and faster than an index on all orders, and it perfectly serves the most common query pattern (finding pending orders). PostgreSQL supports partial indexes natively.
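SQLite supports the same syntax, which makes the pattern easy to sketch (names are illustrative; the PostgreSQL statement is identical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, status TEXT, created_at TEXT
    )
""")
# Partial index: only pending rows are indexed, so the index stays
# small even as the table accumulates millions of completed orders.
conn.execute("""
    CREATE INDEX idx_pending_created ON orders(created_at)
    WHERE status = 'pending'
""")

# The planner uses the partial index because the query's WHERE clause
# implies the index's WHERE clause.
detail = " ".join(row[3] for row in conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT * FROM orders
    WHERE status = 'pending' AND created_at > '2026-01-01'
"""))
```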

Index Maintenance

Regularly review your indexes. Unused indexes waste storage and slow writes. Most databases provide usage statistics for indexes. At Pepla, we schedule quarterly index reviews for all client databases, dropping unused indexes and adding indexes for new query patterns identified through slow query logs.

Index only what your queries need. Every unnecessary index slows writes for no read benefit.

Read Replicas

Most applications read far more than they write, often at a 90:10 or 95:5 ratio. Read replicas exploit this asymmetry by replicating data from a primary (write) database to one or more secondary (read) databases.
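A minimal sketch of application-level read/write splitting (class and table names are illustrative; real setups usually route through the driver or a proxy). Using two separate in-memory databases also makes replication lag visible: the "replica" has not yet received the write.

```python
import random
import sqlite3

class RoutingConnection:
    """Sketch: writes go to the primary, reads are load-balanced
    across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas

    def execute(self, sql, params=()):
        is_read = sql.lstrip().upper().startswith("SELECT")
        conn = random.choice(self.replicas) if is_read else self.primary
        return conn.execute(sql, params)

primary = sqlite3.connect(":memory:")
replica = sqlite3.connect(":memory:")
for c in (primary, replica):
    c.execute("CREATE TABLE t (v INTEGER)")

db = RoutingConnection(primary, [replica])
db.execute("INSERT INTO t VALUES (1)")            # routed to the primary
rows_on_replica = db.execute("SELECT v FROM t").fetchall()  # routed to replica
rows_on_primary = primary.execute("SELECT v FROM t").fetchall()
# The replica is empty: nothing has replicated it yet. This is the
# stale-read window that replication lag creates in production.
```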


Implementation considerations:

- Replication lag: replicas are eventually consistent, so a read immediately after a write may return stale data
- Read-your-writes: route a user's reads to the primary for a short window after that user writes
- Routing: split read and write traffic in the application layer or via a proxy, and plan for replica failover

Connection Pooling

Database connections are expensive resources. Establishing a connection involves TCP handshake, TLS negotiation, and authentication. Maintaining idle connections consumes memory on the database server. A web application with 100 instances, each opening 10 connections, consumes 1,000 database connections, which can exceed the database's connection limit.

Connection pooling maintains a pool of reusable connections, lending them to application threads on demand and returning them when the work is done.

Configure pool sizes carefully. Too small and your application queues waiting for connections. Too large and you overwhelm the database. A good starting point is 2-3 connections per CPU core on the database server, divided across all application instances.
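The mechanism can be sketched in a few lines with a thread-safe queue (production code should use the driver's own pool, e.g. psycopg_pool or HikariCP, rather than this):

```python
import queue
import sqlite3

class ConnectionPool:
    """Minimal fixed-size pool sketch: connections are created once
    and lent out on demand."""

    def __init__(self, factory, size):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())

    def acquire(self, timeout=5.0):
        # Blocks (bounded by timeout) instead of opening a new
        # connection, capping total connections at the pool size.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

    def idle(self):
        return self._pool.qsize()

pool = ConnectionPool(
    lambda: sqlite3.connect(":memory:", check_same_thread=False), size=3)

conn = pool.acquire()
result = conn.execute("SELECT 1").fetchone()[0]
pool.release(conn)
available = pool.idle()
```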

Connection pooling prevents database connection storms that bring down production under load.

Query Optimisation

Before optimising, measure. The database's query execution plan (available via EXPLAIN ANALYZE in PostgreSQL or the execution plan viewer in SQL Server Management Studio) tells you exactly what the database is doing and where time is spent.

Common Optimisation Patterns

The most impactful query optimisation is usually not rewriting the query. It is adding the right index. Before spending an hour rewriting SQL, spend five minutes checking whether the query has appropriate index support.
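A before/after sketch using SQLite's EXPLAIN QUERY PLAN (standing in for PostgreSQL's EXPLAIN ANALYZE; names are illustrative) shows the plan flip from a full scan to an index search once the right index exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT
    )
""")

def plan(sql):
    return " ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT payload FROM events WHERE user_id = 42"

before = plan(query)   # full table scan: every row inspected
conn.execute("CREATE INDEX idx_events_user ON events(user_id)")
after = plan(query)    # index search: direct lookup
```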

Caching Layers

Caching reduces database load by storing frequently accessed data in a faster medium (typically in-memory). The standard caching architecture uses Redis or Memcached as a distributed cache between the application and the database.

Cache Strategies

The most common pattern is cache-aside (lazy loading): the application checks the cache first, falls back to the database on a miss, and populates the cache with the result. Write-through and write-behind variants update the cache as part of the write path instead.

Cache invalidation is the hard part. When the underlying data changes, the cache must be updated or invalidated. Time-based expiry (TTL) is the simplest approach. Event-driven invalidation (invalidate the cache when a relevant write occurs) is more accurate but more complex.
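Cache-aside with TTL expiry can be sketched with a plain dictionary standing in for Redis (class and function names are illustrative):

```python
import time

class TTLCache:
    """Sketch of cache-aside with time-based expiry. In production
    the store would be Redis; a dict stands in for it here."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}   # key -> (value, expires_at)

    def get_or_load(self, key, loader):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry and entry[1] > now:
            return entry[0]                 # hit: serve from cache
        value = loader(key)                 # miss: fall through to the DB
        self._store[key] = (value, now + self.ttl)
        return value

    def invalidate(self, key):
        # Event-driven invalidation: call this when a relevant write occurs.
        self._store.pop(key, None)

calls = []
def load_user(key):
    # Stand-in for the database query; counts how often it runs.
    calls.append(key)
    return {"id": key, "name": "Ada"}

cache = TTLCache(ttl_seconds=60)
cache.get_or_load(1, load_user)   # miss: one database hit
cache.get_or_load(1, load_user)   # hit: served from cache
db_hits = len(calls)
```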

When to Use NoSQL

NoSQL databases are not a replacement for relational databases. They are a complement, optimised for specific access patterns that relational databases handle poorly.

The key question is: "What are my primary access patterns?" If you primarily query by primary key or a simple set of attributes, a NoSQL database may be a better fit. If you need complex queries, aggregations, and transactions across multiple entities, a relational database is almost certainly the right choice.

Use the expand-and-contract pattern for migrations -- never make breaking schema changes in a single step.

Migration Strategies

Schema migrations are one of the riskiest operations in production databases. A poorly executed migration can cause downtime, data loss, or data corruption. Here are the practices that make migrations safe.

Expand-and-Contract Pattern

Never make breaking schema changes in a single step. Instead:

1. Expand: add the new column, table, or structure alongside the old one
2. Migrate: backfill existing data and dual-write to both structures
3. Switch: move readers (and then writers) over to the new structure
4. Contract: once nothing references the old structure, remove it

This approach allows zero-downtime migrations because the old and new structures coexist during the transition period.
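A sketch of expand-and-contract on SQLite, splitting a single name column into first/last without a breaking change (column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ada Lovelace')")

# 1. Expand: add the new columns; old readers and writers are unaffected.
conn.execute("ALTER TABLE users ADD COLUMN first_name TEXT")
conn.execute("ALTER TABLE users ADD COLUMN last_name TEXT")

# 2. Migrate: backfill existing rows (batched in production).
conn.execute("""
    UPDATE users SET
        first_name = substr(name, 1, instr(name, ' ') - 1),
        last_name  = substr(name, instr(name, ' ') + 1)
""")

# 3. Switch: deploy application code that reads the new columns.
first, last = conn.execute(
    "SELECT first_name, last_name FROM users WHERE id = 1").fetchone()

# 4. Contract: once nothing references `name`, drop it in a later release:
# conn.execute("ALTER TABLE users DROP COLUMN name")
```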

Tooling

Use a migration framework (Flyway, Liquibase, EF Core Migrations, Alembic) that tracks which migrations have been applied and ensures they run in order. Every migration should be versioned, reversible, and tested against a copy of the production schema before execution.
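The core of what these frameworks do fits in a few lines: record applied versions in a tracking table and run pending migrations in order, each atomically. A toy sketch (not a substitute for a real framework; names are illustrative):

```python
import sqlite3

def migrate(conn, migrations):
    """Run pending migrations in version order, each in its own
    transaction, recording applied versions in schema_migrations."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS schema_migrations (
            version INTEGER PRIMARY KEY
        )
    """)
    applied = {v for (v,) in
               conn.execute("SELECT version FROM schema_migrations")}
    for version, sql in sorted(migrations.items()):
        if version in applied:
            continue
        with conn:   # migration + bookkeeping commit (or roll back) together
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_migrations VALUES (?)", (version,))

migrations = {
    1: "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE customers ADD COLUMN email TEXT",
}

conn = sqlite3.connect(":memory:")
migrate(conn, migrations)
migrate(conn, migrations)   # idempotent: applied versions are skipped
applied_versions = [v for (v,) in conn.execute(
    "SELECT version FROM schema_migrations ORDER BY version")]
```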

Our database architects have designed schemas for financial systems processing millions of transactions, election platforms handling national-scale data, and IoT tracking systems with real-time ingestion.

Database design is not a one-time activity. It evolves with your application. The patterns described here provide a toolkit for making that evolution deliberate and safe, ensuring that your database remains an enabler rather than a bottleneck as your application grows.

Need help with this?

Pepla designs and optimises databases that scale. Let us review your data architecture.
