- Vertical scaling - Scaling up by increasing resources on a single machine
- Horizontal scaling - Scaling out by adding more machines (nodes)
Vertical scaling (scaling up)
Vertical scaling, or scaling up, involves increasing the resources of a single database instance. In cloud environments like Amazon RDS, this typically means:
- Increasing CPU capacity
- Adding more RAM
- Upgrading to faster storage (e.g., from gp2 to io1 in AWS)
- Expanding storage capacity
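As a rough illustration, the changes listed above map onto parameters of the RDS `ModifyDBInstance` operation. The sketch below only builds the keyword arguments that boto3's `modify_db_instance` accepts and makes no AWS call. The helper name `build_scale_up_request`, the instance identifier `orders-db`, and the chosen sizes are hypothetical.

```python
from typing import Any, Dict, Optional


def build_scale_up_request(
    instance_id: str,
    instance_class: Optional[str] = None,
    allocated_storage_gib: Optional[int] = None,
    storage_type: Optional[str] = None,
    iops: Optional[int] = None,
    apply_immediately: bool = False,
) -> Dict[str, Any]:
    """Build keyword arguments for boto3's RDS modify_db_instance call."""
    request: Dict[str, Any] = {
        "DBInstanceIdentifier": instance_id,
        "ApplyImmediately": apply_immediately,
    }
    if instance_class is not None:
        request["DBInstanceClass"] = instance_class  # more CPU and RAM
    if allocated_storage_gib is not None:
        request["AllocatedStorage"] = allocated_storage_gib  # more capacity
    if storage_type is not None:
        request["StorageType"] = storage_type  # e.g. "gp2" -> "io1"
    if iops is not None:
        request["Iops"] = iops  # provisioned IOPS for io1 volumes
    return request


# Hypothetical instance name and target sizes, for illustration only.
request = build_scale_up_request(
    "orders-db",
    instance_class="db.m5.2xlarge",
    allocated_storage_gib=500,
    storage_type="io1",
    iops=3000,
)
```

With AWS credentials configured, the dictionary could be passed as `boto3.client("rds").modify_db_instance(**request)`. Leaving `ApplyImmediately` set to false generally defers the change to the next maintenance window, which matters because changing the instance class usually involves a brief outage.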
Advantages of vertical scaling
- Avoids modifications to application logic or database schema
- Maintains data consistency and ACID properties
- Suitable for workloads that require strong consistency and complex transactions
Horizontal scaling (scaling out)
Horizontal scaling involves distributing the database load across multiple servers. Few databases scale horizontally in the full sense; those that do run on multiple nodes, typically distributing queries across several compute nodes and combining a shared storage layer with locally cached data. Traditional databases like PostgreSQL are not designed for native horizontal scaling, but some strategies can be employed:
- Read replicas - Offload read operations to multiple read-only database instances
- Sharding - Split data across multiple database instances based on a partition key
- Connection pooling - Efficiently manage and distribute database connections
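The read-replica and sharding strategies above usually meet in application-side routing code. The following is a minimal sketch of one possible approach, not a production design: it picks a shard by hashing the partition key and sends reads to that shard's replicas in round-robin order. The `ShardRouter` class and the connection strings are hypothetical.

```python
import hashlib
import itertools
from typing import Dict, Iterator, List


class ShardRouter:
    """Route queries to a shard by partition key, and reads to replicas."""

    def __init__(self, shards: Dict[str, List[str]]) -> None:
        # `shards` maps each primary's DSN to the DSNs of its read replicas.
        self._primaries: List[str] = sorted(shards)
        self._replicas: Dict[str, Iterator[str]] = {
            # Fall back to the primary when a shard has no replicas.
            primary: itertools.cycle(replicas or [primary])
            for primary, replicas in shards.items()
        }

    def primary_for(self, partition_key: str) -> str:
        # A stable hash keeps a key on the same shard across processes
        # (Python's built-in hash() is randomized per process for strings).
        digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._primaries)
        return self._primaries[index]

    def target_for(self, partition_key: str, readonly: bool) -> str:
        primary = self.primary_for(partition_key)
        if not readonly:
            return primary  # writes always go to the shard's primary
        return next(self._replicas[primary])  # round-robin over replicas


# Hypothetical connection strings, for illustration only.
SHARD0 = "postgres://shard0-primary/app"
SHARD1 = "postgres://shard1-primary/app"
router = ShardRouter({
    SHARD0: ["postgres://shard0-replica-a/app", "postgres://shard0-replica-b/app"],
    SHARD1: [],
})

write_target = router.target_for("customer-42", readonly=False)
read_target = router.target_for("customer-42", readonly=True)
```

Two caveats apply to this sketch. Modulo-based placement moves most keys when the number of shards changes, so real deployments often use consistent hashing or a lookup table instead. Reads served by asynchronous replicas may also lag behind the primary.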
Advantages of horizontal scaling
- Scale beyond the limits of a single machine
- Improved fault tolerance and availability
- Can be more cost-effective for large deployments
Migrating to horizontally scalable solutions
When vertical scaling reaches its limits, organizations may need to migrate to a horizontally scalable solution. This process can be disruptive:
Downtime
Migration often requires significant downtime for data transfer and verification.
Application changes
Code may need to be rewritten to accommodate new data access patterns.
Data consistency challenges
Ensuring data integrity across a distributed system is complex.
Performance tuning
New bottlenecks may emerge in a distributed environment.
Operational complexity
Managing a distributed database requires new skills and tools.
Cost of migration
Migration is costly in terms of both new infrastructure and engineering effort.
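The verification step mentioned under Downtime can be partly automated. As a minimal sketch, the code below compares a row count and a checksum for one table in the source and target databases. It uses in-memory SQLite purely as a stand-in for the real systems; the `table_fingerprint` and `make_db` helpers and the `orders` table are hypothetical.

```python
import hashlib
import sqlite3
from typing import Iterable, Tuple


def table_fingerprint(conn: sqlite3.Connection, table: str, key: str) -> Tuple[int, str]:
    """Return (row count, SHA-256 over all rows ordered by key) for one table."""
    digest = hashlib.sha256()
    count = 0
    # Table and column names cannot be bound as query parameters, so they
    # must come from a trusted list, never from user input.
    for row in conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key}"'):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def make_db(rows: Iterable[Tuple[int, float]]) -> sqlite3.Connection:
    """Create a throwaway in-memory database with a sample orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


source = make_db([(1, 9.5), (2, 20.0)])
target = make_db([(2, 20.0), (1, 9.5)])   # same rows, different insert order
drifted = make_db([(1, 9.5), (2, 21.0)])  # one value changed in transit

matches = table_fingerprint(source, "orders", "id") == table_fingerprint(target, "orders", "id")
detects_drift = table_fingerprint(source, "orders", "id") != table_fingerprint(drifted, "orders", "id")
```

Hashing the `repr` of each row is only meaningful when both sides return identical Python types for every column. Across different database engines, values would likely need to be normalized (for example, decimals and timestamps) before hashing.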