What is Sharding?
Sharding splits a large dataset into independent pieces called shards, distributing read and write load across multiple servers.
Sharding is a technique for horizontally partitioning data when a single database server can no longer handle the workload. Each piece, or shard, stores a specific part of the dataset. Read and write load can then be spread across multiple machines.
For example, customer records may be split across four shards by customer ID. When the application needs a user profile, it first determines which shard owns that ID and sends the query only to that shard.
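The lookup described above can be sketched in a few lines. This is only an illustration under stated assumptions: the four shards are plain in-memory dictionaries standing in for database servers, customer IDs are integers, and the names `shard_for_customer`, `save_profile`, and `load_profile` are hypothetical.

```python
# Minimal routing sketch (assumptions: four shards, integer customer IDs,
# in-memory dicts standing in for separate database servers).
NUM_SHARDS = 4

# One dictionary per shard; a real system would hold a connection per server.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for_customer(customer_id: int) -> int:
    """Return the index of the shard that owns this customer ID."""
    return customer_id % NUM_SHARDS


def save_profile(customer_id: int, profile: dict) -> None:
    """Write the profile only to the shard that owns the ID."""
    shards[shard_for_customer(customer_id)][customer_id] = profile


def load_profile(customer_id: int):
    """Read from the owning shard only; the other shards are never queried."""
    return shards[shard_for_customer(customer_id)].get(customer_id)
```

With this scheme, customer 10 lands on shard index 2, and a profile lookup for that ID touches only that one dictionary.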
Shard Keys and Approaches
- Hash-based sharding: The key is hashed to distribute data evenly.
- Range sharding: Data is split by ranges such as date, ID, or region.
- Directory sharding: A separate lookup table maps records to shards.
- Resharding: Not a sharding strategy in itself, but the operation of changing the number of shards or redistributing data among them as the dataset grows.
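The approaches in this list can be compared side by side in a short sketch. Everything here is an assumption made for illustration: the shard count, the range boundaries, the tenant names in the directory, and the function names are hypothetical, and plain modulo hashing is used only because it is the simplest variant.

```python
import bisect
import hashlib

NUM_SHARDS = 4  # assumed shard count


def hash_shard(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Hash-based: a stable hash of the key, modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Range-based: exclusive upper bounds for shards 0..2 (assumed values);
# the last shard takes every ID at or above the final bound.
RANGE_UPPER_BOUNDS = [1000, 2000, 3000]


def range_shard(customer_id: int) -> int:
    """Range: find the first range whose upper bound exceeds the ID."""
    return bisect.bisect_right(RANGE_UPPER_BOUNDS, customer_id)


# Directory-based: an explicit lookup table (hypothetical tenants).
DIRECTORY = {"tenant-a": 0, "tenant-b": 3}


def directory_shard(tenant: str) -> int:
    """Directory: the table decides; unknown tenants raise KeyError."""
    return DIRECTORY[tenant]


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Resharding cost: share of keys whose shard changes when the
    shard count changes under plain modulo hashing."""
    moved = sum(
        1 for k in keys if hash_shard(k, old_count) != hash_shard(k, new_count)
    )
    return moved / len(keys)
```

Calling `moved_fraction` on a large set of keys for a change from four to five shards shows why resharding is expensive with plain modulo hashing: a key keeps its shard only when its hash gives the same remainder for both counts, so under a uniform hash most keys have to move. Schemes such as consistent hashing are commonly used to reduce that movement.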
Business Use
Sharding is used in large marketplaces, multi-tenant SaaS platforms, log analytics, and high-volume messaging systems when one database has reached its practical limits. Databases such as MongoDB provide built-in sharding support.
Applied too early, sharding adds avoidable complexity. A poor shard key can create hotspots, cross-shard queries may become slow, and backup or data movement operations become harder. Replication improves availability, while sharding focuses on expanding capacity horizontally.
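The cost of cross-shard queries can be illustrated with a small sketch. It assumes customers are sharded by integer `id` across four in-memory lists; the `city` field and all function names are hypothetical. A lookup by the shard key touches one shard, while a lookup by any other attribute has to visit every shard and merge the results.

```python
# Scatter-gather sketch (assumptions: four in-memory shards keyed by
# integer customer id; "city" is a hypothetical non-key attribute).
NUM_SHARDS = 4
shards = [[] for _ in range(NUM_SHARDS)]


def insert(customer: dict) -> None:
    """Place the record on the shard chosen by its id."""
    shards[customer["id"] % NUM_SHARDS].append(customer)


def find_by_id(customer_id: int):
    """Single-shard query: the shard key tells us where to look."""
    shard = shards[customer_id % NUM_SHARDS]
    return next((c for c in shard if c["id"] == customer_id), None)


def find_by_city(city: str):
    """Cross-shard query: city is not the shard key, so every shard
    is scanned and the partial results are merged."""
    results = []
    shards_contacted = 0
    for shard in shards:
        shards_contacted += 1
        results.extend(c for c in shard if c["city"] == city)
    results.sort(key=lambda c: c["id"])
    return results, shards_contacted
```

In a real deployment each visit in `find_by_city` would be a network round trip, and the overall latency would be bounded by the slowest shard, which is one reason the choice of shard key should follow the most frequent query pattern.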
Related Terms
MongoDB is a document-oriented NoSQL database using BSON documents, flexible schemas, indexes, replication, and horizontal scaling.
Database replication copies data across multiple servers to improve availability, redundancy, and read capacity without moving the primary workload.