Erasure Coding Simplified
Let’s be honest — most explanations of erasure coding feel like something ripped out of a PhD thesis.
But here’s the thing: if you’re working with distributed systems, cloud storage, or building anything remotely fault-tolerant at scale, you need to know what erasure coding is, why it exists, and when to use it.
In this post, I’ll break it down — no jargon, just real-world analogies, and a fully worked-out example so you won’t forget it.
💡 First, What Is Erasure Coding?
At its core, erasure coding is a data protection technique.
It’s an alternative to simple replication.
Instead of making multiple copies of your data (which is wasteful), it:
- Breaks data into pieces
- Adds some extra parity pieces
- And can reconstruct the full original even if some parts are missing
If you’ve ever heard of RAID 6, S3 Glacier, HDFS EC, or Reed-Solomon, you’ve already seen erasure coding in action.
🤯 Let’s Take a Real Example
Let’s say we want to store a 60MB file across a distributed system.
We use a coding scheme: k = 6, m = 3
That means:
- We split the file into 6 data chunks
- We add 3 parity chunks
- Total chunks stored = 9
- The system can recover the file from any 6 out of 9 chunks
Here’s how it looks:
Original File: 60MB
Chunks: D1 | D2 | D3 | D4 | D5 | D6
Parity: P1 | P2 | P3
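If you want to picture the splitting step in code, here's a minimal Python sketch. A small byte string stands in for the 60MB file, and the helper name and zero-padding are just illustrative choices, not how any particular system does it:

```python
def split_into_chunks(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size chunks, zero-padding the tail."""
    chunk_size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(k * chunk_size, b"\x00")
    return [padded[i * chunk_size:(i + 1) * chunk_size] for i in range(k)]


file_bytes = b"a small payload standing in for the 60MB file in the example"
data_chunks = split_into_chunks(file_bytes, k=6)  # D1..D6
print([len(c) for c in data_chunks])              # six equal-size chunks
```

The parity chunks P1, P2, P3 are then computed from these six data chunks, which is what the next section walks through.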
Now let’s say 2 disks fail, and we lose chunks D4 and P2.
No problem. We still have:
- D1, D2, D3, D5, D6
- P1, P3
That’s 7 chunks — and we only need 6. ✅ We can reconstruct D4 from the remaining data using the parity math.
Each parity chunk is a linear combination of the data chunks. Think of it like this (numbers simplified for understanding):
P1 = D1 + D2 + D3 + D4 + D5 + D6
P2 = 2×D1 + 3×D2 + 4×D3 + 5×D4 + 6×D5 + 7×D6
P3 = 3×D1 + 5×D2 + 6×D3 + 7×D4 + 8×D5 + 9×D6
(Actual coefficients are from a matrix over finite fields like GF(2^8), but we’ll keep numbers readable here.)
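To make those equations concrete, here's a tiny sketch that computes the three parity values from six toy chunk values using the simplified integer coefficients above. Real implementations do this byte by byte over GF(2^8); the toy values are picked so that P1 comes out to 60, matching the worked numbers later in the post.

```python
# Toy parity math using the simplified integer coefficients above
# (real Reed-Solomon does this byte by byte over GF(2^8)).
D = [12, 9, 14, 10, 8, 7]        # stand-ins for D1..D6, chosen so P1 = 60

coefficients = [
    [1, 1, 1, 1, 1, 1],          # P1
    [2, 3, 4, 5, 6, 7],          # P2
    [3, 5, 6, 7, 8, 9],          # P3
]

P1, P2, P3 = (sum(c * d for c, d in zip(row, D)) for row in coefficients)
print(P1, P2, P3)                # 60 254 362
```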
Each parity chunk gives us one equation in a system of linear equations.
Now, since we know the values of:
- D1, D2, D3, D5, D6
- P1 and P3
We can plug these values into the equations for P1 and P3.
Let’s say (again simplified):
P1 = D1 + D2 + D3 + D4 + D5 + D6
P3 = 3×D1 + 5×D2 + 6×D3 + 7×D4 + 8×D5 + 9×D6
Now:
- We know the values of D1, D2, D3, D5, D6
- We know the values of P1 and P3
- Only D4 is unknown
That leaves two equations but only a single unknown (D4), so either equation on its own is enough to pin it down. (If we had lost two data chunks instead, say D4 and D5, we'd have two unknowns and would need both equations.)
You solve the equations like high-school algebra:
Example:
Let’s pretend we calculated:
P1 = 60
D1 + D2 + D3 + D5 + D6 = 50
=> D4 = P1 - 50 = 10
Done. You’ve reconstructed D4!
⚠️ Note: In practice, this is done using matrix inversion over a Galois Field, not simple subtraction, but the concept is identical: use parity equations and solve for the unknowns.
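Continuing with the same toy numbers from the earlier sketch, the recovery of D4 really is just that subtraction:

```python
# Same toy numbers as above: D4 (value 10) and P2 are lost.
D = [12, 9, 14, 10, 8, 7]                    # original D1..D6
P1 = sum(D)                                  # 60, computed before the failure

survivors = [d for i, d in enumerate(D) if i != 3]   # D1, D2, D3, D5, D6
recovered_D4 = P1 - sum(survivors)           # 60 - 50 = 10
assert recovered_D4 == D[3]
```

Had two data chunks gone missing, the same idea applies with two equations and two unknowns, which is the small linear system a real decoder solves with matrix inversion over the Galois field.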
🔧 How the Parity Works (Simplified)
Under the hood, it’s math.
Not simple XOR parity like RAID 5, but Reed-Solomon codes built on finite-field (Galois-field) algebra, typically over GF(2^8), where addition is a bytewise XOR and multiplication is polynomial arithmetic modulo a fixed irreducible polynomial.
You don’t need to know the math to use it, but you should know this:
- Parity isn’t just "copies" — it’s calculated
- Rebuilding missing data is deterministic and lossless
- The encoding and decoding cost CPU cycles
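As a tiny illustration of those three points, here's the simplest possible calculated parity: a bytewise XOR, which happens to be exactly what addition means in GF(2^8). This is not Reed-Solomon, and it can only rebuild a single missing chunk, but it shows parity being computed rather than copied, and the rebuild being deterministic and lossless:

```python
# The simplest "calculated" parity: bytewise XOR, which is what addition
# means in GF(2^8). Plain XOR parity can rebuild only ONE missing chunk;
# Reed-Solomon adds finite-field multiplication so m parity chunks can
# repair up to m losses.
def xor_parity(chunks: list[bytes]) -> bytes:
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)

chunks = [b"\x01\x02", b"\x0a\x0b", b"\xf0\x0d"]
parity = xor_parity(chunks)                            # calculated, not copied
rebuilt = xor_parity([chunks[0], chunks[2], parity])   # chunks[1] was lost
assert rebuilt == chunks[1]                            # deterministic, lossless
```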
⚖️ Pros vs Replication
| Feature | Replication (e.g., 3x) | Erasure Coding (e.g., 6+3) |
|---|---|---|
| Storage Overhead | 200% (store 3x the data) | ~50% (store 1.5x the data) |
| Durability | High (depends on the number of copies) | Higher (survives multiple simultaneous failures) |
| Repair Cost | Low (just copy again) | Higher (missing chunks must be recomputed from survivors) |
| Bandwidth Use | High (3x the data crosses the network) | Lower (~1.5x the data) |
| Performance | Faster (good for hot data) | Slower (better suited to cold, archival data) |
🚀 Where Erasure Coding Shines
- Cloud storage systems like Amazon S3, Azure Blob Storage, and Google Cloud Storage
- Archival tiers (e.g., S3 Glacier) where cost-saving matters
- Distributed file systems like HDFS, Ceph, MinIO
- Large-scale backup systems
It’s especially useful when you have to store petabytes of data and can’t afford 3x duplication.
🧠 When You Shouldn’t Use It
- For hot data that’s accessed and written frequently (high CPU overhead)
- When latency matters more than cost (e.g., user-facing APIs)
- For small data sets where replication is simpler and good enough
- In systems with low CPU and memory capacity (like edge devices)
🧰 Real-World Systems Using Erasure Coding
| System | Purpose | EC Role |
|---|---|---|
| Amazon S3 | Object storage | Uses EC for cost-efficient storage tiers |
| HDFS | Big data storage | Optional EC mode instead of 3x replication |
| Ceph | Distributed object/block/file storage | Native EC pool support |
| MinIO | S3-compatible object storage | Uses EC for redundancy |
| RAID 6 | Local disk redundancy | Reed-Solomon-style dual parity |
🔚 Conclusion
Erasure coding is one of those things that feels intimidating at first — but once you break it down, it’s just math that saves storage and adds durability.
If you’re designing systems at scale — especially anything distributed, cloud-native, or storage-heavy — you owe it to yourself to understand this concept.
Replication is great for simplicity.
But erasure coding is what the hyperscalers use to win at cost, durability, and scale.
Next time someone asks, “Why not just make 3 copies of everything?”, you’ll have a pretty solid answer.
Until next time — keep learning the stuff that matters.
— The Dev Learnings Team