[RndTbl] expert RAID question
trevor at tecnopolis.ca
Sat Apr 25 00:30:32 CDT 2015
On 2015-04-24 Adam Thompson wrote:
> I think the author screwed up. The write changes exactly one
> "stripe", not exactly one disk.
Ya, but (allow me to elaborate) he was trying to make a distinction
between 3 different RAID 5 write scenarios:
1. write to exactly one stripe width (so exactly one chunk on each disk)
2. write to 2 to n-1 chunks in the same stripe (write hole)
3. write to just 1 chunk to just 1 disk in 1 stripe
It's #3 that I was describing in my initial email. He seems to think
that #3 is somehow different than #2. This author isn't stupid, so I'm
just trying to see if there's something I am missing. That maybe there
is some way to write just 1 chunk and 1 parity chunk without reading
the whole stripe (assuming nothing is cached). Normally when an author
gets it wrong, you can intuit the thought/editing process and figure
out exactly where they made a wrong turn. But in this case I can't
fathom any sense out of this at all. How could #3 ever not just be the
same as #2?
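
For reference on scenario #3: with plain XOR parity, the new parity chunk can be derived from the old parity, the old data chunk and the new data chunk alone, so the untouched chunks in the stripe need not be read. Whether that is what the author meant is unknown. Below is a minimal sketch, assuming single XOR parity and toy byte-string chunks. The function names are illustrative and not taken from the book or from any RAID implementation.

```python
from functools import reduce


def xor_chunks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    if len(a) != len(b):
        raise ValueError("chunks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(data_chunks):
    """Parity computed from every data chunk in the stripe."""
    return reduce(xor_chunks, data_chunks)


def rmw_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Read-modify-write update: XOR the old data out of the parity,
    then XOR the new data in. Only two chunks are read from disk."""
    return xor_chunks(xor_chunks(old_parity, old_data), new_data)


# Toy stripe: three data chunks plus one parity chunk (a 4-disk array).
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = full_parity(stripe)

# Overwrite a single chunk (scenario #3).
new_chunk = b"\xff\x00"
updated_parity = rmw_parity(parity, stripe[1], new_chunk)
stripe[1] = new_chunk

# The shortcut agrees with recomputing parity from the whole stripe.
assert updated_parity == full_parity(stripe)
```

In this sketch a single-chunk write costs two reads (old data, old parity) and two writes (new data, new parity), regardless of how many disks are in the array.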
> Parity blocks must be recalculated on every write.
Ya, that's what I thought/think.
I just have a strange feeling he's outsmarted all of us and just done a
poor job explaining it. But if there was a supersmart algorithm for
scenario #3 we'd probably know about it via discussions regarding its
implementation in Linux, etc. I guess I can hit the source... unless
it's just some lone weird hardware RAID adapter that has this feature.
Before I post an erratum I just wanted to make sure I wasn't missing something.
> Also, write performance is not necessarily any
> percentage of a single disk - it depends on the speed of the parity
> calculations. RAID5 can be anywhere from 0% to infinity% as fast as
> a single disk depending on the array.
He's assuming parity calc / CPU is irrelevant (mostly true these days);
it's the slow disk (assuming rust for simplicity) that is the bottleneck.
Also, it's a "performance" book so he has to put numbers on
everything. He seems to pull many out of thin air. Maybe it's a "rule
of thumb" type thing.
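
If the disks are taken to be the bottleneck, one hedged way to reason about write cost is to count chunk I/Os per stripe update rather than quote a percentage of single-disk speed. The sketch below assumes single-parity RAID 5, nothing cached, and equal cost for every chunk read or write. The function name and the "rmw" / "rcw" labels (read-modify-write and reconstruct-write) are illustrative shorthand, not figures or terms from the book.

```python
def stripe_write_ios(n_disks: int, k_chunks: int) -> dict:
    """Count (reads, writes) needed to update k data chunks in one stripe
    of an n-disk RAID 5 array, for two parity-update strategies.

    Assumptions: one parity chunk per stripe, nothing cached.
    """
    data_disks = n_disks - 1
    if n_disks < 3 or not 1 <= k_chunks <= data_disks:
        raise ValueError("need n_disks >= 3 and 1 <= k_chunks <= n_disks - 1")

    writes = k_chunks + 1  # the new data chunks plus the new parity chunk

    # Read-modify-write: read the old copies of the chunks being replaced
    # and the old parity, then XOR old data out and new data in.
    rmw_reads = k_chunks + 1

    # Reconstruct-write: read the data chunks that are NOT being replaced
    # and recompute parity from scratch. Zero reads for a full-stripe write.
    rcw_reads = data_disks - k_chunks

    return {"rmw": (rmw_reads, writes), "rcw": (rcw_reads, writes)}


# The three scenarios on a 5-disk array (4 data chunks per stripe).
for k in (4, 2, 1):
    print(k, stripe_write_ios(5, k))
```

Under these assumptions a full-stripe write needs no reads at all, while a single-chunk write needs two reads with read-modify-write or `n_disks - 2` reads with reconstruct-write.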
> I think the author read something about RAID 5 partial-stripe writes
> and misinterpreted it. OTOH, if he's talking about ZFS "RAIDZ",
> which is analogous to RAID 5, then he's sort-of correct (but still
> not 100%).
This is years before ZFS (it's an older book) so definitely just plain
Jane RAID 5.