[RndTbl] raid 1 on partitions on raid 0 on partitions; booting; bug??

Trevor Cordes trevor at tecnopolis.ca
Sun Oct 25 17:32:09 CDT 2015


I'm soliciting opinions on whether this is a bug or not.

I had this wacky setup on my Fedora 21 (don't ask why!):

RAID1

on top of

1st HALF:	2nd HALF:
partitions	partitions
RAID0		raw disk
partitions
raw disk
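
A sketch of how a stack like that might be assembled with mdadm (device names, sizes and array numbers here are hypothetical, not the actual ones from my box):

```shell
# Hypothetical devices: sdb/sdc are the small disks, sda is the big one.
# 1st half: stripe sdb2+sdc2 into one RAID0, then partition the RAID0 itself.
mdadm --create /dev/md9 --level=0 --raid-devices=2 /dev/sdb2 /dev/sdc2
parted /dev/md9 mklabel msdos
parted /dev/md9 mkpart primary 0% 25%    # md9p1; repeat for p2..p4

# RAID1: one half is a partition OF the RAID0, the other a raw-disk partition.
mdadm --create /dev/md126 --level=1 --raid-devices=2 /dev/md9p1 /dev/sda2
```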

And it all worked great.  Until I rebooted.  Then the RAID1s would come 
up, but degraded, with only their non-RAID0 half ("2nd HALF" above) in the 
array and the RAID0-backed half "gone".  I could then re-add the RAID0 
half, reboot, and get the same thing.
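
The manual re-add after each boot looked roughly like this (array and partition names hypothetical):

```shell
# Put the missing RAID0-backed half back into the degraded RAID1.
mdadm /dev/md126 --re-add /dev/md9p1
# If --re-add is refused (stale metadata), a plain --add triggers a resync:
mdadm /dev/md126 --add /dev/md9p1
```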

No matter what I did (messing with mdadm, dracut, grub2, etc) I couldn't 
get these things to assemble properly on boot.

There are lots of bug reports on the net about simpler nested-RAID cases 
having the same problem, and many Bugzilla entries for this were fixed 
years ago.  I checked, and those fixes are in my distro.

I got some help from the people on those old bugs, and when I redid my 
setup to be layered like this instead:

RAID1
1st HALF:	2nd HALF:
RAID0		partitions
partitions	raw disk
raw disk

... it all magically worked, and they came up on boot, and the bug 
disappeared.

The only difference is whether I partition my single big RAID0 array, or 
instead create multiple separate RAID0s and put each one directly into 
its RAID1 array.  (The reason I didn't want to do that in the first place 
is that I was making 5 of these groupings and didn't want to manage 10 
arrays, just 6.  And it was only temporary anyhow.  I know the buggy 
setup is bad design.)
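
In the working layout, each RAID1 gets its own small RAID0 built directly from disk partitions; a sketch, again with hypothetical names:

```shell
# One RAID0 per RAID1, built straight from disk partitions.
mdadm --create /dev/md10  --level=0 --raid-devices=2 /dev/sdb2 /dev/sdc2
mdadm --create /dev/md126 --level=1 --raid-devices=2 /dev/md10 /dev/sda2
# ...repeated for each of the 5 groupings: 5 RAID0s + 5 RAID1s = 10 arrays,
# which is exactly the management overhead I was trying to avoid.
```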

My question is, is this a bug I should report?  In theory, in my mind, the 
RAID1 on partitioned RAID0 should work fine.  The fact mdadm and the 
kernel happily support it leans me towards answering "yes".  If I can have 
it live in the kernel, why not have it survive reboots?  I thought you 
could nest arbitrary combinations and levels (to X depth) of md, lvm, 
partitions, etc.  (And, yes, I really tried everything to make it boot, 
including insanely detailed mdadm.conf and grub2 boot lines, to minimal 
configs, and yes I have partition type fd.)  However, I want to make sure 
I'm not doing something here that is completely insane and shouldn't be 
supported.
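
For reference, this is the sort of explicit config I tried (the UUIDs below are placeholders, not my real ones), followed by an initramfs rebuild:

```shell
# /etc/mdadm.conf -- spell out the nesting explicitly; UUIDs are placeholders
cat >> /etc/mdadm.conf <<'EOF'
ARRAY /dev/md9   level=raid0 UUID=00000000:00000000:00000000:00000009
ARRAY /dev/md126 level=raid1 UUID=00000000:00000000:00000000:00000126
EOF
# Rebuild the initramfs so dracut picks up the new mdadm.conf:
dracut -f
```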

It appears dracut, udev and mdadm are responsible for all of this.  That's 
where the other similar bugs were fixed.

Details:
Here's what it looked like when it was buggy.  md9 was the big RAID0 
array that was then partitioned into 4.  Note how md9 does come up and 
get recognized, but then the boot process doesn't "recurse" into those 
partitions to see that they themselves are array components.  So md126 
(root, by the way) only comes up with 1 of 2 components, after an 
annoyingly long delay.

Oct 23 02:37:57 pog kernel: [   19.542533] sd 8:0:3:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 23 02:37:57 pog kernel: [   19.542624] sd 8:0:2:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 23 02:37:57 pog kernel: [   19.553021]  sdb: sdb1 sdb2
Oct 23 02:37:57 pog kernel: [   19.553835]  sda: sda1 sda2 sda3 sda4
Oct 23 02:37:57 pog kernel: [   19.554991]  sdc: sdc1 sdc2
Oct 23 02:37:57 pog kernel: [   19.556918] sd 8:0:2:0: [sdb] Attached SCSI disk
Oct 23 02:37:57 pog kernel: [   19.558970] sd 8:0:3:0: [sdc] Attached SCSI disk
Oct 23 02:37:57 pog kernel: [   19.559332] sd 8:0:0:0: [sda] Attached SCSI disk
Oct 23 02:37:57 pog kernel: [   19.610787] random: nonblocking pool is initialized
Oct 23 02:37:57 pog kernel: [   19.737894] md: bind<sdb1>
Oct 23 02:37:57 pog kernel: [   19.742379] md: bind<sdb2>
Oct 23 02:37:57 pog kernel: [   19.744213] md: bind<sdc2>
Oct 23 02:37:57 pog kernel: [   19.748375] md: raid0 personality registered for level 0
Oct 23 02:37:57 pog kernel: [   19.748619] md/raid0:md9: md_size is 285371136 sectors.
Oct 23 02:37:57 pog kernel: [   19.748623] md: RAID0 configuration for md9 - 1 zone
Oct 23 02:37:57 pog kernel: [   19.748625] md: zone0=[sdb2/sdc2]
Oct 23 02:37:57 pog kernel: [   19.748631]       zone-offset=         0KB, device-offset=         0KB, size= 142685568KB
Oct 23 02:37:57 pog kernel: [   19.748633]
Oct 23 02:37:57 pog kernel: [   19.748650] md9: detected capacity change from 0 to 146110021632
Oct 23 02:37:57 pog kernel: [   19.752284]  md9: p1 p2 p3 p4
Oct 23 02:37:57 pog kernel: [   19.776482] md: bind<sda1>
Oct 23 02:37:57 pog kernel: [   19.786343] md: raid1 personality registered for level 1
Oct 23 02:37:57 pog kernel: [   19.786643] md/raid1:md127: active with 2 out of 2 mirrors
Oct 23 02:37:57 pog kernel: [   19.786669] md127: detected capacity change from 0 to 419364864
Oct 23 02:37:57 pog kernel: [   19.823529] md: bind<sda2>
Oct 23 02:37:57 pog kernel: [  143.320688] md/raid1:md126: active with 1 out of 2 mirrors
Oct 23 02:37:57 pog kernel: [  143.320733] md126: detected capacity change from 0 to 36350984192

