Bug 494548 - Partition Manager fails to create new partition table if 'ddf_raid_member' signature is present
Summary: Partition Manager fails to create new partition table if 'ddf_raid_member' signature is present
Status: REPORTED
Alias: None
Product: partitionmanager
Classification: Applications
Component: general
Version: 24.08.1
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Andrius Štikonas
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-10-11 15:46 UTC by LaughingMan
Modified: 2024-10-11 22:56 UTC
CC List: 0 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Screenshot of the error in the GUI (101.15 KB, image/png)
2024-10-11 15:46 UTC, LaughingMan
Details

Description LaughingMan 2024-10-11 15:46:39 UTC
Created attachment 174697 [details]
Screenshot of the error in the GUI

Got two factory new drives. They showed up as "No valid partition table found on this device", so I attempted to create a new (GPT) partition table.

Creation fails with the rather unhelpful message "Error". Luckily the failing command is shown: "sfdisk --wipe=always /dev/devicename".

Running that command manually in the terminal gives a few more details:

> This disk is currently in use - repartitioning is probably a bad idea.
> Umount all file systems, and swapoff all swap partitions on this disk.
> Use the --no-reread flag to suppress this check.
> sfdisk: Use the --force flag to overrule all checks.

Since the disk is clearly not in use, I tried again with the `--force` flag:

> The device contains 'ddf_raid_member' signature and it may be removed by a
> write command. See sfdisk(8) man page and --wipe option for more details.

No clue what a 'ddf_raid_member' signature is. The disks must have shipped with them from the factory or something. At least I didn't put them there. Anyway, that appears to be the root cause.

dmraid -r confirms that:
> /dev/sdf: ddf1, ".ddf1_disks", GROUP, ok, 31250710528 sectors, data@ 0

GParted shows the file system as "ataraid".  Creating a new partition table with GParted does finish successfully and wipes that signature.
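
(I assume the signature could also be listed and removed by hand with wipefs, e.g.:
> wipefs /dev/sdf
to list it and
> sudo wipefs --all /dev/sdf
to wipe it, but I haven't tried that on the untouched disk.)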

If you need me to get you any further data, please let me know ASAP. So far I've left one disk untouched, but will need to put it into production soon.


EXPECTATION
Creating a new partition table completes successfully and wipes 'ddf_raid_member' signatures if present.
Comment 1 Andrius Štikonas 2024-10-11 21:19:59 UTC
Hmm, I don't think we want to add --force to the sfdisk call here, since that check might protect us against trashing something in other cases. I'm not sure what the best way to fix this is, but I guess I should try to reproduce it...
Comment 2 Andrius Štikonas 2024-10-11 21:32:18 UTC
Hmm, I guess what happened in your case is that the Linux system noticed that this is a RAID member disk and automatically activated it as /dev/md0.

Can you try to reproduce the same problem, then run
sudo mdadm --stop /dev/md0
and see if that helps.
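
If /dev/md0 doesn't exist on your system, something like
cat /proc/mdstat
or
lsblk /dev/sdf
(device name just an example) should show which md devices got assembled on top of the disk.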

(That said, it's not yet clear to me what should be the solution.)
Comment 3 LaughingMan 2024-10-11 21:50:31 UTC
/dev/md0 doesn't exist. There's md, md126 and md127.

lsblk lists the device as:
> sdf                                             8:80   0  14,6T  0 disk  
> ├─md126                                         9:126  0     0B  0 raid6 
> └─md127                                         9:127  0     0B  0 md
Comment 4 Andrius Štikonas 2024-10-11 22:20:53 UTC
(In reply to LaughingMan from comment #3)
> /dev/md0 doesn't exist. There's md, md126 and md127.
> 
> lsblk lists the device as:
> > sdf                                             8:80   0  14,6T  0 disk  
> > ├─md126                                         9:126  0     0B  0 raid6 
> > └─md127                                         9:127  0     0B  0 md

Ok, but the same idea applies: the RAID got auto-activated... So sfdisk noticed that the device is in use and refused to work on it.
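
You can probably also confirm which md device sits directly on top of the disk by looking at its holders in sysfs, e.g.
ls /sys/block/sdf/holders/
(again, device name as an example).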
Comment 5 LaughingMan 2024-10-11 22:23:26 UTC
So, should I run
> sudo mdadm --stop /dev/md126
or
> sudo mdadm --stop /dev/md127
?

I'm a little out of my depth here.
Comment 6 Andrius Štikonas 2024-10-11 22:28:26 UTC
(In reply to LaughingMan from comment #5)
> So, should I run
> > sudo mdadm --stop /dev/md126
> or
> > sudo mdadm --stop /dev/md127
> ?
> 
> I'm a little out of my depth here.

Probably both, though I'm not an expert on RAID. But at least lsblk suggests that both are somehow derived from /dev/sdf.
Comment 7 LaughingMan 2024-10-11 22:39:24 UTC
Ok:
- Stopped md126.
- Tried creating a partition table -> Failed
- Stopped md127
- Tried creating a partition table -> Success

Maybe Partition Manager should detect this case and run that stop command on behalf of the user? Possibly after another confirmation dialogue.
Something like "The disk you're trying to modify is currently mounted as a RAID. Unmount now? [ ] Yes [ ] No"
Comment 8 Andrius Štikonas 2024-10-11 22:42:55 UTC
(In reply to LaughingMan from comment #7)
> Ok:
> - Stopped md126.
> - Tried creating a partition table -> Failed
> - Stopped md127
> - Tried creating a partition table -> Success
> 
> Maybe Partition Manager should detect this case and run that stop command on
> behalf of the user? Possibly after another confirmation dialogue.
> Something like "The disk you're trying to modify is currently mounted as a
> RAID. Unmount now? [ ] Yes [ ] No"

Perhaps, though it's not clear how best to implement that.

Anyway, for now I'll leave this open; I think we've gathered enough data to root-cause it. So you can put your disk into production.
Comment 9 Andrius Štikonas 2024-10-11 22:44:56 UTC
(In reply to Andrius Štikonas from comment #8)
> Perhaps, though it's not clear how best to implement that.
> 
> Anyway, for now I'll leave this open; I think we've gathered enough data to
> root-cause it. So you can put your disk into production.

There is actually an old branch of kpmcore (raid-support) that does have some knowledge of mdadm (but it was never merged to master). Perhaps I'll see if there is anything there that helps with this.
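
For the detection side, a low-level blkid probe should report the same signature sfdisk is complaining about, e.g.
blkid -p /dev/sdf
(untested here, device name as an example); the deactivation part would then presumably be the same mdadm --stop calls you ran by hand.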
Comment 10 LaughingMan 2024-10-11 22:56:05 UTC
(In reply to Andrius Štikonas from comment #8)
> I think we've gathered enough data to root-cause it. So you can put your disk into production.

Cool. Although in case that wasn't clear: my testing was necessarily destructive. Creating the new partition table only fails after hitting "Apply" and confirming, so either it fails or the disk gets modified. Since my earlier test succeeded, the RAID signature is already gone.