DevHeads.net

Accidentally nuked my system - any suggestions ?

Hi,

My workstation is running CentOS 7 on two disks (sda and sdb) in a
software RAID 1 setup.

It looks like I accidentally nuked it. I wanted to write an installation
ISO file to a USB disk, and instead of typing dd if=install.iso
of=/dev/sdc I typed of=/dev/sdb. As soon as I hit <Enter>, the screen froze.
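
A minimal sketch of a guard against this kind of slip, assuming the USB stick reports itself as removable in sysfs. The names is_removable, safe_dd and the SYSFS_ROOT override are hypothetical, not part of any tool, and conv=fsync assumes GNU dd. Some USB hard disks report 0 in the removable flag, so this check would refuse them too.

```shell
#!/bin/sh
# Return 0 only if the named block device is flagged as removable.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
is_removable() {
    dev=$(basename "$1")                              # /dev/sdc -> sdc
    flag="${SYSFS_ROOT:-/sys}/block/$dev/removable"
    [ -r "$flag" ] && [ "$(cat "$flag")" = "1" ]
}

# Hypothetical wrapper: refuse to run dd unless the target looks removable.
safe_dd() {
    iso=$1
    target=$2
    if ! is_removable "$target"; then
        echo "refusing: $target is not flagged as removable" >&2
        return 1
    fi
    # conv=fsync (GNU dd) flushes the data to the device before dd exits.
    dd if="$iso" of="$target" bs=4M conv=fsync
}
```

With these functions loaded, safe_dd install.iso /dev/sdb would stop with an error on a fixed internal disk instead of overwriting it.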

I tried a hard reset, but of course the boot process now stops very
early.

Now, I have backups of the important stuff of course, so no real
catastrophe. But it would be nice if I could get back the data from my
disk directly.

I booted a rescue disk (Slax 9.6.4) and I can see my disks as well as
raid arrays /dev/md125, /dev/md126 and /dev/md127. Oh, my partitioning
scheme is manual and quite simple. Everything is RAID 1, I have a /boot
array on /dev/sda1 + /dev/sdb1, swap on /dev/sda2 + /dev/sdb2 and / on
/dev/sda3 + /dev/sdb3.

I tried to mount /dev/sda3 directly from the rescue disk:

# mount /dev/sda3 /mnt

But I only get this:

mount: unknown filesystem type 'linux_raid_member'

I'd be very grateful for suggestions.

Cheers,

Niki

Comments

Re: Accidentally nuked my system - any suggestions ?

By m.roth at 12/04/2018 - 17:12

Nicolas Kovacs wrote:
I think I'd boot off a rescue disk, then either try to mount the RAID
directly, or edit /etc/mdadm.conf so that it lists only sda, perhaps with
sdb marked as failed. Then see if you can mount the RAID.

mark

Re: Accidentally nuked my system - any suggestions ?

By Niki Kovacs at 12/04/2018 - 17:50

On 04/12/2018 at 23:12, mark wrote:
OK, I got a partial success that's not so bad. The bad news is that the
system won't boot even if I unplug sdb. The good news is I'm currently
retrieving my data.

Once I booted a Slax Live CD with only sda connected, I couldn't mount
it since it's a RAID member. So here's what I did.

# mdadm -Ss
# mdadm -A -R /dev/md9 /dev/sda3
# mount /dev/md9 /mnt
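
The same three steps can be sketched as a small function with long option names and a read-only mount. This is only an illustration: rescue_mount is a hypothetical helper name, and mounting read-only is an added precaution that was not part of the commands above.

```shell
#!/bin/sh
# Assemble a single RAID 1 member as a degraded array and mount it read-only.
# Arguments: member partition, md device name to create, mount point.
rescue_mount() {
    member=$1   # e.g. /dev/sda3
    md=$2       # e.g. /dev/md9 (any unused md name)
    mnt=$3      # e.g. /mnt

    # Stop the arrays the rescue system auto-assembled, so the member is free.
    mdadm --stop --scan || return 1
    # --run starts the array even though its mirror partner is missing.
    mdadm --assemble --run "$md" "$member" || return 1
    # Mount read-only to avoid accidental changes while copying data off.
    mount -o ro "$md" "$mnt"
}
```

Called as rescue_mount /dev/sda3 /dev/md9 /mnt, it runs the equivalent of the commands above.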

A peek in /mnt, seems like everything's still there. So I'm currently
transferring 300 GB of data to my server.

A word on backups. I have all the vital stuff on my server, with daily
snapshots using Rsnapshot. But all the audio and video stuff is
excluded, not to mention all my settings in Firefox, Thunderbird, etc.

Anyway: thanks very much for your help, guys.

Cheers,

Niki

Re: Accidentally nuked my system - any suggestions ?

By Gordon Messmer at 12/04/2018 - 17:10

On 12/4/18 2:01 PM, Nicolas Kovacs wrote:

The system should boot normally if you disconnect sdb.  Have you tried that?

Re: Accidentally nuked my system - any suggestions ?

By Niki Kovacs at 12/04/2018 - 17:31

On 04/12/2018 at 23:10, Gordon Messmer wrote:
Unfortunately that didn't work. The boot process stops here:

[OK] Reached target Basic System.

Now what ?

Re: Accidentally nuked my system - any suggestions ?

By Gordon Messmer at 12/04/2018 - 17:55

On 12/4/18 2:31 PM, Nicolas Kovacs wrote:

Remove "rhgb quiet" from the kernel boot args and see if you get any
more information about what's happening.  "Reached target Basic System."
is recorded twice in the boot logs on a system I checked a moment ago,
so I'm not really sure where yours is stalling.

Re: Accidentally nuked my system - any suggestions ?

By Stephen John Smoogen at 12/04/2018 - 17:50

On Tue, 4 Dec 2018 at 17:30, Nicolas Kovacs < ... at microlinux dot fr> wrote:
In rescue mode, recreate the partition table that was on sdb by copying
over the one on sda:

sfdisk -d /dev/sda | sfdisk /dev/sdb

This gives the kernel enough information to see the partitions it needs
to rebuild.
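
A hedged sketch of the same copy with two safety steps added: it refuses identical source and target, and it saves whatever table is still readable on the target first. The function name clone_partition_table and the backup file argument are hypothetical. It assumes an MBR partition table; as far as I know the sfdisk shipped with CentOS 7 does not handle GPT.

```shell
#!/bin/sh
# Copy the partition table from a healthy disk to the overwritten one,
# keeping a dump of the target's old layout first.
clone_partition_table() {
    src=$1      # healthy disk, e.g. /dev/sda
    dst=$2      # disk to overwrite, e.g. /dev/sdb
    backup=$3   # file that receives the old layout of $dst

    if [ "$src" = "$dst" ]; then
        echo "source and target are the same device" >&2
        return 1
    fi
    # Best effort: the dump may fail if $dst has no readable table left.
    sfdisk -d "$dst" > "$backup" 2>/dev/null ||
        echo "no readable partition table on $dst" >&2
    # Dump the source layout and write it to the target.
    sfdisk -d "$src" | sfdisk "$dst"
}
```

For the disks in this thread the call would be clone_partition_table /dev/sda /dev/sdb sdb-old.dump.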

Re: Accidentally nuked my system - any suggestions ?

By Niki Kovacs at 12/05/2018 - 00:37

On 04/12/2018 at 23:50, Stephen John Smoogen wrote:
Once I made sure I retrieved all my data, I followed your suggestion,
and it looks like I'm making big progress. The system booted again,
though it feels a bit sluggish. Here's the current state of things.

[root@alphamule:~] # cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 sdb2[1] sda2[0]
512960 blocks super 1.0 [2/2] [UU]
bitmap: 0/1 pages [0KB], 65536KB chunk

md126 : inactive sda1[0](S)
16777216 blocks super 1.2

md127 : active raid1 sda3[0]
959323136 blocks super 1.2 [2/1] [U_]
bitmap: 8/8 pages [32KB], 65536KB chunk

unused devices: <none>

Now how can I make my RAID array whole again? For the record, /dev/sda
is intact, and /dev/sdb is the faulty disk. How can I force
synchronization with /dev/sda?

Cheers,

Niki

Re: Accidentally nuked my system - any suggestions ?

By Stephen John Smoogen at 12/05/2018 - 08:03

On Wed, 5 Dec 2018 at 00:36, Nicolas Kovacs < ... at microlinux dot fr> wrote:
It will feel sluggish, because you have half the bandwidth and there can
be a bit of 'write to 2 disks... nope; read from disk b... nope, switch
to a'.

Phil Perry posted all the things in a better email than I could have (pperry++)

Re: Accidentally nuked my system - any suggestions ?

By Phil Perry at 12/05/2018 - 02:31

On 05/12/2018 05:37, Nicolas Kovacs wrote:
If you are confident in the state of sda, I would remove sdb from the
array, copy the partition table from sda to sdb as Stephen suggested
earlier, then add sdb back to the array and allow the data to be synced:

For example:

mdadm --fail /dev/md125 /dev/sdb2
mdadm --remove /dev/md125 /dev/sdb2

mdadm --fail /dev/md126 /dev/sdb1
mdadm --remove /dev/md126 /dev/sdb1

mdadm --fail /dev/md127 /dev/sdb3
mdadm --remove /dev/md127 /dev/sdb3

sfdisk -d /dev/sda | sfdisk /dev/sdb

then add them back and watch them rebuild:

mdadm --add /dev/md125 /dev/sdb2
mdadm --add /dev/md126 /dev/sdb1
mdadm --add /dev/md127 /dev/sdb3

After they have all resynced, I would flush the device buffers for good
measure. For example:

blockdev --flushbufs /dev/sdb1
...

Lastly, don't forget to reinstall grub to sdb:

grub2-install --recheck /dev/sdb
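
To check whether every array is whole again before flushing buffers and reinstalling GRUB, /proc/mdstat can be scanned for arrays that are inactive or missing a member. The following is a minimal sketch, assuming the mdstat layout shown earlier in this thread; degraded_arrays is a hypothetical helper name. mdadm --wait on an md device is another way to block until a resync finishes.

```shell
#!/bin/sh
# List md arrays that are inactive or running with a missing member.
# Reads /proc/mdstat by default; a file can be passed for testing.
degraded_arrays() {
    awk '
        # Array header line, e.g. "md127 : active raid1 sda3[0]"
        /^md[0-9]+ :/ {
            name = $1
            if ($3 == "inactive") print name, "inactive"
        }
        # Status line, e.g. "... blocks super 1.2 [2/1] [U_]"
        # An underscore in the last field marks a missing member.
        /blocks/ && $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ {
            print name, "degraded"
        }
    ' "${1:-/proc/mdstat}"
}
```

Run against the mdstat output quoted above, this reports md126 as inactive and md127 as degraded; no output means every array is active with all members present.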

Re: Accidentally nuked my system - any suggestions ?

By Niki Kovacs at 12/05/2018 - 13:49

On 05/12/2018 at 08:31, Phil Perry wrote:
Thanks very much for the detailed answer. I'll probably give this a spin
next week, since right now I have an urgent job to finish, and I'm happy
to be able to work on a usable system even though it's a bit sluggish.
As soon as the stress is over, I'll try it out.

cheers,

Niki

Re: Accidentally nuked my system - any suggestions ?

By m.roth at 12/04/2018 - 17:13

Gordon Messmer wrote:
Duh! Thanks, Gordon. That's a simpler answer than mine, with the same
effect: /dev/sdb has failed as far as mdadm is concerned.

mark