Infoholics Anonymous: RAID

Wednesday, June 11, 2008

Today's boneheaded Solaris admin move

I converted the /var filesystem of a host I was installing to be a DiskSuite mirror, but forgot that I shouldn't attach the other side of the mirror until the filesystem was mounted through the metadevice.

To compound the problem, I didn't reboot the system until after the mirror had resynced and I'd done a bunch of other work. After the reboot came a flood of errors from fsck. That was unsurprising: everything I'd done to the system since adding the mirror had been written to only one half, but DiskSuite (Solaris Volume Manager, I guess, to give it its modern name) thought both sides of the mirror were good, and was reading from the good side or the bad side more or less at random …

What makes it even more stupid is that I know better.

Given that I'd just installed the system, it was quicker to just re-jumpstart …
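For the record, the order I should have followed looks roughly like the sketch below. The metadevice names (d10, d11, d12) and the slices (c0t0d0s3, c0t1d0s3) are made up for illustration, and it assumes the state database replicas already exist. The script only prints the commands unless DRY_RUN=0 is set, so treat it as a reminder of the sequence rather than something to paste blindly.

```shell
#!/bin/sh
# Sketch only: metadevice and slice names are hypothetical examples.
# Commands are printed, not executed, unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

mirror_before_reboot() {
    # Wrap the existing, mounted /var slice in a submirror
    # (-f is needed because the slice is in use).
    run metainit -f d11 1 1 c0t0d0s3
    # Define the second submirror, but do NOT attach it yet.
    run metainit d12 1 1 c0t1d0s3
    # Create a one-way mirror containing only the first submirror.
    run metainit d10 -m d11
    # Manual step: edit /etc/vfstab so /var mounts from /dev/md/dsk/d10
    # (raw device /dev/md/rdsk/d10), then reboot.
}

mirror_after_reboot() {
    # Only once /var is mounted through d10 is it safe to attach the other side.
    run metattach d10 d12
}

case "${1:-}" in
    before) mirror_before_reboot ;;
    after)  mirror_after_reboot ;;
esac
```

Run it with "before", fix /etc/vfstab and reboot, then run it with "after". The point is simply that metattach comes last, after the filesystem is already being written through the mirror.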

Tuesday, January 29, 2008

Odd Solaris DiskSuite problem, and solution

One of the systems I admin had a failed disk that was in use by two DiskSuite RAID 5 volumes (IMO insane, given the performance hit, but not my decision). After the disk was replaced, any attempt to run DiskSuite programs such as 'metastat' gave the following error:

Assertion failed: mdrcp->colnamep->start_blk <= rcp->un_orig_devstart, file ../common/meta_raid.c, line 151
metastat: Abort
Abort (core dumped)

I couldn't find any documentation about this error, and a Google search turned up only 3 or 4 hits, none of them helpful. (One of them involved using LD_PRELOAD to replace the abort() function so that 'metaclear' could delete the RAID, then recreating it and reloading from backups.)

I worked out why the error occurred, though. When the disk was replaced, it of course came with a label/TOC with partitions defined. If these partitions don't match the pre-existing RAID setup, the metadisk tools die a death.

All that was required was to build up the proper partitioning, and then everything worked fine.
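One common way to rebuild the partitioning is to copy the label from a disk that already has the right layout. The sketch below assumes the replacement disk has the same geometry as a surviving RAID member and that all members share the same slice layout, which may not be true on your system; the disk names c1t0d0 and c1t3d0 are hypothetical. As written it only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: disk names are hypothetical, and this assumes the new disk
# should carry exactly the same slice layout as the source disk.
DRY_RUN=${DRY_RUN:-1}

copy_label() {
    # $1 = disk with the correct layout, $2 = freshly replaced disk
    src="/dev/rdsk/${1}s2"
    dst="/dev/rdsk/${2}s2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "prtvtoc $src | fmthard -s - $dst"
    else
        # prtvtoc dumps the source VTOC; fmthard -s - writes it to the target.
        prtvtoc "$src" | fmthard -s - "$dst"
    fi
}

if [ "$#" -eq 2 ]; then
    copy_label "$1" "$2"
fi
```

For example, "sh copy_label.sh c1t0d0 c1t3d0" (the script name is arbitrary) prints the pipeline it would run. Once the slices match the existing RAID setup, 'metastat' should run again without hitting the assertion.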
