[UCI-Linux] More cautionary tales from RAID-land.

Harry Mangalam hjm at tacgi.com
Fri May 13 12:04:15 PDT 2005


5.12.05 - disk failure on sand's RAID5 partition - detected on reboot in 
dmesg, not by log or email as expected.  At this point the data was still 
intact and SHOULD have been backed up to another system.  But since it was 
HW RAID5, AND it was a 3ware controller (known for reliability (HA!) and 
robustness (HA!)), AND this machine was acting as the backup for other 
systems (which were still OK), AND the data was ~200GB at this point, I 
thought it was OK to go ahead with the rebuild without a fresh copy. 
MISTAKE! 

The first thing was to figure out why we hadn't been informed of the 
failure beforehand.  
The controller (a 3ware Escalade 8506-8, driving 8x250GB identical WD 
disks) has, like most such cards, a BIOS-based utility for setting up the 
RAID, which actually worked pretty well - except that unlike SW RAID, you 
can't use the array immediately while it's initializing (building the 
parity info across the disks).  You have to let it sit there for hours 
(it's a 1.6 TB array) while it works over the entire array, even though 
there's nothing on it to begin with.  That done, it looks like one giant 
SCSI disk to the OS - so far so great.
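
For contrast, Linux SW RAID lets you use an array the moment you create 
it, while the initial parity build runs underneath.  A minimal sketch 
with mdadm (device names are hypothetical - substitute your own):

    # create an 8-disk RAID5 from the first partition of each drive
    mdadm --create /dev/md0 --level=5 --raid-devices=8 /dev/sd[a-h]1

    # the array is usable immediately; parity builds in the background
    mkreiserfs /dev/md0
    mount /dev/md0 /backup

    # watch the background build progress
    cat /proc/mdstat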

3ware also comes with a web-frontend utility called 3dm and a command-line 
utility called tw_cli.  When I installed 3dm, I went through the 
installation script, checked that there were no error messages, checked 
that I got an email verification, and then forgot about it - although, 
thinking about it now, that email must have come from the install script, 
not the monitoring app itself.  I did not check that the web interface 
was working, as I didn't think I'd ever use it.  MISTAKE.
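
In hindsight, a clean install script and one email prove nothing about 
the monitoring path end to end.  A quick sanity check would have been 
something like the following - the port number is an assumption from my 
later 3dm2 install, so take it from your own config file:

    # is the monitoring daemon actually running?
    ps ax | grep -i 3dm

    # is its web interface actually listening? (port per /etc/3dm2/3dm2.conf)
    netstat -tln | grep 888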

Now I DID need to talk to the controller, and 3dm/tw_cli were the only 
things that could do so while the OS was running.  THIS is one of the 
downsides of hardware RAID - you're stuck with the tools the vendor gives 
you.  I was running a 64-bit SMP Linux (Ubuntu) on dual Opterons; the 
installation bash script ran fine, but the monitoring daemon silently 
failed (it was 32-bit code and I was running a 64-bit-only OS).  So 
nothing was hearing the controller screaming that a disk had died and the 
RAID was now running in degraded mode.  (As noted above, the only thing 
that let us know was an entry in dmesg on a reboot.)
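
For anyone wanting to catch this class of silent failure up front: 
file(1) and ldd(1) give it away in seconds.  The path below is a guess at 
where the install script drops the daemon - check your own install:

    # a 32-bit binary announces itself
    file /usr/sbin/3dm
    # -> ELF 32-bit LSB executable, Intel 80386 ...

    # on a 64-bit-only system, ldd shows the missing 32-bit loader/libs
    ldd /usr/sbin/3dm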

After verifying that this software was in fact incompatible with the OS, I 
tried to find an upgrade that WOULD let me talk to the controller.  I 
figured that, 3ware being a vendor of high-end hardware, my kind of 
machine would be among their main targets.  And I was right - BUT ... 
finding the software that was compatible with my system was an exercise 
in frustration.  3ware's web site is walled off from Google's bots (like 
almost all corporate sites), and since 3ware is relatively high-end 
hardware, there are not a lot of messages on the Linux boards about such 
failures and how to deal with them.  So after a couple of hours of 
browsing I had to go back to the 3ware site and deal with their 
oh-so-cool web design, which doesn't distinguish URLs from regular text 
in any way; a link only reveals itself when you mouse over it.  Once I 
noticed this, I had to mouse over entire pages of text, line by line, 
hunting for likely hyperlinks.  

The one that finally took me to the page I needed was buried in a 
paragraph I almost overlooked.  It turns out the software does exist, but 
it is NOT specified for the controller I have (the 8506-8) - it's for the 
9000-series controllers (which are noted in the fine print as being 
backwards compatible with the 8000-series).  Also, it's not released 
software; it's listed as being 'In Engineering Phase'.  To make a long 
story shorter, I ended up downloading and trying several versions until I 
finally stumbled over the right ones - the 64-bit versions of 3dm2 and 
the CLI for the 9000-series controllers.  These installed OK and 
apparently ran.  The web interface, however, while it started up and 
presented an optimistic login screen, gave no indication of what the 
passwords should be or where to go to set them.  After looking in the 
config file (/etc/3dm2/3dm2.conf) only to find encrypted passwords, I 
wandered around the 3ware web site trying to find documentation about how 
to set, or even find out, the initial passwords.  There were no docs or 
help files or READMEs with the software.  (The default password is 
'3ware', for those of you going through the same hell; you can change it 
via the web interface when you finally get in.)

Re the passwords - nothing, or at least nothing I could find in about an 
hour's searching.  I finally decided to look in the installation script - 
bingo.  The passwords are set and encrypted into the config file from 
there.  So after setting them to what I wanted, I was FINALLY able to log 
into the web interface and talk to the controller.  And in fact, once you 
can log in, the help file DOES tell you what the password is and how to 
change it.
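
For the record, getting in then amounts to pointing a browser at the 3dm2 
port and using the default password.  The port and protocol here are from 
my install - check your own 3dm2.conf:

    # 3dm2's web interface (port 888 per /etc/3dm2/3dm2.conf on my box;
    # https or http depending on how your version was built)
    https://localhost:888/
    # default password for both User and Administrator: 3ware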

Actually, the tw_cli app also works, but it's pretty ugly (although, give 
them credit - they DID make two Linux-specific clients).  The one I 
needed was the 3ware 9000-series 3DM2 Linux 64-bit version.  Helpfully, 
on the web page I eventually found 
(http://3ware.com/support/downloadpageeng.asp?SNO=4), the 32-bit and 
64-bit downloads are named the SAME THING.
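
Ugly or not, the CLI has one big advantage over the web interface: it's 
scriptable.  The commands below reflect my understanding of the 
9000-series tw_cli syntax - run 'tw_cli help' against whatever version 
you end up with before trusting any of this:

    # list the controllers the CLI can see
    tw_cli show

    # show the units and ports (disks) on controller 0
    tw_cli /c0 show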

So here I am, talking to the 3ware controller via the web interface, and 
while it's not fantastic, it's really not bad.  One disk has been marked 
bad, so now I have to replace it.  I'm just about to bring the system 
down to do this when I realize the disks are sitting in the expensive 
hotswap cages we bought for this specific purpose.  So (after unmounting 
the filesystem) I take a deep breath and pop the offending disk ... ... 
nothing happens - the system doesn't freeze or explode or anything.  It 
has actually worked, and the 3dm2 interface shows that the bad disk is 
now gone.  GREAT!  I quickly swap in a spare and slam it back in - and 
there it is on the web interface, sitting all by itself outside the 
array. 

Now - how to go about adding it back to the RAID?  The web interface is a 
bit dodgy on this point, and the help pages are not particularly helpful; 
the Maintenance help page obliquely refers to this scenario, but 
certainly doesn't give any specific step-by-step instructions.  You'd 
have thought that since one of the primary reasons for buying such a $$$ 
controller is to be able to replace a RAID5 disk on the fly, they might 
make specific mention of such an eventuality.  The way I did it was to 
add the disk as a new 'UNIT' and then add that UNIT to the previously 
defined RAID5 UNIT and request that the new combined unit be rebuilt.  
That seemed to work, and the controller went about integrating the new 
disk into the RAID5 array.  Again, it was not possible to mount the array 
and use it while it was being rebuilt, as you can with SW RAID under 
Linux.  This took several hours, and in the end, it FAILED.  That was 
just about the last &^%@$@(^$& straw.  After spending $ and time (=$) on 
this escapade (that's what the Escalade series SHOULD be named), the 
*&^&^%!#! thing fails to rebuild the array.  (But at least it now reports 
via email that it has failed.)  So now what???  
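
(For reference, the CLI route to the same rebuild appears to be a 
one-liner in the 9000-series syntax - the port number here is 
hypothetical, and I haven't verified this against the 8506:

    # rebuild degraded unit 0 on controller 0 using the disk on port 5
    tw_cli /c0/u0 start rebuild disk=5

Whether it would have failed the same way my web-interface attempt did, I 
can't say.)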

The filesystem was reiserfs to begin with.  As a last resort, I try to 
rescue the thing with fsck.reiserfs.  After reading the dire warning 
about this being the last thing you should try, I give it the 
--rebuild-tree option and go home.  This being 2TB of disk, it takes a 
while.  Later that night I see it's completed and try to mount it.  To my 
astonishment, it mounts.  I do a 'df' - hmm - that's not good - only 3% 
of the disk used.  There was 11% when I started (the RAID had only been 
running a short while).  And I'm not at all happy to see that the only 
directory on the partition is ... lost+found.  This dir contains the 
rubble of what used to be about 200GB of expensive and carefully groomed 
earth-sensing and atmospheric data.  %$^%$^$^%$^!^&$!^%
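
For anyone else staring down reiserfsck: --rebuild-tree really is the 
nuclear option, and the man page means it.  The gentler escalation 
(device name hypothetical) is:

    # read-only pass: report what reiserfsck thinks is wrong
    reiserfsck --check /dev/sda1

    # repair minor corruption without touching the internal tree
    reiserfsck --fix-fixable /dev/sda1

    # absolute last resort: scan the disk and rebuild the entire tree
    reiserfsck --rebuild-tree /dev/sda1

Whether the gentler steps would have saved anything here, I can't say - 
but try them first.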

So go ahead - ask me - am I happy that I spent the extra $ on a hardware 
RAID card rather than buying two $30 4-port SATA controllers and using SW 
RAID?  

I probably couldn't have done all this disk hot-swapping with a non-HW 
RAID card, but the cost of a reboot, for most of us, is not that big a 
deal.  That said, I'm not sure of the total complexity that doing such a 
thing under SW RAID would have entailed; spanning two 4-port controllers 
would have added complexity of its own, and I'm not sure it can be done 
easily with mdadm.  And it is possible that I did something wrong in the 
3ware rebuild - I'll be sending this narrative back to them as well.
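
For what it's worth, the mdadm side of a hot-replace looks simple on 
paper (device names hypothetical; in 2005 the hard part is getting the 
SATA driver to survive the physical swap, not the mdadm commands):

    # mark the dying disk failed and pull it from the array
    mdadm /dev/md0 --fail /dev/sdc1 --remove /dev/sdc1

    # ... physically swap the disk, repartition it identically, then ...
    mdadm /dev/md0 --add /dev/sdc1

    # the rebuild runs in the background; the array stays mounted throughout
    cat /proc/mdstat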

As a postscript, I should also mention that while most Linux server 
vendors sell 3ware cards, at least one (Los Alamos Computers) suggests SW 
RAID as being both significantly cheaper and faster.  They suggest the 
Promise SATA TX4 for about $70; Newegg has the supported-in-kernel, 
Silicon Image-chipset Syba 4-port card for $30.  If you remember my 
previous posts, I was surprised to find SW RAID to be a bit (10-20%) 
faster.  I now think I probably should have tried SW RAID on the full 2TB 
array.

Well, you makes your choices and you takes your chances.

Hope this helps.

hjm


