
System: Debian Squeeze

What it's for: a system where any new disk is DoD-wiped.

I'm looking for a way to run a command when the kernel reports a disk error. Sometimes we get bad disks, and those just need to be scrapped.

Common lines in the logs are

Jan 15 10:34:33 drivekiller9k kernel: [339274.100020] usb 2-3: reset high speed USB device using ehci_hcd and address 51
Jan 15 10:34:33 drivekiller9k kernel: [339274.233729] sd 176:0:0:1: [sdl] Unhandled error code
Jan 15 10:34:33 drivekiller9k kernel: [339274.233733] sd 176:0:0:1: [sdl] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
Jan 15 10:34:33 drivekiller9k kernel: [339274.233737] sd 176:0:0:1: [sdl] CDB: Write(10): 2a 00 00 34 b8 70 00 00 f0 00
Jan 15 10:34:33 drivekiller9k kernel: [339274.233781] __ratelimit: 20 callbacks suppressed
Jan 15 10:34:33 drivekiller9k kernel: [339274.233815] lost page write due to I/O error on sdl

I would like to simply run a shell script that kills the wiping process and writes to a log. What would be the proper way to do this?

slhck

2 Answers


You can use smartmontools to monitor your disk health, using the provided smartd daemon. Some examples.

Atropo
  • That would work, but all the drives are connected over USB 3.0, so SMART data cannot be accessed –  Jan 15 '13 at 18:46

You could use rsyslog's shell execute action to trigger an action (e.g. your recovery script) on certain log messages. See man rsyslog.conf for details.
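
As a rough sketch of how that could look (not a tested setup): the script below is a hypothetical handler that rsyslog would call with the matching log line as its only argument. The script path, the log file location, the filter string in the header comment, and the `pkill` pattern are all assumptions for illustration; adjust them to whatever actually starts your wipe.

```shell
#!/bin/sh
# Hypothetical handler, e.g. saved as /usr/local/bin/disk-error-handler.sh.
# Assumed rsyslog.conf rule that would call it (legacy filter syntax):
#   :msg, contains, "I/O error on sd" ^/usr/local/bin/disk-error-handler.sh
# rsyslog passes the formatted log line as the single argument "$1".

LOG_FILE="${LOG_FILE:-/var/log/disk-errors.log}"   # assumed log location

# Print the sdX name found in a kernel log line, or nothing if there is none.
extract_device() {
    printf '%s\n' "$1" | sed -n \
        -e 's/.*\[\(sd[a-z][a-z]*\)\].*/\1/p' \
        -e 's/.*I\/O error on \(sd[a-z][a-z]*\).*/\1/p'
}

# Log the failure and try to stop whatever is working on that device.
handle_message() {
    dev=$(extract_device "$1")
    [ -n "$dev" ] || return 0
    printf '%s I/O error on /dev/%s, stopping wipe\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$dev" >> "$LOG_FILE"
    # Assumption: the wipe was started with /dev/sdX on its command line.
    pkill -f "/dev/${dev}([^a-z]|\$)" || true
}

# Only act when called with a message, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    handle_message "$1"
fi
```

Two caveats: the kernel usually logs several lines per failure, so the handler may run more than once for the same disk, and a process stuck in uninterruptible I/O wait will not actually exit until the pending I/O returns, whatever signal it is sent.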

sparkie
  • Hm, that does sound nice. I have switched over to a badblocks first pass with a read-write zero test. I'll look into rsyslog right away; this looks perfect. Now to figure out how to unlock the dying drive while I'm not around. – user554005 Jan 16 '13 at 14:13
  • What do you mean by `unlock`? Do you want to remove/shut down the drive? – sparkie Jan 16 '13 at 16:30
  • Yes. When a drive is in a locked-up state (IOWait), I would like to just eject the drive, kill the wiping process, and write to a log. – user554005 Jan 17 '13 at 03:41
  • In cases like that I use a USB switch to simply power off the device, since the code to power off the device directly from the host system was removed from the kernel. Further details are [here](http://superuser.com/questions/176319/hard-reset-usb-in-ubuntu-10-04/528492#528492). – sparkie Jan 17 '13 at 03:59