Captain's Universe Home


IDE harddisk errors: DriveReady SeekComplete Error status=0x51 DriveStatusError error=0x04

Ever seen error messages like these in the kernel log file?
kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
I have (well, only a few times per month) and here are my conclusions.

Before I start: if you have seen these errors too (or other errors) and your harddrive died afterwards, I'd be more than happy if you dropped me an email with the error messages. Thanks! :-)


Also see: Linux Harddisk Monitoring with SmartMonTools (smartctl)

Since everyone on every mailing list or forum says something different about these harddisk error messages, I started my search in the kernel source:
captain:/usr/src/kernel-source-2.6.8# grep -R DriveStatusError *
drivers/ide/legacy/hd.c:                if (hd_error & ABRT_ERR)        printk("DriveStatusError ");
drivers/ide/ide.c:                      if (err & ABRT_ERR)     printk("DriveStatusError ");
drivers/ide/ide-disk.c:         if (err & ABRT_ERR)     printk("DriveStatusError ");
drivers/ide/Kconfig:      hda: set_multmode: error=0x04 { DriveStatusError }
"DriveStatusError" appears in 4 files in the kernel source.
Looking into drivers/ide/ide.c reveals that those strings are printed by the function ide_dump_status:
/*
 * Error reporting, in human readable form (luxurious, but a memory hog).
 */
u8 ide_dump_status (ide_drive_t *drive, const char *msg, u8 stat)
{
[...]

    local_irq_set(flags);
    printk(KERN_WARNING "%s: %s: status=0x%02x", drive->name, msg, stat);
#if FANCY_STATUS_DUMPS
    printk(" { ");
    if (stat & BUSY_STAT) {
      printk("Busy ");
    } else {
      if (stat & READY_STAT)  printk("DriveReady ");
      if (stat & WRERR_STAT)  printk("DeviceFault ");
      if (stat & SEEK_STAT) printk("SeekComplete ");
      if (stat & DRQ_STAT)  printk("DataRequest ");
      if (stat & ECC_STAT)  printk("CorrectedError ");
      if (stat & INDEX_STAT)  printk("Index ");
      if (stat & ERR_STAT)  printk("Error ");
    }
    printk("}");
#endif  /* FANCY_STATUS_DUMPS */

The code above prints the bits of the IDE status register. In our case, DriveReady just means the drive is ready. Nothing to worry about! SeekComplete means the seek operation requested by the previous IDE command was completed. Still nothing to worry about! But Error means something out of the ordinary happened. No reason to panic yet, but start worrying a bit.

    printk("\n");
    if ((stat & (BUSY_STAT|ERR_STAT)) == ERR_STAT) {
      err = hwif->INB(IDE_ERROR_REG);
      printk("%s: %s: error=0x%02x", drive->name, msg, err);
#if FANCY_STATUS_DUMPS
      if (drive->media == ide_disk) {
        printk(" { ");
        if (err & ABRT_ERR)   printk("DriveStatusError ");
        if (err & ICRC_ERR)   printk("Bad%s ", (err & ABRT_ERR) ? "CRC" : "Sector");
        if (err & ECC_ERR)  printk("UncorrectableError ");
        if (err & ID_ERR)   printk("SectorIdNotFound ");
        if (err & TRK0_ERR)   printk("TrackZeroNotFound ");
        if (err & MARK_ERR)   printk("AddrMarkNotFound ");
        printk("}");
        if ((err & (BBD_ERR | ABRT_ERR)) == BBD_ERR || (err & (ECC_ERR|ID_ERR|MARK_ERR))) {
            if ((drive->id->command_set_2 & 0x0400) &&
[...]
              printk(", LBAsect=%llu, high=%d, low=%d",
                 (long long) sectors,
                 high, low);
            } else {
              u8 cur = hwif->INB(IDE_SELECT_REG);
              if (cur & 0x40) {   /* using LBA? */
                printk(", LBAsect=%ld", (unsigned long)
[...]
              } else {
                printk(", CHS=%d/%d/%d",
[...]
            if (HWGROUP(drive) && HWGROUP(drive)->rq)
              printk(", sector=%llu", (unsigned long long)HWGROUP(drive)->rq->sector);
        }
}

The second part of the function, shown above, prints the bits of the IDE error register. The only halfway "normal" error is DriveStatusError, which is the label the kernel prints for the ABRT_ERR bit. All other errors (bad sector, uncorrectable error, sector ID not found, track zero not found and address mark not found) mean that something bad is going on: back up your data immediately (if you have no recent backup) and replace the harddrive.
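The rule of thumb from the paragraph above can be written as a small check. This is only a sketch of that rule, not kernel code: the bit values are the ones from include/linux/hdreg.h, while SERIOUS_ERR_MASK and error_is_serious are names made up for this example.

```c
#include <stdbool.h>

/* Error register bits, values as in include/linux/hdreg.h */
#define MARK_ERR 0x01 /* Bad address mark */
#define TRK0_ERR 0x02 /* couldn't find track 0 */
#define ABRT_ERR 0x04 /* Command aborted, printed as "DriveStatusError" */
#define ID_ERR   0x10 /* ID field not found */
#define ECC_ERR  0x40 /* Uncorrectable ECC error */
#define ICRC_ERR 0x80 /* CRC error during transfer (pre-EIDE: block marked bad) */

/* Hypothetical mask: every bit that ide_dump_status prints for a disk,
 * except the "command aborted" bit. */
#define SERIOUS_ERR_MASK (ICRC_ERR | ECC_ERR | ID_ERR | TRK0_ERR | MARK_ERR)

/* Hypothetical helper: true if the error byte contains anything
 * beyond a plain "command aborted". */
bool error_is_serious(unsigned char err)
{
    return (err & SERIOUS_ERR_MASK) != 0;
}
```

With this sketch, error_is_serious(0x04) is false (only ABRT_ERR is set), while error_is_serious(0x40) is true (uncorrectable ECC error).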


For our DriveStatusError, the kernel file include/linux/hdreg.h sheds some more light on it (and on the other errors as well):
/* Bits of HD_STATUS */
#define ERR_STAT                0x01
#define INDEX_STAT              0x02
#define ECC_STAT                0x04    /* Corrected error */
#define DRQ_STAT                0x08
#define SEEK_STAT               0x10
#define SRV_STAT                0x10
#define WRERR_STAT              0x20
#define READY_STAT              0x40
#define BUSY_STAT               0x80

/* Bits for HD_ERROR */
#define MARK_ERR                0x01    /* Bad address mark */
#define TRK0_ERR                0x02    /* couldn't find track 0 */
#define ABRT_ERR                0x04    /* Command aborted */
#define MCR_ERR                 0x08    /* media change request */
#define ID_ERR                  0x10    /* ID field not found */
#define MC_ERR                  0x20    /* media changed */
#define ECC_ERR                 0x40    /* Uncorrectable ECC error */
#define BBD_ERR                 0x80    /* pre-EIDE meaning:  block marked bad */
#define ICRC_ERR                0x80    /* new meaning:  CRC error during transfer */

First the status byte: 0x51 = 01010001b
0 READY_STAT 0 SEEK_STAT 0 0 0 ERR_STAT
As said above, the drive is ready, seek is complete, but there is an error.

The error byte: 0x04 = 00000100b
0 0 0 0 0 ABRT_ERR 0 0
Here, in the IDE error register, we only have ABRT_ERR = "Command aborted".
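The two decodings above can be reproduced outside the kernel. The following is a minimal userspace sketch that follows the order of the tests in ide_dump_status; the bit values come from hdreg.h, but add_word, decode_status and decode_error are hypothetical names chosen for this example, and the strings are written into a buffer instead of being passed to printk.

```c
#include <stdio.h>
#include <string.h>

/* Status register bits (hdreg.h) */
#define ERR_STAT   0x01
#define INDEX_STAT 0x02
#define ECC_STAT   0x04
#define DRQ_STAT   0x08
#define SEEK_STAT  0x10
#define WRERR_STAT 0x20
#define READY_STAT 0x40
#define BUSY_STAT  0x80

/* Error register bits (hdreg.h) */
#define MARK_ERR 0x01
#define TRK0_ERR 0x02
#define ABRT_ERR 0x04
#define ID_ERR   0x10
#define ECC_ERR  0x40
#define ICRC_ERR 0x80

/* Hypothetical helper: append a word and a trailing space to buf. */
static void add_word(char *buf, size_t size, const char *word)
{
    size_t used = strlen(buf);
    if (used < size)
        snprintf(buf + used, size - used, "%s ", word);
}

/* Write the names of the set status bits into buf. */
void decode_status(unsigned char stat, char *buf, size_t size)
{
    if (size == 0)
        return;
    buf[0] = '\0';
    if (stat & BUSY_STAT) { /* while busy, the other bits are not reported */
        add_word(buf, size, "Busy");
        return;
    }
    if (stat & READY_STAT) add_word(buf, size, "DriveReady");
    if (stat & WRERR_STAT) add_word(buf, size, "DeviceFault");
    if (stat & SEEK_STAT)  add_word(buf, size, "SeekComplete");
    if (stat & DRQ_STAT)   add_word(buf, size, "DataRequest");
    if (stat & ECC_STAT)   add_word(buf, size, "CorrectedError");
    if (stat & INDEX_STAT) add_word(buf, size, "Index");
    if (stat & ERR_STAT)   add_word(buf, size, "Error");
}

/* Write the names of the set error bits into buf (disk case only). */
void decode_error(unsigned char err, char *buf, size_t size)
{
    if (size == 0)
        return;
    buf[0] = '\0';
    if (err & ABRT_ERR) add_word(buf, size, "DriveStatusError");
    /* Bit 0x80 reads as a CRC error together with ABRT_ERR, else as a bad sector. */
    if (err & ICRC_ERR) add_word(buf, size, (err & ABRT_ERR) ? "BadCRC" : "BadSector");
    if (err & ECC_ERR)  add_word(buf, size, "UncorrectableError");
    if (err & ID_ERR)   add_word(buf, size, "SectorIdNotFound");
    if (err & TRK0_ERR) add_word(buf, size, "TrackZeroNotFound");
    if (err & MARK_ERR) add_word(buf, size, "AddrMarkNotFound");
}
```

Called with a sufficiently large buffer, decode_status(0x51, ...) yields "DriveReady SeekComplete Error " and decode_error(0x04, ...) yields "DriveStatusError ", matching the text between the braces in the log lines.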


CONCLUSIONS:

kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
This says that the harddrive reported an error and that the command was aborted ("Command aborted"). As far as I have found, such errors once in a while mean nothing serious. As long as there are no "Uncorrectable ECC error"s or other grave errors like a checksum error, a bad address mark etc., there should be nothing to worry about. Such aborted commands occur, for example, when a sector is requested that does not exist on the harddisk, or when a buggy driver sends a command that the drive does not understand.
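To check a log for bytes other than the harmless pair discussed here, the hexadecimal values first have to be pulled out of the log lines. A minimal sketch, assuming the line format shown above ("status=0x51", "error=0x04"); parse_hex_field is a hypothetical helper name, not part of any existing tool.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find key (for example "status=") in a log line
 * and parse the hexadecimal byte that follows it.
 * Returns 1 and stores the byte in *value on success,
 * 0 if the key is missing or no valid byte follows it.
 */
int parse_hex_field(const char *line, const char *key, unsigned int *value)
{
    const char *p = strstr(line, key);
    char *end;
    unsigned long v;

    if (p == NULL)
        return 0;
    p += strlen(key);
    v = strtoul(p, &end, 16); /* base 16 accepts an optional "0x" prefix */
    if (end == p || v > 0xFF)
        return 0;
    *value = (unsigned int)v;
    return 1;
}
```

For the two lines above, parse_hex_field(line, "status=", &v) stores 0x51 and parse_hex_field(line, "error=", &v) stores 0x04. The extracted bytes can then be compared against the bit definitions from hdreg.h.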

Further evidence that the DriveStatusError (command aborted) is harmless: the SmartMonTools (see Linux Harddisk Monitoring with SmartMonTools (smartctl)) don't report a non-zero RAW_VALUE for Reallocated_Sector_Ct, Seek_Error_Rate, Reallocated_Event_Count, Offline_Uncorrectable, UDMA_CRC_Error_Count, Multi_Zone_Error_Rate or Hardware_ECC_Recovered etc. So there were no serious errors on the harddrive; the command was simply not executed or not understood by the disk.
Maybe the firmware on the harddrive is buggy.

If I'm in error somewhere, please let me know! It is for the benefit of every Linux user if this page is as accurate as possible. Thanks a lot in advance!

And last but not least, don't blame me if your harddrive dies!

Also see: Linux Harddisk Monitoring with SmartMonTools (smartctl)

Last-Modified: Fri, 31 Mar 2006 22:15:32 GMT

© 1996-2010 . All rights reserved.
No reproduction, distribution, publishing or transmission of the copyrighted materials at this site is permitted. Policy