158

My SATA drive started clicking and I was unable to access the data. It was not clicking loudly though, like a drive that has already gone bad. After tightening the connections to the hard drive, it stopped clicking and I was able to access the data again. I have started to move files off of the drive, but I think this drive might still be in good health. I didn't find any data corruption and I haven't had any trouble accessing any files. I have never had a SATA drive fail before, so I'm thinking that it could have just been the loose connections that were causing the problem. What tests can I run on this drive to find out how healthy it is?

This is the hard drive in question: HITACHI Deskstar T7K250 HDT722525DLA380 (0A31636) 250GB 7200 RPM 8MB Cache SATA 3.0Gb/s 3.5" Hard Drive -Bare Drive

Mokubai
tony_sid
  • 1
    Oh, when I answered you hadn't mentioned that it was a deathstar. At least some of the Deskstar line has a very bad reputation for longevity and reliability. Bad enough that the failing drives are termed "deathstar". – Slartibartfast Aug 04 '10 at 02:02

13 Answers

151
sudo smartctl -a /dev/sda | less

This will give you an abundance of information about your hard drive's health. The tool also permits you to start and monitor self tests of the drive.

If you want to do benchmarks / check all of the sectors to find one that is bad, you can find other tools for that, but smartctl is the first place to go for drive health status.
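As a rough sketch of what to look for in that output: the attributes most often associated with failing media are the reallocated, pending and offline-uncorrectable sector counts, while the UDMA CRC error count tends to point at cabling rather than the disk itself, which may matter if a loose connector is suspected. The helper name `smart_key_attrs` is made up for illustration, `/dev/sda` is just the device from the command above, and the column positions assume the usual ATA attribute table that `smartctl -A` prints.

```shell
# Hypothetical helper: filter the attribute table printed by `smartctl -A`
# down to a few attributes and print each name with its raw value.
# IDs: 5 Reallocated_Sector_Ct, 197 Current_Pending_Sector,
#      198 Offline_Uncorrectable, 199 UDMA_CRC_Error_Count
# Assumes the usual layout where field 10 is RAW_VALUE.
smart_key_attrs() {
    awk '$1 ~ /^(5|197|198|199)$/ && NF >= 10 { print $2, $10 }'
}

# Typical use (needs root and a real drive, so shown as comments):
#   sudo smartctl -H /dev/sda            # overall health self-assessment
#   sudo smartctl -A /dev/sda | smart_key_attrs
#   sudo smartctl -t short /dev/sda      # start a short self-test
#   sudo smartctl -l selftest /dev/sda   # read the self-test log afterwards
```

Non-zero raw values for 5, 197 or 198 are usually a reason to keep copying data off the drive; a growing 199 count would be more consistent with a cable or connector problem.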

Slartibartfast
  • 7
    And Palimpsest (aka gnome-disk-utility) is a slick GUI app that gives the same info. – Marius Gedminas Aug 03 '10 at 09:28
  • 2
    palimpsest is notorious for often giving false positives. – vtest Sep 04 '10 at 03:41
  • 8
    @vtest citation required – mgalgs Nov 05 '14 at 18:32
  • 72
    For anyone who finds they don't have `smartctl`: it's probably under your package manager as "smartmontools". – Praxeolitic Jun 28 '15 at 17:56
  • 1
    On my HP server I had to run `smartctl -a -d cciss,0 /dev/cciss/c0d0p1 | less` with RAID – trail_runner Feb 22 '16 at 22:25
  • 7
    `sudo apt-get install smartmontools` on Ubuntu 14 – mrgloom Oct 21 '17 at 13:06
  • 2
    But what output from this tool to look at? `Vendor Specific SMART Attributes with Thresholds`? In my case some of them are `Pre-fail` and some `Old_age`, so as I understand overall disk health can be considered as bad. – mrgloom Oct 21 '17 at 13:08
  • If you're using a USB adapter, check https://www.smartmontools.org/wiki/Supported_USB-Devices for the device type. – timelmer Apr 12 '18 at 01:59
  • Can I run smartctl on a disk with mounted partitions? – Josir Aug 14 '18 at 21:22
  • 1
    @Josir: Largely, yes. Most smartctl commands happen at a low level, and are passive to the normal function of the device. The man page should clearly identify the exceptions to this, so you should read it. – Slartibartfast Aug 17 '18 at 05:06
  • 1
    If your device appears greyed out in Gnome disk utility and `smartctl -a ` reports `Unknown USB bridge [some memory or what hex]` try to manually go through the list of device types (`man smartctl` -> `/ --device`). For mine it was `--device=sat --all /dev/sdc`. Also check [this ticket](https://www.smartmontools.org/ticket/280) – KeyWeeUsr Jul 02 '19 at 11:02
  • I wish there was something more compact for us non-experts. I have 6 disks and just want to know relative to each other which one should I replace soonest. I'll have to look for something else because I can't find a discriminant field in this output. – Sridhar Sarnobat Sep 02 '22 at 04:45
  • 1
    @SridharSarnobat, I would suggest, if you can't find any other indicator, Power On Hours. If one runs hotter, I'd swap that one out first. Or the one that has more errors/power-on-hours, or ... Absent a reason to swap out a drive, I'd tend to avoid the hassle. – Slartibartfast Sep 08 '22 at 12:03
  • Many platter drives fail because of a mechanical malfunction (most of mine did). The clicking sound usually comes from misalignment of the heads. So I would suggest a new statistic that can somehow detect the misalignment, or even a way to realign the heads using software. I found this (a tool to realign HDD heads): https://www.secureindia.in/?page_id=441 – Estatistics Aug 13 '23 at 11:33
80

badblocks is one more useful utility; it shows the number and location of bad blocks on your drive. Below is an example that also shows the progress of the ongoing scan:

sudo badblocks -v /dev/sda -s
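A possible variation, sketched under the assumption that you want to keep the results: `-o` writes the list of bad block numbers to a file, one per line, and the default mode is a read-only test. The helper `summarize_badblocks` and the file name `bad.txt` are made up for illustration.

```shell
# Hypothetical helper: summarise a list of bad block numbers (one per line),
# which is the format `badblocks -o FILE` writes.
summarize_badblocks() {
    sort -n "$1" | awk '
        NR == 1 { first = $1 }
        { last = $1 }
        END {
            if (NR == 0) print "no bad blocks listed"
            else printf "%d bad block(s), from %s to %s\n", NR, first, last
        }'
}

# Typical use (read-only scan; needs root and can take many hours):
#   sudo badblocks -sv -o bad.txt /dev/sda
#   summarize_badblocks bad.txt
```

An empty list after a full pass is reassuring but not proof of health; it is worth comparing against the SMART attributes afterwards.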
Erkko
mi988
  • 2
    what is the link with a possibly hardware failing hard drive? – tuk0z Oct 05 '15 at 01:56
  • 4
    @lliseil Question is *How to check the health of a hard drive* – Emmanuel Mar 04 '16 at 12:21
  • `pacman -S e2fsprogs` on arch – oddRaven Oct 01 '17 at 21:41
  • @Emmanuel this checks the health of a hard drive... SMART only passively reports, badblocks checks all sectors and bad ones will then show up on smart. Perfect for detecting new hard disks likely to prematurely die. – Ray Foss Feb 13 '18 at 18:50
  • @RayFoss Can `badblocks` be used to tell when the disk must be replaced before a failure occurs? – Emmanuel Feb 16 '18 at 18:15
  • 5
    @Emmanuel Yes... but at the cost of increased wear. For example, Seagate Surveillance drives are rated for around 180TB/year. Doing badblocks on a 10TB one will transfer 80TB of data. It really makes sense to do it before you start using it. If a block is particularly bad there is a good chance running badblocks in read only mode will trip the badblock and it'll get reported on smart... Also, badblocks takes ~96 hours to run on a WD Red 8TB, which is kind of annoying, especially if you lose power and aren't sure where you left off. – Ray Foss Feb 16 '18 at 18:55
  • 2
    so `badblocks -v` seems to report bad block "numbers", one per line, e.g. `37754169`, `37754170` ... . Knowing the bad blocks, what can then be done? – Abdull Oct 08 '22 at 10:42
  • @Abdull my new understanding is that you take the output of badblocks and feed it to a filesystem check to mark the blocks as bad for future reference. In the man page of `badblocks` it suggests that if you're going to send the output to `e2fsck` or `mke2fs`, you might use those programs with the `-c` switch instead. There is also `ddrescue` to attempt to recover those blocks. – Johann Jul 26 '23 at 04:32
19

I see that no one has mentioned gsmartcontrol which is a GUI.

In Ubuntu you can install it with $ sudo apt-get install gsmartcontrol

If you launch sudo gsmartcontrol you see all the hard drives in your computer.

Then if you right click on a device and click View Details you see something like this.

You can get a lot of details in the different tabs here. You can also perform tests in the Perform Tests tab.

GSmartControl

Dan Dascalescu
user3620828
17

If a HD starts to give you physical hints about an upcoming failure, no software will help. Yes, SMART exists and things like smartctl can read its results for you, but you shouldn't bet on it. SMART can be useful for detecting things like high temperatures or bad sectors, but if your HD starts to click or does not start up during the first try, it's time to

  • make sure you have backups
  • rush to nearest computer dealer, buy a new HD and copy everything there

When an HD decides to fail, it will do so without warning, and Murphy's law says that the failure will happen at the worst possible moment. So be prepared: back up and replace the disk NOW rather than waiting for the catastrophe.

Janne Pikkarainen
5

Try SpinRite (it isn't free). I have used many, many tools, and most of them do more harm than good; by harm I mean not taking good care of your information. SpinRite checks your drive and repairs bad sectors while moving your information to safe sectors. It also works as a preventive measure against hard disk catastrophes.

I strongly suggest paying for a thoroughly tested product with a good track record rather than risking the loss of your valuable information.

Mario
4

Test environment: a persistent live Ubuntu 16.04 USB, made following the thread How to Make Persistent Live Ubuntu of 16.04?

Connect your HDD to your computer and boot into the live Ubuntu.

The GUI program gnome-disks also shows bad sectors and lets you benchmark disks and their different sectors. It presents much the same information as sudo smartctl -a ... from smartmontools. Example output from benchmarking my 500 GB disk, where you can see the read/write speed degrading over time under heavy load:


Other view: SMART Data & Self-Tests, where I ran a short self-test. You can find the temperature of the drive and how many years/months/days it has been powered on.


Léo Léopold Hertz 준영
  • Any idea why the "Smart Data and Self-Tests..." menu is disabled in `sudo gnome-disks` for disks that do have SMART (as shown by [`gsmartcontrol`](https://superuser.com/questions/171195/how-to-check-the-health-of-a-hard-drive/1304190#1304190))? – Dan Dascalescu Feb 17 '19 at 08:29
4

Output of smartctl is hard to read for me. gnome-disks pulls in GNOME which nowadays cannot live without NetworkManager.

I found skdump (part of libatasmart), whose output I am able to understand. It also produces "Pretty" and "Good" columns alongside the overall status:

Bad Sectors: 0 sectors
Powered On: 7.4 years
Power Cycles: 2144
Average Powered On Per Power Cycle: 1.3 days
Temperature: 33.0 C
Attribute Parsing Verification: Good
Overall Status: GOOD
ID# Name                        Value Worst Thres Pretty      Raw            Type    Updates Good Good/Past
  1 raw-read-error-rate         100    91    51   36          0x240000000000 prefail online  yes  yes 
  3 spin-up-time                 76    76    11   8.0 s       0x181f00000000 prefail online  yes  yes 
  4 start-stop-count             98    98     0   2173        0x7d0800000000 old-age online  n/a  n/a 
  5 reallocated-sector-count    100   100    10   0 sectors   0x000000000000 prefail online  yes  yes 
  7 seek-error-rate             100   100    51   0           0x000000000000 prefail online  yes  yes 
  8 seek-time-performance       100   100    15   n/a         0x072700000000 prefail offline yes  yes 
  9 power-on-hours               87    87     0   7.4 years   0xd1fd00000000 old-age online  n/a  n/a 
 10 spin-retry-count            100   100    51   0           0x000000000000 prefail online  yes  yes 
 11 calibration-retry-count     100   100     0   0           0x000000000000 old-age online  n/a  n/a 
 12 power-cycle-count            98    98     0   2144        0x600800000000 old-age online  n/a  n/a 
 13 read-soft-error-rate        100    91     0   36          0x240000000000 old-age online  n/a  n/a 
183 runtime-bad-block-total     100   100     0   0           0x000000000000 old-age online  n/a  n/a 
184 end-to-end-error            100   100     0   0           0x000000000000 prefail online  n/a  n/a 
187 reported-uncorrect          100   100     0   2540 sectors 0xec0900000000 old-age online  n/a  n/a 
188 command-timeout             100   100     0   0           0x000000000000 old-age online  n/a  n/a 
190 airflow-temperature-celsius  67    53     0   33.0 C      0x21000f210000 old-age online  n/a  n/a 
194 temperature-celsius-2        67    52     0   33.0 C      0x21000f220000 old-age online  n/a  n/a 
195 hardware-ecc-recovered      100   100     0   47099       0xfbb700000000 old-age online  n/a  n/a 
196 reallocated-event-count     100   100     0   0           0x000000000000 old-age online  n/a  n/a 
197 current-pending-sector      100   100     0   0 sectors   0x000000000000 old-age online  n/a  n/a 
198 offline-uncorrectable       100   100     0   0 sectors   0x000000000000 old-age offline n/a  n/a 
199 udma-crc-error-count        100   100     0   0           0x000000000000 old-age online  n/a  n/a 
200 multi-zone-error-rate       100   100     0   0           0x000000000000 old-age online  n/a  n/a 
201 soft-read-error-rate        100   100     0   0           0x000000000000 old-age online  n/a  n/a 

It states "GOOD" though (Samsung HD103UJ). In the output of smartctl I saw a log with errors, and you can see them under 187 (reported uncorrectable errors), which indicates how much data I really lost. Seeing 5 (reallocated sectors) at 0 is a bit unexpected to me.
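If you want something more compact from that table, one rough sketch is to print, for each pre-fail attribute, how far the normalised value sits above its threshold (an attribute counts as failed once the value drops to the threshold or below). The helper name `skdump_prefail_margin` is made up, the column positions are taken from the output above, and since normalised values are vendor-defined the margins are only a crude relative signal.

```shell
# Hypothetical helper: for each pre-fail attribute in skdump's table,
# print the name and the margin between normalised value and threshold.
# Columns used: $1 = ID, $2 = name, $3 = value, $5 = threshold.
skdump_prefail_margin() {
    awk '$1 ~ /^[0-9]+$/ && / prefail / { print $2, $3 - $5 }'
}

# Typical use (needs root and a real drive), smallest margin first:
#   sudo skdump /dev/sda | skdump_prefail_margin | sort -k2,2n
```

As the output above shows, a comfortable margin does not rule out lost data, so treat this as one input among several.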

ony
4

Besides the already mentioned SMART status, it might be important to mention that modern HDDs tend not to fail gracefully. Often, from one day to the next, you only hear a clicking sound or can't access the disk at all. So while your problem could also be caused by a loose cable, always be prepared by keeping regular backups on a different disk.

Alexander
1

HDDScan is a very handy utility for scanning HDDs. It will most likely show any errors. However, you should also try vendor-specific tools. (If you tell me your HDD's manufacturer and model, I can link them here.)

Apache
  • 2
    Posted above. HDDScan looks like a good tool, but is there something like that for Linux? – tony_sid Aug 02 '10 at 23:56
  • Well.. You didn't add Linux tag, nor what kind of architecture, which package based, etc. You can scan your harddrive with "e2fsck". Try typing "man fsck" / "man e2fsck" or "e2fsck --help" into the console and you'll see how to use it. – Apache Aug 03 '10 at 06:02
  • 3
    e2fsck stands for *filesystem* check. – tuk0z Oct 05 '15 at 01:53
1

http://en.wikipedia.org/wiki/S.M.A.R.T.

S.M.A.R.T. is a set standard for what you're describing. There are various applications out there to get the information from the HDD.

My favorite (and free) choice is SpeedFan.

Nitrodist
0

If the question is:

Which software will warn me when my drive is about to fail?

The answer is none in most cases. Most drives break within a very short period of time, and neither SMART nor any other software catches them in time.

Also, even if they report the error, the data in a bad sector can be unrecoverable.

So the real solution to data loss is backups. I really like Syncthing for that, with qsyncthingtray, as it makes perfect clones on all devices on the fly.

0

The free version of HDTune can check HDD health.

Qwerty
-7

You are on Linux but you can attach your HDD to a friend's computer running Windows.

You don't need any complicated software to check HDD health. Use Crystal Disk Info for Windows to check if your HDD is in good condition or if there is any damage.

It will also show the S.M.A.R.T data with an indicator beside each value so if you find a red indicator then there is a problem with your hard drive.

slhck
  • 35
    You *do* realise there's good linux native SMART software right? – Journeyman Geek Jun 06 '12 at 12:25
  • I have also used this utility, and have not found it to surface test the drive / search for bad blocks/sectors, even after looking through its Advanced Functions. While talking about Windows though, and just to throw more terms on to the page that can be quickly searched, I have used MiniTool Partition Wizard Free to surface test. I don't think HDDRegenerator has this feature, and only reads S.M.A.R.T. data like CDI. – Pysis Mar 15 '18 at 14:03