
I have Samsung SSDs on my own laptop and on some servers.

When I do:

smartctl -a /dev/sda | grep 177

I get results that I cannot understand. Here are some examples:

# my laptop Samsung SSD 850 EVO 500GB (new)
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
177 Wear_Leveling_Count     0x0013   100   100   000    Pre-fail  Always       -       0

# server 256 GB, SAMSUNG MZ7TE256HMHP-00000
177 Wear_Leveling_Count     0x0013   095   095   000    Pre-fail  Always       -       95

# server 512 GB, SAMSUNG MZ7TE512HMHP-00000 (1 year old)
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       99

# server 512 GB, SAMSUNG MZ7TE512HMHP-00000 (supposed to be new)
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       99

# server 480 GB, SAMSUNG MZ7KM480HAHP-0E005
177 Wear_Leveling_Count     0x0013   099   099   005    Pre-fail  Always       -       3

# server 240 GB, SAMSUNG MZ7KM240HAGR-0E005
177 Wear_Leveling_Count     0x0013   099   099   005    Pre-fail  Always       -       11

Any idea how to read Wear_Leveling_Count?

Some values are at the minimum, some are at the maximum.

If I consider the "laptop" Samsung SSD 850 EVO 500GB, it is at 0 and will probably go up to 100, then fail.

If I consider the first "server", the 256 GB SAMSUNG MZ7TE256HMHP-00000, is it already at the maximum? Will it go down to zero?

Nick

4 Answers


Kingston describe this SMART attribute as follows:

Number of erase/program cycles per block on average. This attribute is intended to be an indicator of imminent wear-out. Normalized Equation: 100 – ( 100 * Average Erase Count / NAND max rated number of erase cycles)
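
As a worked example with made-up numbers (the rated cycle count here is an assumption, since vendors rarely publish it): for NAND rated at 2,000 P/E cycles and an average erase count of 100 per block, the normalized value would be 100 - (100 * 100 / 2000) = 95:

awk 'BEGIN { print 100 - (100 * 100 / 2000) }'    # prints 95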

Ignore the raw data in these instances (manufacturers can define it to mean different things), and look at the Current Value column.

This source from Anandtech gives us a good indication of how to use this figure:

The Wear Leveling Count (WLC) SMART value gives us all the data we need. The current value stands for the remaining endurance of the drive in percentage, meaning that it starts from 100 and decreases linearly as the drive is written to. The raw WLC value counts the consumed P/E cycles, so if these two values are monitored while writing to the drive, sooner than later we will find the spot where the normalized value drops by one.

All of your drives are between 95 and 100, and will eventually drop to 0. This is an estimate of how many program/erase cycles each block can go through before failing, and at the moment one of your drives is estimated to have used 5% of its expected life span. Again, the key word here is estimated.

Note also that your drives may use different NAND technology, hence the differences in perceived life. Some NAND technology expects blocks to last around 1,000 P/E cycles each; other types are rated for as many as 30,000.
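
If you want to watch the attribute move on your own drive, here is a minimal monitoring sketch in the spirit of the Anandtech method (assuming smartctl is installed; /dev/sda and the one-hour interval are placeholders to adjust):

while true; do
    printf '%s ' "$(date '+%F %T')"
    # column 4 is the normalized VALUE, column 10 the RAW_VALUE (P/E cycles)
    sudo smartctl -A /dev/sda | awk '$1 == 177 { print "value=" $4, "raw=" $10 }'
    sleep 3600
done

Logged over enough writes, this shows the point at which the normalized value drops by one, i.e. roughly how many P/E cycles one percent of the rated endurance corresponds to.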

Jonno
  • I attached the table "header". What is the "current" value? Is it the "VALUE" column? – Nick Feb 09 '16 at 17:36
  • @Nick Yes, exactly. – Jonno Feb 09 '16 at 17:42
  • That's the exact opposite of my experience. My new drives (Samsung 850 Pro, Samsung 840 Pro) started at a Raw Value of 0 and went up from there. In fact, my current 840 Pro was at 97 about a month ago, and it's now at 99. (This is from looking at SMART data through the Samsung Magician software.) – Granger Jan 09 '17 at 15:13
  • @Granger Do you have a 'Value' or 'Current' column? Raw values are typically up to the OEM to decide what they do with, and aren't necessarily legible data. Notice in the example the OP provided, the 'VALUE' is 100 and the 'RAW_VALUE' is 0 for their 850 EVO. – Jonno Jan 09 '17 at 16:24
  • Ah. That makes more sense if I completely ignore the "Raw Value" column. – Granger Jan 09 '17 at 18:21
  • So it turns out gnome-disk-utility reports the raw value as "Value" and the value as "Normalized". – Rodney Jul 20 '17 at 08:53
  • I have a two year old Samsung SSD 850 PRO and I have a Wear Leveling Count of `098` on value and `118` on raw value. Is that bad? – casolorz Dec 04 '17 at 16:37
  • @casolorz Far from it, you've used 2% of the anticipated life of your drive. Enjoy another potential 98 years of use ;) (Note that I say that in jest, of course these are just approximations) – Jonno Dec 08 '17 at 23:18
  • On my Samsung SSD 840 EVO 250GB, Wear_Leveling_Count is 43 on a not heavily used SSD after the final firmware update to fix slow speeds. It has definitely sped up the wear. – sdaffa23fdsf Dec 25 '17 at 11:08
  • Samsung SSD 850 EVO 500GB, 11 months of usage, Wear_Leveling_Count - 061. It seems it wears out quite fast. – Tom Raganowicz Oct 09 '18 at 07:27
  • @sdaffa23fdsf the 840 EVO fixed its problem by writing / updating cells if I recall correctly. I'm not surprised that the wear_leveling_count is so poor with the 840 evo. I had that drive for two or three years. I feel your pain. – D-Klotz Apr 26 '19 at 19:54
  • Just an FYI, as we're doing a fair amount of research into this: our SSDs were warrantied to 500 TB written. The wear leveling count went to 0 when we reached that. We are now at around 2.5 PB written and we've still got no bad blocks or reallocations at all. I suspect that this number is pretty arbitrary, and is simply there to make people buy new SSDs earlier than they need to. – Reverend Tim Aug 14 '19 at 08:16
  • @ReverendTim Yes, there's really no way to know for sure, just using fairly meaningless estimated values. Would be interested to see your results as and when you have any if they're being made public. – Jonno Aug 14 '19 at 08:58
  • @Jonno they will indeed. A colleague of mine will be publishing a blog about it once we've blown up all the drives :) – Reverend Tim Aug 15 '19 at 09:54
  • We've got a bunch of old 840 EVO's here that all have a VALUE of 001, but still appear to be working. YMMV. – Mike Andrews Oct 02 '19 at 21:57
  • @Rodney not so: `gnome-disks` reports 177's `Normalized` as `N/A` and `smartctl` reports 177's `VALUE` as `000` on my device under Ubuntu 18.04. "N/A" is NOT "000". – Eugene Gr. Philippov Dec 05 '20 at 12:32
  • @EugeneGr.Philippov just checked 20.04 and what I said is still true, EXCEPT for some fields (including Wear Levelling) where `gnome-disks` reports `N/A` for `Value` (but not for `Normalized`, which still matches what `smartctl` reports as `Value`). What has caused the N/A to appear I don't know, but the KDE partition manager reports the same (N/A). I have changed both my OS and my device since I wrote that original comment. – Rodney Dec 05 '20 at 15:44

SMART reports a PREFAILED condition for my Samsung SM951 (AHCI) 128GB, reported in Linux as SAMSUNG MZHPV128HDGM-00000 (BXW2500Q).

But in my case I think it's a firmware bug in the drive,

  • because the total-bytes-written property is reported as 1.1 TB, while the drive has a specified Total Bytes Written (TBW) endurance of 75 TB! That figure is probably on the (very) safe side, because similar (MLC NAND) drives all reached a multiple of it (600 TB) in a real endurance test,
  • and apart from the wear_level_count warning, no other pre-fail or old-age errors or warnings are reported,
  • while the reallocated-sector-count, which according to that test is a good pre-fail indicator, is still 0.

So my advice would be to examine those values for your drive/system and base your conclusions on them.
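
For a quick look with smartctl, a sketch using the attribute IDs shown elsewhere in this thread (5 = Reallocated_Sector_Ct, 177 = Wear_Leveling_Count, 241 = Total_LBAs_Written; replace /dev/sdc with your block device):

sudo smartctl -A /dev/sdc | awk '$1 == 5 || $1 == 177 || $1 == 241'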

I prefer the low-level utility skdump, which is supplied with libatasmart, the same library that is used by GNOME Disks.

Use the following command, replacing /dev/sdc with the path to your block device:

sudo skdump /dev/sdc

Ronald
    Note that if a row is listed as "PREFAIL" that does not mean the attribute **has** "pre-failed", but rather that it serves as a potential pre-failure metric (so if it is in a poor state, e.g. low normalized VALUE, your drive is likely to fail soon). – Doktor J Apr 30 '20 at 14:05

A short note on Samsung EVO and PRO SSDs:

smartctl -a /dev/sda

smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.9.0-0.bpo.6-amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     Samsung SSD 860 PRO 1TB
Serial Number:    S42NNF0K000000
LU WWN Device Id: 5 002538 e405145c6
Firmware Version: RVM01B6Q
User Capacity:    1,024,209,543,168 bytes [1.02 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   Unknown(0x09fc) (unknown minor revision code: 0x005e)
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Jan  8 11:53:56 2021 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (    0) seconds.
Offline data collection
capabilities:            (0x53) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    (  85) minutes.
SCT capabilities:          (0x003d) SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       14689
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       122
177 Wear_Leveling_Count     0x0013   098   098   000    Pre-fail  Always       -       25
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   067   056   000    Old_age   Always       -       33
195 Hardware_ECC_Recovered  0x001a   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   099   099   000    Old_age   Always       -       23
235 Unknown_Attribute       0x0012   099   099   000    Old_age   Always       -       58
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       29068641040

So the most interesting part, the lifetime indicator, is:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
177 Wear_Leveling_Count     0x0013   098   098   000    Pre-fail  Always       -       25

25 for the RAW value? Does that mean I have depleted 25 percent of the drive's lifetime?

Actually, no. Please see what Samsung wrote:

SMART attribute 177 (Wear Leveling Count)

This attribute represents the number of media program and erase operations (the number of times a block has been erased). This value is directly related to the lifetime of the SSD. The raw value of this attribute shows the total count of P/E Cycles.

This means that on my particular SSD, a VALUE of 98 shows that 98 percent of the lifetime still remains, while the average number of program/erase cycles per block is 25.
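
As a sanity check (the rated cycle count below is purely an assumption; Samsung does not publish it): if this drive's NAND were rated for about 1,250 P/E cycles, the normalized equation quoted in the top answer would give exactly the VALUE shown:

awk 'BEGIN { print 100 - int(100 * 25 / 1250) }'    # prints 98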

One more interesting thing:

241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       29068641040

What is that size in GB? In TB?

Very simple. Take the sector size from the SMART information above:

Sector Size:      512 bytes logical/physical

Total gigabytes or terabytes written:

29068641040 / 2 / 1024 / 1024 ≈ 13,861 GB; 13,861 / 1024 ≈ 13.536 TB

Explanation: divide the LBA count by 2, because 1 KB consists of two 512-byte sectors; then divide by 1024 repeatedly to get MB, GB, and TB.
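
The same arithmetic can be scripted (a sketch; it assumes the 512-byte sectors reported above and, strictly speaking, computes binary terabytes; replace /dev/sda with your device):

sudo smartctl -A /dev/sda | awk '$1 == 241 { printf "%.3f TB written\n", $10 * 512 / 1024 / 1024 / 1024 / 1024 }'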

Hope it helps.

Arunas Bartisius

I have always just scheduled a daily image of my drives; some rigs use Veeam, others StorageCraft. With bare-metal and VM restores (mounts in most cases take under 5 minutes), I have yet to be caught with my pants down.

In addition to that, if you really want to have a plan in place, plan on replacing all drives within 30 days of warranty expiration.

I do respect the math and the desire to know the particulars of how and when drive failure can be monitored or predicted, and I tip my hat to all of you on the technical side figuring out the number-crunching!

Jw P.