
According to Wikipedia:

S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology; often written as SMART) is a monitoring system included in computer hard disk drives (HDDs), solid-state drives (SSDs), and eMMC drives. Its primary function is to detect and report various indicators of drive reliability with the intent of anticipating imminent hardware failures.

Question

Why is S.M.A.R.T. read-only for users, and why don't manufacturers allow us to reset it easily? Why are the error counts only incremented and never reset?

Example

A friend once had a problem with his SATA cable. A tool that reads S.M.A.R.T. showed that the HDD had 'n' cable communication failures. He replaced the cable, opened the tool again, and the same number of failures was still shown. Why isn't it automatically reset? Why did the standard decide that it should work like this? Is there any special reason?

Context

A group of IT friends and I were arguing about why SMART works the way it does. The fact that it is not resettable, or not easily resettable, raised questions as to why, along with criticism that it makes post-repair procedures difficult. Another friend argues that it might be a security measure to prevent old HDDs from being sold as new.

References:

https://qastack.com.br/ubuntu/342976/how-to-reset-smart-results

https://forum.avast.com/index.php?topic=62124.0

https://forum.hddguru.com/viewtopic.php?t=36754

https://en.wikipedia.org/wiki/S.M.A.R.T.

Passerby
    This is off topic, but it's probably for the same reason you can't push a button and reset the odometer in a used car. I once worked at a school that had contracted with a company to maintain the school computers. There were several times they installed hard drives that SMART said were bad. If SMART were user writable, we wouldn't have been able to see that the drives were used and bad. When SMART tells you the drive is bad, it means it. Replace it. Do not try to "fix" it and hope it'll be OK. It won't. It'll eat your data for lunch, then belch in your face. – JRE Jan 24 '22 at 20:30

1 Answer


I doubt the premise you quote: "Its primary function is to detect and report various indicators of drive reliability with the intent of anticipating imminent hardware failures."

My experience is that HDD failures these days occur with no warning. Once the embedded servo is damaged while operating, for example by an unintended jar or by a vertical portable unit tipping over to horizontal with a whack, there is the "click of death".

Margins in high-density 8 TB drives are so small that the physical dimensions are smaller than the smallest transistor junction and smaller than virus particles, which are themselves much smaller than bacteria. The designs are incredibly robust, with safe landing zones or retraction methods, very powerful error-correcting codes (ECC), embedded HEPA filters, and spare sectors that are remapped in use after excessive soft errors. Even so, it is my belief that S.M.A.R.T. error codes serve first for the supplier's traceability of warranty defects and quality control, and second as customer warnings: if or when the OS or third-party software detects and reports the impending doom, an observant user who knows how to be alerted to these parameters learns that it is time to back up or replace the drive.

Although I have extensive HDD qualification test experience from the '80s, my experience over the last 20 years is that SMART is a history lesson in learning to avoid high temperatures (above 45 °C) and physical abuse of portable drives. Even having been exposed to the highest technology in other fields, I don't know of any product as sophisticated as HDDs in physical tolerances, electromagnetic memory, and rapid servo capabilities.
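For the cable example in the question, a lifetime counter is still usable after a repair: note its value once the cable has been replaced and check later whether it has grown. The following is only a minimal sketch of that idea. It assumes the readings are already available as plain dictionaries of attribute name to raw value; the function name and the numbers are hypothetical, and how the values are read from the drive is not shown.

```python
def counter_deltas(baseline, current):
    """Return {attribute: increase} for counters that grew since the baseline."""
    deltas = {}
    for name, old_value in baseline.items():
        new_value = current.get(name)
        # Only report attributes present in both snapshots that increased.
        if new_value is not None and new_value > old_value:
            deltas[name] = new_value - old_value
    return deltas


# Hypothetical raw values noted right after the cable was replaced...
baseline = {"UDMA_CRC_Error_Count": 37, "Reallocated_Sector_Ct": 0}
# ...and again some weeks later (also hypothetical).
later = {"UDMA_CRC_Error_Count": 37, "Reallocated_Sector_Ct": 0}

print(counter_deltas(baseline, later))  # {} means no counter has grown
```

An empty result suggests the repair held, even though the absolute count still shows the old errors. Any non-empty result names the counters that are still climbing.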

Tony Stewart EE75