Questions tagged [smart]

Self-Monitoring, Analysis and Reporting Technology

SMART is used to monitor a hard drive's state and reliability. It tries to predict failures and warns the user when a disk is degrading.

207 questions
3
votes
1 answer

How to check S.M.A.R.T. HDD Status in CoreOS

How do I check the status of the hard disks (S.M.A.R.T.) in CoreOS? smartd and smartctl are not part of CoreOS. So, following the CoreOS philosophy, smartd would run inside a container as a systemd unit, and smartctl would be used from…
3
votes
1 answer

SMART self tests - worthwhile if already doing weekly RAID checks and smartctl -a output being monitored?

If a server is configured to check the RAID weekly using /usr/sbin/raid-check and the output of smartctl -a is being monitored, is it worthwhile to also have regular SMART short and long self tests be configured to run, or would that be overkill? …
sa289
  • 1,318
  • 2
  • 18
  • 44
3
votes
1 answer

How can a SMART normalised value be lower (worse) than the worst value?

I'm seeing output from smartctl where the VALUE is much less than WORST for some attributes. Does this make sense? What does it mean? Everything I have read indicates that: The raw value (RAW_VALUE in smartctl output) is manufacturer specific but…
Draemon
  • 527
  • 1
  • 5
  • 15
3
votes
1 answer

"failed command: WRITE DMA" but SMART test says hard disk is OK

Recently I started receiving (in dmesg) such errors: [ 1569.944985] ata6.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 [ 1569.944991] ata6.01: BMDMA stat 0x44 [ 1569.944994] ata6.01: failed command: WRITE DMA [ 1569.944998] ata6.01: cmd…
user983447
  • 1,127
  • 1
  • 10
  • 10
3
votes
1 answer

When does a RAID restore redundancy after a broken sector is flagged as defective?

What happens when I flag a sector on an HDD in a RAID setup as defective (GLIST)? Will the data be written to the replacement sector right away, or does this depend on the actual setup/settings (software/hardware RAID)? Example: RAID 5 - 4 Drives -…
Benedikt Haug
  • 111
  • 1
  • 1
  • 5
3
votes
2 answers

Why does a pair of disks under RAID 1 show a large temperature difference?

It appears quite strange to me that the temperature reading derived from SMART for one disk differs from that of its twin in a RAID 1 configuration by as much as 9°C: # smartctl -d scsi -A /dev/sg1 === START OF READ SMART DATA SECTION === Current Drive…
Question Overflow
  • 2,103
  • 7
  • 30
  • 45
3
votes
2 answers

I/O errors but no SMART or ZFS errors

I'm having trouble identifying a problem for a friend of mine. He is running ZFS on Linux with the Debian distribution. We are getting these entries in dmesg: [273044.834151] mpt2sas0: log_info(0x31110d00): originator(PL), code(0x11),…
drsect0r
  • 31
  • 3
3
votes
1 answer

PERC H710p SMART data

I'm trying to read the SMART data from some hard disks attached to a PERC H710p, but neither ESXi nor the iDRAC7 will give me specifics. The iDRAC will display a status for the individual drives but not the actual SMART values. Does anyone know how I…
user207138
  • 31
  • 1
  • 2
3
votes
3 answers

Is there a point where hard drive age (Power On Hours) necessitates replacement?

I have several storage arrays where a significant number of the drives have been powered on between 25,000 - 30,000 hours (2.8 - 3.4 years). These drives have no other issues or errors. What I want to know: is there a point where drive age alone is…
jlehtinen
  • 1,958
  • 2
  • 13
  • 15
3
votes
1 answer

How do I check if smartd and mdadm are running correctly?

I have a RAID system on Debian: Disk /dev/sda: 320.1 GB,... Device Boot Start End Blocks Id System /dev/sda1 * 1 2432 19535008+ fd Linux raid autodetect /dev/sda2 2433 2918 …
rubo77
  • 2,469
  • 4
  • 34
  • 66
3
votes
2 answers

smartctl reports Current Pending Sectors as 2, but a long test doesn't find any errors

The SMART attributes on the drive are: SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1…
thelsdj
  • 830
  • 1
  • 12
  • 25
3
votes
2 answers

Can't get any SMART or temperature data from HDDs

I have a PC with a recently installed Gigabyte GA-X79-UD5 motherboard. I've encountered a weird problem with getting SMART data or temperatures for the HDDs. Every single tool I've tried in Windows 7 just can't get any data (HDTune, AIDA64...). I was suspecting…
Regs
  • 177
  • 1
  • 7
3
votes
1 answer

SMART Raw_Read_Error_Rate

I need expert opinions on a disk. Here is the SMART output of a Seagate "Barracuda 7200.10 family" drive: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 096 089 006 …
user52475
  • 41
  • 1
  • 2
  • 3
3
votes
1 answer

smartctl -t long isn't finishing

I've been running smartctl -t long on a drive for about 2 days now and it seems to be stalled at 10%. The short and conveyance tests both passed. I have to send back 1 of the 2 drives I purchased; I found bad blocks with badblocks (none on this drive, and it's made over…
xenoterracide
  • 1,496
  • 2
  • 13
  • 26
3
votes
2 answers

Health Tests on NVMe

On the servers I have, with HDD or SSD, I have a cron job that periodically runs: /usr/sbin/smartctl --test=short/long /dev/sd1 (for each disk). While the test runs, it just looks at the output of /usr/sbin/smartctl -c /dev/sd1, looping until it no longer…
Nuno
  • 553
  • 2
  • 8
  • 26