SMART data & Self tests, not sure if my SSD is on its last gasp
Bruce Labitt
bruce.labitt at myfairpoint.net
Sat Jan 2 22:14:08 EST 2021
I think it's a driver issue. I looked in journalctl and there are some
errors indicated. One is a video issue; another is some sort of
permissions issue for a user who isn't me. The permissions issue is with
tracker-miner, which I find highly annoying. I'm not quite sure how
to disable it cleanly with low system impact.
Last fsck was 3 months ago. Next one is due in 3 months. So it wasn't
an overdue fsck... So I'm not so sure it's disk related at all.
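On reading the smartctl table quoted further down: as far as I understand smartmontools, "Pre-fail" in the TYPE column is the attribute's category (one whose failure would predict imminent loss), not the drive's current state. An attribute has only tripped when its normalized VALUE has dropped to or below THRESH, or when WHEN_FAILED shows something other than "-". A minimal Python sketch of that check, using four rows copied from the table; the helper names parse_attr and is_failing are made up for illustration:

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int       # normalized value, higher is healthier
    worst: int
    thresh: int
    attr_type: str   # "Pre-fail" or "Old_age": a category, not a status
    when_failed: str
    raw: str


def parse_attr(line: str) -> SmartAttr:
    """Parse one row of `smartctl -A` output (hypothetical helper)."""
    # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW...
    f = line.split(None, 9)
    return SmartAttr(int(f[0]), f[1], int(f[3]), int(f[4]), int(f[5]),
                     f[6], f[8], f[9].strip())


def is_failing(a: SmartAttr) -> bool:
    """True when the normalized VALUE is at or below THRESH."""
    return a.value <= a.thresh


# Four rows copied from the table quoted in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033 100 100 010 Pre-fail Always - 0
173 Wear_Leveling_Count     0x0033 098 098 010 Pre-fail Always - 66
181 Non4k_Aligned_Access    0x0022 100 100 001 Old_age  Always - 10250 5047 5203
202 Perc_Rated_Life_Used    0x0018 098 098 001 Old_age  Offline - 2
"""

attrs = [parse_attr(l) for l in SAMPLE.splitlines() if l.strip()]
failing = [a.name for a in attrs if is_failing(a)]

for a in attrs:
    margin = a.value - a.thresh
    print(f"{a.name:24} {a.attr_type:9} value={a.value:3} "
          f"thresh={a.thresh:3} margin={margin:3}")
print("failing:", failing or "none")
```

For these rows nothing is at or below threshold, which agrees with the overall PASSED result.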
I have contacted System76 and sent them logs. If I recall correctly, the
issue seems to be closely related to a driver change (issued by
System76). Of course, they are still on break...
Nonetheless, waiting 8-10 minutes for boot is awful. I don't even think
my first IBM PC was that slow, even with a boot from floppy disk.
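A rough sketch of the timestamp comparison Ronald suggests below, assuming the boot log has been dumped with `journalctl -b -o short-iso` so each line starts with an ISO timestamp. The function name largest_gaps is made up, and the sample lines are invented placeholders, not entries from my logs:

```python
from datetime import datetime


def largest_gaps(lines, top=3):
    """Return the `top` biggest time gaps between consecutive log lines.

    Each result is (gap_seconds, text_before_gap, text_after_gap).
    Hypothetical helper; expects lines like
    '2021-01-02T21:40:01-0500 host unit[pid]: message'.
    """
    stamped = []
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue
        try:
            ts = datetime.strptime(parts[0], "%Y-%m-%dT%H:%M:%S%z")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        stamped.append((ts, parts[1].rstrip()))
    gaps = [((b[0] - a[0]).total_seconds(), a[1], b[1])
            for a, b in zip(stamped, stamped[1:])]
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]


# Invented sample lines for illustration only.
SAMPLE = [
    "2021-01-02T21:40:01-0500 host kernel: example message A",
    "2021-01-02T21:40:03-0500 host systemd[1]: example message B",
    "2021-01-02T21:47:33-0500 host systemd[1]: example message C",
    "2021-01-02T21:47:34-0500 host gdm[900]: example message D",
]

for seconds, before, after in largest_gaps(SAMPLE):
    print(f"{seconds:8.0f}s  after: {before}")
    print(f"{'':9}  next:  {after}")
```

Whatever sits on either side of the biggest gap is the first place to look. Running the same thing against an older, normal boot (`journalctl -b -1 -o short-iso`, if the journal is persistent) gives the baseline to compare against, and `systemd-analyze blame` gives a per-unit view of the same question.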
On 1/2/21 9:15 PM, r270 at mrt4.com wrote:
> Examine the time stamps on the syslog and compare them to previous nominal boots. That should indicate where the issue is. If all log entries indicate long delays, then it is something systemic like memory, storage, CPU, a thermal issue, etc. (Note: A systemic issue is not necessarily a hardware fault because a HW device can be incorrectly configured when it is initialized.)
>
> If it was a one-time occurrence then it was most likely an overdue fsck, but syslog will indicate that if that's the case.
>
> Ronald Smith
>
> --------------------------
>
> On Wed, 30 Dec 2020 14:04:43 -0500
> Bruce Labitt <bruce.labitt at myfairpoint.net> wrote:
>
>> I think I have an SSD on the way out. Last reboot took a REALLY long
>> time. Like 30 minutes. I ran the SMART data and self test and the SSD
>> passes. Overall assessment is that the disk is OK. I really don't know
>> how to interpret the results.
>>
>> I think the disk is in pre-fail based on the smartctl output below.
>>
>> /snip
>>
>> === START OF INFORMATION SECTION ===
>> Model Family: Crucial/Micron RealSSD m4/C400/P400
>> Device Model: M4-CT256M4SSD2
>> Serial Number: 000000001247091DC2FF
>> LU WWN Device Id: 5 00a075 1091dc2ff
>> Firmware Version: 040H
>> User Capacity: 256,060,514,304 bytes [256 GB]
>> Sector Size: 512 bytes logical/physical
>> Rotation Rate: Solid State Device
>> Form Factor: 2.5 inches
>> Device is: In smartctl database [for details use: -P show]
>> ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 6
>> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>> Local Time is: Wed Dec 30 13:49:17 2020 EST
>> SMART support is: Available - device has SMART capability.
>> SMART support is: Enabled
>>
>> === START OF READ SMART DATA SECTION ===
>> SMART overall-health self-assessment test result: PASSED
>>
>> /snip
>>
>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>>   1 Raw_Read_Error_Rate     0x002f   100   100   050    Pre-fail  Always   -           0
>>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always   -           0
>>   9 Power_On_Hours          0x0032   100   100   001    Old_age   Always   -           7294
>>  12 Power_Cycle_Count       0x0032   100   100   001    Old_age   Always   -           2511
>> 170 Grown_Failing_Block_Ct  0x0033   100   100   010    Pre-fail  Always   -           0
>> 171 Program_Fail_Count      0x0032   100   100   001    Old_age   Always   -           0
>> 172 Erase_Fail_Count        0x0032   100   100   001    Old_age   Always   -           0
>> 173 Wear_Leveling_Count     0x0033   098   098   010    Pre-fail  Always   -           66
>> 174 Unexpect_Power_Loss_Ct  0x0032   100   100   001    Old_age   Always   -           87
>> 181 Non4k_Aligned_Access    0x0022   100   100   001    Old_age   Always   -           10250 5047 5203
>> 183 SATA_Iface_Downshift    0x0032   100   100   001    Old_age   Always   -           0
>> 184 End-to-End_Error        0x0033   100   100   050    Pre-fail  Always   -           0
>> 187 Reported_Uncorrect      0x0032   100   100   001    Old_age   Always   -           0
>> 188 Command_Timeout         0x0032   100   100   001    Old_age   Always   -           0
>> 189 Factory_Bad_Block_Ct    0x000e   100   100   001    Old_age   Always   -           81
>> 194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always   -           0
>> 195 Hardware_ECC_Recovered  0x003a   100   100   001    Old_age   Always   -           0
>> 196 Reallocated_Event_Count 0x0032   100   100   001    Old_age   Always   -           0
>> 197 Current_Pending_Sector  0x0032   100   100   001    Old_age   Always   -           0
>> 198 Offline_Uncorrectable   0x0030   100   100   001    Old_age   Offline  -           0
>> 199 UDMA_CRC_Error_Count    0x0032   100   100   001    Old_age   Always   -           0
>> 202 Perc_Rated_Life_Used    0x0018   098   098   001    Old_age   Offline  -           2
>> 206 Write_Error_Rate        0x000e   100   100   001    Old_age   Always   -           0
>>
>> Replace the disk pronto? Is that what this is telling me? Or?
>>
>> I recently copied over many important files to another disk. And
>> downloaded a new OS. I just hate re-configuring things, and starting
>> from scratch, it's such a pain. Not as painful as a disk crash, but
>> close. I've got loads of stuff I've compiled from source and just hundreds
>> of things to check or update. Yes, I'll just have to do it. It's just
>> the week plus of recovery that I'm rebelling against.
>>
>> Anything else I should do first? Check something? Run a test? Any tips
>> to make the "recovery" less painful?