So, the gist of this sad story: my hard drive SUDDENLY started tormenting me when I tried to install LibreOffice. After the system remounted the partition read-only twice, I began to suspect something was wrong. I looked at dmesg, and there it was! Good grief!
[ 858.617479] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[ 858.617489] ata1.00: irq_stat 0x40000008
[ 858.617497] ata1.00: failed command: READ FPDMA QUEUED
[ 858.617512] ata1.00: cmd 60/00:00:30:e9:78/02:00:2a:00:00/40 tag 0 ncq 262144 in
[ 858.617514] res 41/40:00:89:e9:78/b1:00:2a:00:00/40 Emask 0x409 (media error) <F>
[ 858.617521] ata1.00: status: { DRDY ERR }
[ 858.617526] ata1.00: error: { UNC }
[ 858.621932] ata1.00: configured for UDMA/133
[ 858.621952] ata1: EH complete
[ 861.617427] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[ 861.617438] ata1.00: irq_stat 0x40000008
[ 861.617446] ata1.00: failed command: READ FPDMA QUEUED
[ 861.617461] ata1.00: cmd 60/00:00:30:e9:78/02:00:2a:00:00/40 tag 0 ncq 262144 in
[ 861.617464] res 41/40:00:89:e9:78/b1:00:2a:00:00/40 Emask 0x409 (media error) <F>
[ 861.617471] ata1.00: status: { DRDY ERR }
[ 861.617476] ata1.00: error: { UNC }
[ 861.621883] ata1.00: configured for UDMA/133
[ 861.621902] ata1: EH complete
[ 864.276812] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[ 864.276819] ata1.00: irq_stat 0x40000008
[ 864.276826] ata1.00: failed command: READ FPDMA QUEUED
[ 864.276840] ata1.00: cmd 60/00:00:30:e9:78/02:00:2a:00:00/40 tag 0 ncq 262144 in
[ 864.276843] res 41/40:00:89:e9:78/b1:00:2a:00:00/40 Emask 0x409 (media error) <F>
[ 864.276849] ata1.00: status: { DRDY ERR }
[ 864.276854] ata1.00: error: { UNC }
[ 864.280789] ata1.00: configured for UDMA/133
[ 864.280801] ata1: EH complete
[ 866.967174] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[ 866.967185] ata1.00: irq_stat 0x40000008
[ 866.967193] ata1.00: failed command: READ FPDMA QUEUED
[ 866.967208] ata1.00: cmd 60/00:00:30:e9:78/02:00:2a:00:00/40 tag 0 ncq 262144 in
[ 866.967211] res 41/40:00:89:e9:78/b1:00:2a:00:00/40 Emask 0x409 (media error) <F>
[ 866.967217] ata1.00: status: { DRDY ERR }
[ 866.967222] ata1.00: error: { UNC }
[ 866.971168] ata1.00: configured for UDMA/133
[ 866.971186] ata1: EH complete
[ 870.317106] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[ 870.317116] ata1.00: irq_stat 0x40000008
[ 870.317124] ata1.00: failed command: READ FPDMA QUEUED
[ 870.317139] ata1.00: cmd 60/00:00:30:e9:78/02:00:2a:00:00/40 tag 0 ncq 262144 in
[ 870.317142] res 41/40:00:89:e9:78/b1:00:2a:00:00/40 Emask 0x409 (media error) <F>
[ 870.317149] ata1.00: status: { DRDY ERR }
[ 870.317154] ata1.00: error: { UNC }
[ 870.320871] ata1.00: configured for UDMA/133
[ 870.320889] ata1: EH complete
[ 873.325328] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[ 873.325337] ata1.00: irq_stat 0x40000008
[ 873.325346] ata1.00: failed command: READ FPDMA QUEUED
[ 873.325361] ata1.00: cmd 60/00:00:30:e9:78/02:00:2a:00:00/40 tag 0 ncq 262144 in
[ 873.325364] res 41/40:00:89:e9:78/b1:00:2a:00:00/40 Emask 0x409 (media error) <F>
[ 873.325371] ata1.00: status: { DRDY ERR }
[ 873.325376] ata1.00: error: { UNC }
[ 873.328743] ata1.00: configured for UDMA/133
[ 873.328799] sd 0:0:0:0: [sda] Unhandled sense code
[ 873.328805] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 873.328814] sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
[ 873.328825] Descriptor sense data with sense descriptors (in hex):
[ 873.328831] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 873.328851] 2a 78 e9 89
[ 873.328860] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
[ 873.328872] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 2a 78 e9 30 00 02 00 00
[ 873.328891] end_request: I/O error, dev sda, sector 712567177
[ 873.328901] Buffer I/O error on device sda, logical block 89070897
[ 873.328909] Buffer I/O error on device sda, logical block 89070898
[ 873.328917] Buffer I/O error on device sda, logical block 89070899
[ 873.328924] Buffer I/O error on device sda, logical block 89070900
[ 873.328932] Buffer I/O error on device sda, logical block 89070901
[ 873.328940] Buffer I/O error on device sda, logical block 89070902
[ 873.328947] Buffer I/O error on device sda, logical block 89070903
[ 873.328955] Buffer I/O error on device sda, logical block 89070904
[ 873.328963] Buffer I/O error on device sda, logical block 89070905
[ 873.328970] Buffer I/O error on device sda, logical block 89070906
[ 873.329075] ata1: EH complete
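The failing address can be read straight from the log. Below is a minimal Python sketch, assuming the libata field order for the res line is status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, and assuming the "logical block" in the Buffer I/O error lines counts 4096-byte blocks (both assumptions agree with the numbers in this log). The function names are illustrative and not part of any tool.

```python
def lba_from_res(res: str) -> int:
    """Decode the 48-bit LBA from the hex fields of a libata 'res' line."""
    _status, low, high, _device = res.split("/")
    # low group:  error, sector count, then LBA bits 0-23
    _error, _nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    # high group: hob feature, hob sector count, then LBA bits 24-47
    _hfeat, _hnsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":")
    )
    return (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )


def buffer_block(sector: int, sector_size: int = 512, block_size: int = 4096) -> int:
    """Map a 512-byte sector number to the block number used in 'Buffer I/O error'."""
    return sector * sector_size // block_size


lba = lba_from_res("41/40:00:89:e9:78/b1:00:2a:00:00/40")
print(lba, buffer_block(lba))
```

For the res line above this yields the same sector that end_request reports, and the first logical block from the Buffer I/O error lines.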
I think you get the general idea: buy a new hard drive and copy the partitions over from the old one.
But! The other problem was that some partitions ALREADY refused to mount because of filesystem corruption. I thought it over and decided to try patching things up somehow, especially since a new hard drive is not yet on my list of priority purchases.
Google pointed me to a couple of good solutions, which I promptly put to use.
Bad block HOWTO for smartmontools
An excellent article that helped me a great deal. For those who are not comfortable with English, I can adapt the article; if you want that, leave a note in the comments.
So, the debriefing, or what I actually did.
Fortunately, I had a copy of the disk's partition table lying around, produced by fdisk -ul. In fact I had 10 partitions, but the three that are not listed here were not as important as sda3+sda4 (reserved for FreeBSD/DragonFlyBSD), sda1 (the boot partition, obviously!), sda7+sda5 (Linux partitions), and sda6 (anime, music, and other junk).
The first step was to test which partitions had died and which were still alive. To my joy, sda1 and sda6 had survived, but more on them later. All the others REFUSED to mount, and fsck exited with an error.
I ran smartctl -t long /dev/sda and left for two hours. When I came back, this is what smartctl -l selftest /dev/sda showed:
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 8962 712567177
# 2 Short offline Completed without error 00% 5188 -
# 3 Extended offline Aborted by host 90% 5188 -
As you can see, the first error showed up at LBA 712567177.
Run badblocks:
badblocks -s -v -b 512 /dev/sda 712567277 712567077
Here -s shows progress, -v increases verbosity, and -b 512 sets the block size (512 bytes in this case). Then comes the LAST block and only after it the FIRST block; I picked a window of ±100 blocks around the failing one.
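The argument order is easy to get wrong, so here is a small Python sketch that builds the same command line from the failing LBA. It only assembles the argument list and does not run anything; the helper name and its parameters are hypothetical, and the ±100 margin is simply the value chosen in the text.

```python
def badblocks_args(device, bad_lba, margin=100, block_size=512, destructive=False):
    """Build a badblocks argument list for a window around one failing block."""
    first = max(bad_lba - margin, 0)
    last = bad_lba + margin
    # -w is the destructive write test; leave it out for a read-only scan.
    mode = ["-w"] if destructive else []
    # badblocks expects the last block BEFORE the first block.
    return ["badblocks", "-s", "-v", *mode, "-b", str(block_size),
            device, str(last), str(first)]


print(" ".join(badblocks_args("/dev/sda", 712567177)))
```

Keeping the destructive flag off by default means the write test has to be requested explicitly.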
And indeed, bad block numbers come up. I ran a non-destructive read-write test (the -n option), and bad blocks kept appearing. Then I did something that I DO NOT RECOMMEND to anyone who does not fully understand WHAT they are doing:
badblocks -s -w -v -b 512 /dev/sda 712567277 712567077
The -w option is write mode: it overwrites the tested range with test patterns and DESTROYS THE DATA stored there. Surprisingly, the bad blocks disappeared after that (writing to an unreadable sector gives the drive a chance to rewrite it in place or remap it to a spare). After running a few more smartctl tests, this time with -t short, I found the remaining bad blocks and repeated the same procedure. Now:
smartctl -l selftest -d ata /dev/sda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 8972 -
# 2 Extended offline Completed: read failure 90% 8962 712567177
# 3 Short offline Completed without error 00% 5188 -
# 4 Extended offline Aborted by host 90% 5188 -
As you can see, there are no more errors, but once they have appeared, expect new ones. I am already saving up for a new drive.
Now for the downsides of this method. After all these manipulations, fdisk -ul /dev/sda showed me a COMPLETELY empty disk. Run testdisk and let it find the partitions. Unfortunately, sda6 was not found. Oh well, that doesn't scare us. Run fdisk -u /dev/sda, press n (new partition), choose the partition type, logical or primary, then enter the start and end sectors (that's what the saved partition table was for), and write the partition table with w (fdisk exits after writing; q quits without saving).
Run partprobe so the kernel learns about the new partition, and voilà: the partition is back and fully alive.
I will be glad to read any questions, suggestions, and remarks in the comments.
The HDD has 2 encrypted partitions: boot on LUKS and root on LUKS2+btrfs. After an abrupt power loss (unplugged from the wall) the system stopped booting. The problem is in the HDD, probably a software one.
The first partition, boot, unlocks and works fine. The root partition unlocks too, but immediately throws an error:
ata1.00: exception Emask 0x0 SAct 0x1000 SErr 0x40000 action 0x0
ata1.00: irq_stat 0x40000008
ata1: SError: { CommWake }
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/08:60:70:cb:46/00:00:23:00:00/40 tag 12 ncq 4096 in
res 51/40:08:70:cb:46/00:00:23:00:00/40 Emask 0x409 (media error) <F>
ata1.00: status: { DRDY ERR }
ata1.00: error: { UNC }
blk_update_request: I/O error, dev sda, sector 6512424 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
device-mapper: integrity: Error on reading tags: -5
Buffer I/O error on dev dm-1, logical block 229560224, async page read
...
At first I suspected a damaged superblock, but recovering it with btrfs-tools (btrfs rescue super-recover <name of the btrfs partition>) did not work:
Buffer I/O error on dev dm-1; logical block 16, async page read
Buffer I/O error on dev dm-1; logical block 16, async page read
No valid Btrfs found on /dev/mapper/btrfs
Usage or syntax errors
I suspect an error at the btrfs or LUKS level. But in either case a recovery attempt risks complete data loss. Can anyone tell me what exactly is wrong? Could it really be the disk itself?
mount -t btrfs /dev/mapper/btrfs /mnt prints:
Buffer I/O error on dev dm-1; logical block 16, async page read
mount: /mnt: can't read superblock on /dev/mapper/btrfs
The program that "caused" it (really, it's caused by bad hardware; it would be more appropriate to say "the program that was the victim of it") may not even exist anymore.
E.g., send off a write, and then exit. The write will sit in the kernel buffers until the kernel performs writeback. At which point an I/O error may occur.
When the program does still exist, it will already be told of the error. For example, read will set errno to EIO. (This error may also come back from write, fsync, fdatasync, or even close.)
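As an illustration of how a still-running program sees the failure, here is a minimal Python sketch, assuming a POSIX system where os.pread is available. The function name read_at and the wording of the returned message are made up for this example; only the EIO check reflects what the paragraph above describes.

```python
import errno
import os


def read_at(path, offset, length):
    """Return (data, None) on success, or (None, reason) if the device reports EIO."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset), None
    except OSError as exc:
        if exc.errno == errno.EIO:
            # The kernel gave up on this range; the data could not be read.
            return None, f"EIO at offset {offset}: {exc.strerror}"
        raise  # any other error is not a media problem, so let it propagate
    finally:
        os.close(fd)
```

A caller can then decide what to do with an unreadable range (skip it, log it, retry later) instead of crashing on the first exception.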
The reason it takes forever has nothing to do with the kernel, it’s entirely the drive. The drive spends a while retrying the read to see if it can make sense of the corrupted sector. If you don’t want this (e.g., because you’re running on RAID, and will just reschedule the sector to the disk’s mirror) you can try changing the SCT Error Recovery Control settings using smartctl. Beware that many non-enterprise disks do not support this.
Except in the case of RAID (or similar), there is no way to automatically fix it. The data has been lost. The kernel can’t fix that.
If you’re running Linux software RAID (mdraid), with even a half-recent kernel, mdraid will automatically fix it by reading the errored sector from the mirror, then writing the correct sector back to the drive with a read error.
If you’re getting this on a write instead of a read, then replace the drive.
(BTW: READ FPDMA QUEUED is not an error. It's just the (S)ATA command that failed. "Medium Error" is the error.)
Good day. Today I found this in the logs:
TuxAdmin kernel: ata5.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0
TuxAdmin kernel: ata5.00: irq_stat 0x40000001
TuxAdmin kernel: ata5.00: failed command: READ FPDMA QUEUED
TuxAdmin kernel: ata5.00: cmd 60/00:00:67:8e:68/01:00:73:00:00/40 tag 0 ncq 131072 in
res 41/00:1b:4c:8e:68/00:00:73:00:00/40 Emask 0x1 (device error)
TuxAdmin kernel: ata5.00: status: { DRDY ERR }
TuxAdmin kernel: ata5.00: failed command: READ FPDMA QUEUED
TuxAdmin kernel: ata5.00: cmd 60/00:08:67:8d:68/01:00:73:00:00/40 tag 1 ncq 131072 in
res 41/40:00:4c:8e:68/00:00:73:00:00/40 Emask 0x409 (media error) <F>
TuxAdmin kernel: ata5.00: status: { DRDY ERR }
TuxAdmin kernel: ata5.00: error: { UNC }
TuxAdmin kernel: ata5.00: configured for UDMA/133
TuxAdmin kernel: ata5: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0 t4
TuxAdmin kernel: ata5: irq_stat 0x40000008
TuxAdmin kernel: sd 4:0:0:0: [sdc] Unhandled sense code
TuxAdmin kernel: sd 4:0:0:0: [sdc]
TuxAdmin kernel: Result: hostbyte=0x00 driverbyte=0x08
TuxAdmin kernel: sd 4:0:0:0: [sdc]
TuxAdmin kernel: Sense Key : 0x3 [current] [descriptor]
TuxAdmin kernel: Descriptor sense data with sense descriptors (in hex):
TuxAdmin kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
TuxAdmin kernel: 73 68 8e 4c
TuxAdmin kernel: sd 4:0:0:0: [sdc]
TuxAdmin kernel: ASC=0x11 ASCQ=0x4
TuxAdmin kernel: sd 4:0:0:0: [sdc] CDB:
TuxAdmin kernel: cdb[0]=0x28: 28 00 73 68 8d 67 00 01 00 00
TuxAdmin kernel: end_request: I/O error, dev sdc, sector 1936232012
TuxAdmin kernel: ata5: EH complete
TuxAdmin kernel: ata5.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
TuxAdmin kernel: ata5.00: irq_stat 0x40000008
TuxAdmin kernel: ata5.00: failed command: READ FPDMA QUEUED
TuxAdmin kernel: ata5.00: cmd 60/08:00:47:8e:68/00:00:73:00:00/40 tag 0 ncq 4096 in
res 41/40:00:4c:8e:68/00:00:73:00:00/40 Emask 0x409 (media error) <F>
TuxAdmin kernel: ata5.00: status: { DRDY ERR }
TuxAdmin kernel: ata5.00: error: { UNC }
I went to look at S.M.A.R.T.:
$ sudo smartctl -A /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.12.6-1-ARCH] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 120
2 Throughput_Performance 0x0026 252 252 000 Old_age Always - 0
3 Spin_Up_Time 0x0023 073 071 025 Pre-fail Always - 8369
4 Start_Stop_Count 0x0032 099 099 000 Old_age Always - 1286
5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 252 252 051 Old_age Always - 0
8 Seek_Time_Performance 0x0024 252 252 015 Old_age Offline - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 22145
10 Spin_Retry_Count 0x0032 252 252 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 252 252 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 099 099 000 Old_age Always - 1260
191 G-Sense_Error_Rate 0x0022 100 100 000 Old_age Always - 75
192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age Always - 0
194 Temperature_Celsius 0x0002 058 052 000 Old_age Always - 42 (Min/Max 14/48)
195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 252 252 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 3
198 Offline_Uncorrectable 0x0030 252 252 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0036 100 100 000 Old_age Always - 153
200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 4
223 Load_Retry_Count 0x0032 252 252 000 Old_age Always - 0
225 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 1369
I suppose it is time to think about replacing it with something healthier?
kdeneur: https://github.com/brestows/kdeNeur
dartsergius |
I had the same symptoms and could not even create a filesystem. Is the 12 V rail voltage normal? Maybe there isn't enough power?
brestows |
This is not the only drive; there are 3 more on top of it, and they work fine. The PSU is a good 550 W unit and should be enough. The machine is rarely under load: at most rehashing a torrent or compiling my own software, nothing more.
kurych |
I would start by replacing the cable. If the picture stays the same, then think about replacing the drive. Either way, a timely backup of important data will not hurt.
brestows |
OK, thanks. I'll try it, see what happens, and report back.
vasek |
This might come in handy. I would still run the full test: $ sudo smartctl --test=long /dev/sd…
Errors don't disappear with experience, they just get smarter
brestows |
Thanks for the link, I'll read it.
And how do I view the test results?
lampslave |
The results appear in the same smartctl -a output (towards the end).
vasek |
Or you can print the logs separately:
attributes only: $ sudo smartctl --attributes /dev/sda
self-test log only: $ sudo smartctl --log=selftest /dev/sda
errors only: $ sudo smartctl --log=error /dev/sda
brestows |
Here is what it showed me:
=== START OF INFORMATION SECTION ===
Model Family: SAMSUNG SpinPoint F3
Device Model: SAMSUNG HD103SJ
Serial Number: S246JDWSC34540
LU WWN Device Id: 5 0024e9 002a0580e
Firmware Version: 1AJ100E4
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 6
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Sat Jan 18 19:05:36 2014 FET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 40) The self-test routine was interrupted
by the host with a hard or soft reset.
Total time to complete Offline
data collection: ( 9300) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 155) minutes.
SCT capabilities: (0x003f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 120
2 Throughput_Performance 0x0026 252 252 000 Old_age Always - 0
3 Spin_Up_Time 0x0023 073 071 025 Pre-fail Always - 8369
4 Start_Stop_Count 0x0032 099 099 000 Old_age Always - 1286
5 Reallocated_Sector_Ct 0x0033 252 252 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 252 252 051 Old_age Always - 0
8 Seek_Time_Performance 0x0024 252 252 015 Old_age Offline - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 22191
10 Spin_Retry_Count 0x0032 252 252 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 252 252 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 099 099 000 Old_age Always - 1260
191 G-Sense_Error_Rate 0x0022 100 100 000 Old_age Always - 75
192 Power-Off_Retract_Count 0x0022 252 252 000 Old_age Always - 0
194 Temperature_Celsius 0x0002 058 052 000 Old_age Always - 42 (Min/Max 14/48)
195 Hardware_ECC_Recovered 0x003a 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 252 252 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 3
198 Offline_Uncorrectable 0x0030 252 252 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0036 100 100 000 Old_age Always - 153
200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 4
223 Load_Retry_Count 0x0032 252 252 000 Old_age Always - 0
225 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 1369
SMART Error Log Version: 1
ATA Error Count: 23 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 23 occurred at disk power-on lifetime: 10766 hours (448 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 00 00 00 00 a0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ec 00 00 00 00 00 a0 00 00:00:00.951 IDENTIFY DEVICE
ef 03 42 00 00 00 a0 00 00:00:00.951 SET FEATURES [Set transfer mode]
ef 10 02 00 00 00 a0 00 00:00:00.951 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 00:00:00.951 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 00:00:00.951 IDENTIFY DEVICE
Error 22 occurred at disk power-on lifetime: 10766 hours (448 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 00 00 00 00 a0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ec 00 00 00 00 00 a0 00 00:00:00.938 IDENTIFY DEVICE
00 10 f0 1a 91 19 40 00 00:00:00.938 NOP [Reserved subcommand] [OBS-ACS-2]
60 10 00 0a 91 19 40 00 00:00:00.938 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 00:00:00.938 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 00:00:00.938 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 21 occurred at disk power-on lifetime: 10766 hours (448 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 00 00 00 00 a0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ec 00 00 00 00 00 a0 00 00:00:00.587 IDENTIFY DEVICE
00 10 00 12 53 ec 40 00 00:00:00.587 NOP [Reserved subcommand] [OBS-ACS-2]
00 10 00 12 52 ec 40 00 00:00:00.587 NOP [Reserved subcommand] [OBS-ACS-2]
ec 00 00 00 00 00 a0 00 00:00:00.582 IDENTIFY DEVICE
00 10 00 fa e4 ea 40 00 00:00:00.582 NOP [Reserved subcommand] [OBS-ACS-2]
Error 20 occurred at disk power-on lifetime: 10766 hours (448 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 00 00 00 00 a0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ec 00 00 00 00 00 a0 00 00:00:00.421 IDENTIFY DEVICE
ef 03 42 00 00 00 a0 00 00:00:00.421 SET FEATURES [Set transfer mode]
ef 10 02 00 00 00 a0 00 00:00:00.421 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 00:00:00.421 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 00:00:00.421 IDENTIFY DEVICE
Error 19 occurred at disk power-on lifetime: 10766 hours (448 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 00 00 00 00 a0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
ec 00 00 00 00 00 a0 00 00:00:00.406 IDENTIFY DEVICE
ef 03 42 00 00 00 a0 00 00:00:00.406 SET FEATURES [Set transfer mode]
ef 10 02 00 00 00 a0 00 00:00:00.406 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 00:00:00.406 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 00:00:00.406 IDENTIFY DEVICE
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Interrupted (host reset) 80% 22171 -
# 2 Short offline Completed without error 00% 10892 -
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Interrupted [80% left] (0-65535)
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Topic: What can SMART errors on an HDD mean?

0x10c
Good day, everyone! Here is the problem: the disk works as it should and has always worked without errors, but dmesg and smartctl report some kind of problems.
dmesg
smartctl
Can you tell me what might be going on: is the HDD dying, or is the controller on the motherboard glitching? The disk came from an external Seagate Expansion Desk; I pulled it out of the enclosure to connect it as an internal HDD.

scsiman
5 Reallocated_Sector_Ct 0x0033 092 092 010 Pre-fail Always - 9240
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 2797
197 Current_Pending_Sector 0x0012 095 001 000 Old_age Always - 888
198 Offline_Uncorrectable 0x0010 095 001 000 Old_age Offline - 888
It's dead. Throw it in the bin.
Dell Studio XPS 16, Ubuntu 16.04 LTS (Home).
HP nx6110, Ubuntu 8.04 LTS => 10.04 LTS (Home).

Sly_tom_cat
Yes, the disk is done for.
If the dead sectors are concentrated in one place (their numbers are close together), you can of course repartition the disk and leave the failing area unallocated. But that rarely happens, and more importantly it is rather hard to find out, because sectors that have already been reallocated will not appear in the bad-sector report (they have been moved elsewhere).
But 888 pending is really bad…
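The attribute rows quoted in this thread can also be checked by a script. The following Python sketch parses rows in the `smartctl -A` layout and reports watched attributes with a non-zero raw value. The choice of IDs 5, 187, 197 and 198 is an assumption based on the rows scsiman quoted, the function names are illustrative, and a non-zero value is a warning sign to look into, not an automatic verdict.

```python
def parse_attributes(text):
    """Return {id: (name, raw_value)} for each attribute row of `smartctl -A` output."""
    rows = {}
    for line in text.splitlines():
        fields = line.split(None, 9)  # the 10th field keeps any trailing notes
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header or unrelated line
        raw = fields[9].split()[0]  # e.g. "42 (Min/Max 14/48)" -> "42"
        if raw.isdigit():
            rows[int(fields[0])] = (fields[1], int(raw))
    return rows


def nonzero_watched(rows, watched=(5, 187, 197, 198)):
    """Pick the watched attributes whose raw value is above zero."""
    return {i: rows[i] for i in watched if i in rows and rows[i][1] > 0}


sample = """\
5 Reallocated_Sector_Ct 0x0033 092 092 010 Pre-fail Always - 9240
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 2797
197 Current_Pending_Sector 0x0012 095 001 000 Old_age Always - 888
198 Offline_Uncorrectable 0x0010 095 001 000 Old_age Offline - 888
"""
for attr_id, (name, raw) in nonzero_watched(parse_attributes(sample)).items():
    print(attr_id, name, raw)
```

Feeding it the full output of `smartctl -A` works the same way, because the header lines are skipped.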

0x10c
Ah, I see. Thanks for the answers!
It's some kind of bad streak with hardware: first the dishwasher died, then the video recorder, and today the home network server decided that the whole world could wait.
Nagios made the picture clearer with two alerts.
I look into the server's dmesg, and there is a continuous stream of errors related to one of the disks.
[1670400.363465] ata3.00: exception Emask 0x0 SAct 0x80000c00 SErr 0x0 action 0x0
[1670400.449986] ata3.00: irq_stat 0x40000008
[1670400.499057] ata3.00: failed command: READ FPDMA QUEUED
[1670400.562599] ata3.00: cmd 60/80:50:88:bd:7e/00:00:bb:00:00/40 tag 10 ncq dma 65536 in
res 51/40:30:d8:bd:7e/00:00:bb:00:00/40 Emask 0x409 (media error) <F>
[1670400.758250] ata3.00: status: { DRDY ERR }
[1670400.808368] ata3.00: error: { UNC }
[1670400.873536] ata3.00: configured for UDMA/133
[1670400.926758] sd 2:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=7s
[1670401.040311] sd 2:0:0:0: [sda] tag#10 Sense Key : Medium Error [current]
[1670401.122676] sd 2:0:0:0: [sda] tag#10 Add. Sense: Unrecovered read error - auto reallocate failed
[1670401.229917] sd 2:0:0:0: [sda] tag#10 CDB: Read(16) 88 00 00 00 00 00 bb 7e bd 88 00 00 00 80 00 00
[1670401.339226] blk_update_request: I/O error, dev sda, sector 3145645528 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[1670401.472476] ata3: EH complete
[1670426.313174] ata3.00: exception Emask 0x0 SAct 0x104 SErr 0x0 action 0x0
[1670426.394513] ata3.00: irq_stat 0x40000008
[1670426.443575] ata3.00: failed command: READ FPDMA QUEUED
[1670426.507225] ata3.00: cmd 60/20:40:80:09:b2/00:00:00:00:00/40 tag 8 ncq dma 16384 in
res 51/40:20:80:09:b2/00:00:00:00:00/40 Emask 0x409 (media error) <F>
[1670426.701929] ata3.00: status: { DRDY ERR }
[1670426.752052] ata3.00: error: { UNC }
[1670426.799092] ata3.00: configured for UDMA/133
[1670426.852370] sd 2:0:0:0: [sda] tag#8 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=7s
[1670426.964863] sd 2:0:0:0: [sda] tag#8 Sense Key : Medium Error [current]
[1670427.046201] sd 2:0:0:0: [sda] tag#8 Add. Sense: Unrecovered read error - auto reallocate failed
[1670427.152496] sd 2:0:0:0: [sda] tag#8 CDB: Read(16) 88 00 00 00 00 00 00 b2 09 80 00 00 00 20 00 00
[1670427.260856] blk_update_request: I/O error, dev sda, sector 11667840 op 0x0:(READ) flags 0x0 phys_seg 4 prio class 0
[1670427.387855] md/raid1:md1: sda3: rescheduling sector 10774912
[1670427.457745] md/raid1:md1: sda3: rescheduling sector 10774920
[1670427.527534] md/raid1:md1: sda3: rescheduling sector 10774928
[1670427.597320] md/raid1:md1: sda3: rescheduling sector 10774936
[1670427.667116] ata3: EH complete
[1670429.070818] md/raid1:md1: redirecting sector 10774912 to other mirror: sdb3
[1670429.229305] md/raid1:md1: redirecting sector 10774920 to other mirror: sdb3
[1670430.301795] md/raid1:md1: redirecting sector 10774928 to other mirror: sdb3
[1670432.945317] md/raid1:md1: redirecting sector 10774936 to other mirror: sdb3
Looking at the details in S.M.A.R.T.:
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.7.0-0.bpo.2-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Toshiba 3.5" DT01ACA... Desktop HDD
Device Model: TOSHIBA DT01ACA300
Serial Number: Z3GHLUVGS
LU WWN Device Id: 5 000039 ff4d52fc5
Firmware Version: MX6OABB0
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Sun Sep 6 15:56:18 2020 +03
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0005 139 139 054 Pre-fail Offline - 70
3 Spin_Up_Time 0x0007 155 155 024 Pre-fail Always - 322 (Average 416)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 40
5 Reallocated_Sector_Ct 0x0033 089 089 005 Pre-fail Always - 359
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 126 126 020 Pre-fail Offline - 32
9 Power_On_Hours 0x0012 092 092 000 Old_age Always - 56068
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 40
192 Power-Off_Retract_Count 0x0032 099 099 000 Old_age Always - 1745
193 Load_Cycle_Count 0x0012 099 099 000 Old_age Always - 1745
194 Temperature_Celsius 0x0002 139 139 000 Old_age Always - 43 (Min/Max 22/52)
196 Reallocated_Event_Count 0x0032 087 087 000 Old_age Always - 409
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
Given the hours it has logged (56068, which is more than six years of continuous operation), it would be silly to blame a consumer-grade drive. That is exactly why there are two of them in a mirror, plus backups that go to an external disk stored separately.
But now I need to pick a replacement, and that turned out to be harder than expected. If you look at what is on sale in Minsk today with a capacity of 3 TB or 4 TB, a 7200 RPM spindle and a humane price (it is for home use, after all), the choice is not very wide:
I don't want drives with 5400 RPM or 5900 RPM because latency matters. In theory one full platter revolution takes 8.3 ms at 7200 RPM, versus 11.1 ms and 10.2 ms at 5400 RPM and 5900 RPM respectively; the average rotational latency is half of each figure.
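The figures above follow from simple arithmetic: one revolution takes 60000 ms divided by the spindle speed, and on average the head waits half a revolution for the sector to come around. A minimal sketch of that calculation (the function names `rotation_ms` and `avg_latency_ms` are my own, introduced only for illustration):

```shell
# Time of one full platter revolution in milliseconds: 60000 ms per minute / RPM.
rotation_ms() {
    awk -v rpm="$1" 'BEGIN { printf "%.1f\n", 60000 / rpm }'
}

# Average rotational latency: half a revolution.
avg_latency_ms() {
    awk -v rpm="$1" 'BEGIN { printf "%.1f\n", 30000 / rpm }'
}

for rpm in 5400 5900 7200; do
    echo "$rpm RPM: revolution $(rotation_ms "$rpm") ms, average latency $(avg_latency_ms "$rpm") ms"
done
```

This covers rotational delay only; seek time and transfer time come on top of it.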
Advanced Format no longer surprises anyone: you just need to align the partitions properly. Shingled Magnetic Recording (SMR), on the other hand, is a relatively new trend and can cause trouble if you write to the disk a lot and often, which is exactly my case.
Some manufacturers hide the fact that a drive uses SMR. Toshiba recently published information about its consumer HDDs that use SMR. There is also a list of SMR drives from various manufacturers on Habr.
In the end I ordered the most budget-friendly option (Toshiba HDWD130UZSVA). This drive is also the quietest and supports SCT Error Recovery Control, which is very important for drives in a RAID.
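SCT Error Recovery Control limits how long the drive keeps retrying a bad sector before it reports an error, so the RAID layer can fall back to the other mirror half instead of waiting on a stalled disk. A minimal sketch of querying and setting it with `smartctl -l scterc`, where the timeouts are given in tenths of a second. The helper names, the `/dev/sdX` placeholder and the dry-run wrapper are my own assumptions, and nothing is executed unless `DRY_RUN=0` is set:

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show the current SCT ERC setting of a drive.
show_erc() {
    run smartctl -l scterc "$1"
}

# Set the read and write recovery timeouts.
# $1: device, $2: timeout in tenths of a second.
set_erc() {
    run smartctl -l "scterc,$2,$2" "$1"
}

set_erc /dev/sdX 70    # 70 deciseconds = 7.0 s
```

On many drives the setting does not survive a power cycle, so it is usually reapplied at boot.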
Hello everyone!
I have not written here for a long time. I wonder whether anyone still reads my idle notes? Hello?
Today's topic is rebuilding a mirror on a software RAID on a remote machine.
Rule one: if it works, don't touch it.
Rule two: the best is the enemy of the good.
Rule three: whoever does not want to work with their head will work with their hands.
On a server located in Hetzner's German data center, the root filesystem suddenly decided to become read-only. The server had two SATA disks and three partitions (swap, /boot and root), each mirrored separately via software RAID.
The system is CentOS 5.
A closer look revealed a pile of errors in dmesg:
SCSI error: return code = 0x08000002
Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
sda: Current [descriptor]: sense key: Medium Error
Add. Sense: Unrecovered read error - auto reallocate failed
Descriptor sense data with sense descriptors (in hex):
72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
01 02 26 46
ata2: EH complete
SCSI device sda: 2930277168 512-byte hdwr sectors (1500302 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata2.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x0
ata2.00: irq_stat 0x40000008
ata2.00: cmd 60/08:08:3f:26:02/00:00:01:00:00/40 tag 1 ncq 4096 in
res 41/40:08:46:26:02/00:00:01:00:00/00 Emask 0x409 (media error) <F>
ata2.00: status: { DRDY ERR }
ata2.00: error: { UNC }
ata2.00: configured for UDMA/133
ata2: EH complete
SCSI device sda: 2930277168 512-byte hdwr sectors (1500302 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata2.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0
ata2.00: irq_stat 0x40000008
ata2.00: cmd 60/08:00:3f:26:02/00:00:01:00:00/40 tag 0 ncq 4096 in
res 41/40:08:46:26:02/8f:00:01:00:00/00 Emask 0x409 (media error) <F>
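The `res` line in such a log carries the address of the sector that failed. A minimal sketch of decoding it, assuming the field order `status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device` (my reading of the libata format, not something stated in the log itself). The function name `decode_res_lba` is made up for illustration:

```shell
# Decode the 48-bit LBA from the token that follows "res" in a libata error.
# $1: for example 41/40:08:46:26:02/00:00:01:00:00/00
decode_res_lba() (
    IFS='/:'
    set -f            # no pathname expansion while splitting
    set -- $1         # $4..$6 = LBA bytes 0..2, $9..$11 = LBA bytes 3..5
    echo $(( (0x${11} << 40) | (0x${10} << 32) | (0x$9 << 24) | (0x$6 << 16) | (0x$5 << 8) | 0x$4 ))
)

decode_res_lba '41/40:08:46:26:02/00:00:01:00:00/00'    # prints 16918086
```

The result, 16918086, is 0x01022646, which agrees with the bytes `01 02 26 46` at the end of the sense data printed earlier in the same log.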
Preliminary diagnosis: the second hard disk is dead.
Support confirmed this theory but warned that the first disk was not doing well either. Back up all your data, they said, and we will replace the disks and reinstall the system. A full reinstall was not an appealing prospect, because there was almost a terabyte of data that would first have to be pushed somewhere and then restored.
I suggested first replacing the completely dead disk (sdb), rebuilding the arrays onto it, and then repeating the procedure with sda. The Germans agreed to try but warned that problems were possible because of the failures on sda.
And so it turned out. The two small partitions rebuilt without any fuss, while the big one aborted without a declaration of war after three percent of the rebuild.
Right, the procedure is as follows:
Check the configuration of the RAID arrays:
cat /proc/mdstat
Personalities : [raid1] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sda2[0]
264960 blocks [1/2] [U_]
md0 : active raid1 sda1[0]
2102464 blocks [1/2] [U_]
md2 : active raid1 sda3[0]
1462635200 blocks [1/2] [U_]
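In this output `[1/2] [U_]` means that only one of the two mirror members is present. A minimal sketch that lists such degraded arrays from an mdstat-format file (the function name `degraded_arrays` is my own; it assumes the classic layout shown above, with the status on the line after the array name):

```shell
# Print the names of md arrays that have at least one missing member.
# $1: mdstat-format file (defaults to /proc/mdstat).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                 # remember the array name
        $NF ~ /^\[[U_]*_[U_]*\]$/ { print name }    # status such as [U_] or [_U]
    ' "${1:-/proc/mdstat}"
}

# Usage on a live system:
#   degraded_arrays
```

For the listing above it would print md1, md0 and md2, since all three arrays are running on sda alone.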
Check the partition table on /dev/sda:
fdisk -l /dev/sda
Disk /dev/sda: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 262 2102562 fd Linux raid autodetect
/dev/sda2 263 295 265072+ fd Linux raid autodetect
/dev/sda3 296 182401 1462766445 fd Linux raid autodetect
And duplicate it on /dev/sdb. Either recreate the same layout with fdisk or, more simply, copy the first sector (the MBR with its primary partition table) using dd:
dd if=/dev/sda of=/dev/sdb bs=512 count=1
And add the new partitions to the arrays:
mdadm /dev/md0 --add /dev/sdb1
mdadm /dev/md1 --add /dev/sdb2
mdadm /dev/md2 --add /dev/sdb3
Accordingly, the last command did not succeed because of the errors on sda.
No big deal: change the type of /dev/sdb3 to 83 (Linux) and create a filesystem on it.
This is where I made my first mistake. For faster copying I formatted the partition as ext4. On lots of small files it is noticeably snappier than ext3. To preserve all the metadata, I copied with rsync:
mkdir /olddisk; mount /dev/md2 /olddisk
mkdir /newdisk; mount /dev/sdb3 /newdisk
rsync -av /olddisk/* /newdisk
About 15 percent was copied, after which rsync reported an error on some file. There is the bad sector. I added the faulty file to the exclusion list and resumed copying. By morning everything had been transferred and I wrote to support that the second disk could be replaced as well.
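For a copy like this, a slightly more defensive rsync invocation can help. The sketch below is not the command used above but a variant under my own assumptions: a trailing slash on the source so that top-level dotfiles are included (the `/olddisk/*` glob skips them), `-H` to keep hard links, `--numeric-ids` because the rescue system has its own user database, and `--exclude-from` for the list of unreadable files. The exclude file path and the helper names are hypothetical; commands are only printed unless `DRY_RUN=0` is set:

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy a whole tree, skipping the paths listed in an exclude file.
# $1: source dir, $2: destination dir, $3: file with one exclude pattern per line.
copy_tree() {
    run rsync -aH --numeric-ids --exclude-from="$3" "$1/" "$2/"
}

copy_tree /olddisk /newdisk /root/rsync-exclude.txt
```

Patterns in the exclude file are matched relative to the source directory, so a broken file under /olddisk/var would be listed as `/var/...` there.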
I should note that the Germans' tech support works with the precision of a well-oiled K98. Regardless of the time of day, a sensible and competent reply in clear English arrives within 5-10 minutes.
And yes, naturally all of this was done in rescue mode, in which the machine boots from a network image. This mode turned out to hide one big non-obvious pitfall, but more on that later.
After the second disk (sda) was replaced, I recreated the partition table on it and added the two small partitions to the arrays.
Everything worked and I was shown this picture:
cat /proc/mdstat
Personalities : [raid1] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdb2[1] sda2[0]
264960 blocks [2/2] [UU]
md0 : active raid1 sda1[0] sdb1[1]
2102464 blocks [2/2] [UU]
Now the big root partition had to be dealt with.
First I changed the type of /dev/sda3 back to fd (Linux raid autodetect). Then I created a mirror array on it out of a single disk :)) (Oddly enough, Linux allows even this kind of perversion)
mdadm --create /dev/md2 --level=1 --raid-disk=2 missing /dev/sda3
At this point mdadm muttered something about old versions of grub, but I let it slide, deciding that everything I had was new and all would be fine.
Create a filesystem on the array and copy all the data from /dev/sdb3 onto it. Here I was properly foolish: since I was copying the /dev/sdb3 partition, I created the same filesystem as on it, ext4.
mkdir /olddisk; mount /dev/sdb3 /olddisk
mkdir /newdisk; mount /dev/md2 /newdisk
rsync -av /olddisk/* /newdisk
Half a day of copying, after which I could look at the result. The mount point is now called /newdisk.
Visually everything seems fine. Fix the root filesystem type in /newdisk/etc/fstab to ext4 and reboot. And... silence. I give it half an hour for the filesystem check at startup, but no, something is clearly wrong. Time to pester support again. For such unfortunates, the kind Hetzner support provides a Java application for accessing a KVM switch, which lets you see what is happening on the machine's screen from the BIOS onward. And there we see that when trying to mount the root partition the kernel reports an error and crashes with a panic.
I will skip two days of agonizing investigation and go straight to the conclusions:
There were two mistakes; I was directly to blame for one and indirectly for the other.
The first mistake is simple: the initrd had to be rebuilt with ext4 support:
mkinitrd --with=ext4 --with=ext3 /boot/initrd-2.6.18-128.el5.img 2.6.18-128.el5
The second mistake was considerably trickier. When the rescue system image is booted over the network, it is not CentOS but Debian 7, with a much more modern kernel and many newer userland packages. Among them is a much more recent mdadm, which by default creates arrays with superblock version 1.2, whereas the ancient 2.6.18 kernel used in conservative CentOS 5 can only autodetect superblock 0.90 at boot. No way of downgrading the superblock to 0.90 without destroying the array is known to science.
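In hindsight, the superblock format can be chosen when the array is created. A minimal sketch, assuming the array is built from a rescue system with a recent mdadm and that the old kernel's autodetection is all that is needed. It is an alternative to what I actually did, not a record of it; the helper names are mine, and commands are only printed unless `DRY_RUN=0` is set:

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a degraded two-way RAID1 with the old 0.90 superblock format.
# $1: md device, $2: the one member that is present.
create_degraded_mirror() {
    run mdadm --create "$1" --level=1 --raid-devices=2 --metadata=0.90 missing "$2"
}

create_degraded_mirror /dev/md2 /dev/sda3
```

The version an existing member carries can be checked with `mdadm --examine` on the partition. Note that the 0.90 format limits each member to 2 TB, which is fine for the 1.5 TB disks here.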
The solution is actually described on the CentOS site, provided you understand what the problem is and manage to phrase the query correctly: http://wiki.centos.org/HowTos/Install_On_Partitionable_RAID1 In essence, we need to hand the kernel a RAID config file so that it does not have to detect the arrays itself. To do that, first create such a config on the disk:
mdadm --detail --scan > /newdisk/etc/mdadm.conf
Now download the patch for the mkinitrd script and patch the script. After that, rebuild the initrd:
# Mount all partitions and enter the chroot
mount /dev/md2 /newdisk
mount /dev/md1 /newdisk/boot
mount -o bind /proc /newdisk/proc
mount -o bind /dev /newdisk/dev
mount -o bind /dev/shm /newdisk/dev/shm
mount -o bind /sys /newdisk/sys
chroot /newdisk
cd /tmp
# Quote the URL so the shell does not treat & as a command separator
wget -O mkinitrd-md_d0.patch 'http://wiki.centos.org/HowTos/Install_On_Partitionable_RAID1?action=AttachFile&do=get&target=mkinitrd-md_d0.patch'
cd /sbin
cp mkinitrd mkinitrd.dist
patch -p0 < /tmp/mkinitrd-md_d0.patch
cd /boot
mv initrd-2.6.18-128.el5.img initrd-2.6.18-128.el5.img.bak
mkinitrd /boot/initrd-2.6.18-128.el5.img 2.6.18-128.el5
Now, to ease our conscience, let's forbid updates of mkinitrd by adding this to /etc/yum.conf:
exclude=mkinitrd*
That's it! We now have a rebuilt RAID with a new superblock and a more modern filesystem. Keep in mind that after a kernel update the initrd may need to be rebuilt again.
Hi everybody,
After a reboot, this error appeared on my /boot (/dev/sda1) partition. I tried to solve it, but after some commands and a reboot the partitions on the /dev/sda disk were no longer listed. I get this error (close to the first one):
ata6.00: exception Emask 0x0 SAct 0x100000 SErr 0x0 action 0x0
ata6.00: irq_stat 0x40000008
ata6.00: failed command: READ FPDMA QUEUED
ata6.00: cmd 60/20:a0:00:00:00/00:00:00:00:00/40 tag 20 ncq dma 16384 in
res 41/40:20:00:00:00/00:00:00:00:00/40 Emask 0x409 (media error) <F>
ata6.00: status: { DRDY ERR }
ata6.00: error: { UNC }
ata6.00: configured for UDMA/100
sd 5:0:0:0: [sda] tag#20 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 5:0:0:0: [sda] tag#20 Sense Key : Medium Error [current]
sd 5:0:0:0: [sda] tag#20 Add. Sense: Unrecovered read error - auto reallocate failed
sd 5:0:0:0: [sda] tag#20 CDB: Read(10) 28 00 00 00 00 00 00 00 20 00
print_req_error: I/O error, dev sda, sector 0
ata6: EH complete
Here is the smartctl -a /dev/sda:
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.0-21-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Toshiba 2.5" HDD MQ01ABD...
Device Model: TOSHIBA MQ01ABD075
Serial Number: 346PTQUUT
LU WWN Device Id: 5 000039 561c02030
Firmware Version: AX0A4M
User Capacity: 750 156 374 016 bytes [750 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 5400 rpm
Form Factor: 2.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.6, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Fri Feb 9 14:39:04 2018 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 194) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 050 Pre-fail Offline - 0
3 Spin_Up_Time 0x0027 100 100 001 Pre-fail Always - 2356
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 3654
5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 152
7 Seek_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 050 Pre-fail Offline - 0
9 Power_On_Hours 0x0032 066 066 000 Old_age Always - 13819
10 Spin_Retry_Count 0x0033 172 100 030 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 3648
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 163
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 89
193 Load_Cycle_Count 0x0032 081 081 000 Old_age Always - 197659
194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 31 (Min/Max 12/48)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 13
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 80
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 253 000 Old_age Always - 0
220 Disk_Shift 0x0002 100 100 000 Old_age Always - 0
222 Loaded_Hours 0x0032 071 071 000 Old_age Always - 11977
223 Load_Retry_Count 0x0032 100 100 000 Old_age Always - 0
224 Load_Friction 0x0022 100 100 000 Old_age Always - 0
226 Load-in_Time 0x0026 100 100 000 Old_age Always - 269
240 Head_Flying_Hours 0x0001 100 100 001 Pre-fail Offline - 0
SMART Error Log Version: 1
ATA Error Count: 568 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 568 occurred at disk power-on lifetime: 13819 hours (575 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 60 00 00 00 40 Error: UNC at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 60 00 00 00 40 00 00:31:52.720 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 00:31:52.720 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 00:31:52.719 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 00:31:52.718 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 00:31:52.718 SET FEATURES [Set transfer mode]
Error 567 occurred at disk power-on lifetime: 13819 hours (575 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 d0 00 00 00 40 Error: UNC at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 d0 00 00 00 40 00 00:31:48.918 READ FPDMA QUEUED
60 08 c8 e0 66 54 40 00 00:31:48.916 READ FPDMA QUEUED
60 08 b8 00 66 54 40 00 00:31:48.894 READ FPDMA QUEUED
ec 00 01 00 00 00 00 00 00:31:48.893 IDENTIFY DEVICE
ec 00 01 00 00 00 00 00 00:31:48.889 IDENTIFY DEVICE
Error 566 occurred at disk power-on lifetime: 13819 hours (575 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 e0 00 00 00 40 Error: UNC at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 e0 00 00 00 40 00 00:31:45.086 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 00:31:45.085 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 00:31:45.085 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 00:31:45.084 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 00:31:45.084 SET FEATURES [Set transfer mode]
Error 565 occurred at disk power-on lifetime: 13819 hours (575 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 60 00 00 00 40 Error: UNC at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 60 00 00 00 40 00 00:31:41.284 READ FPDMA QUEUED
60 08 58 e0 66 54 40 00 00:31:41.283 READ FPDMA QUEUED
60 08 48 00 66 54 40 00 00:31:41.259 READ FPDMA QUEUED
ec 00 01 00 00 00 00 00 00:31:41.257 IDENTIFY DEVICE
ec 00 01 00 00 00 00 00 00:31:41.254 IDENTIFY DEVICE
Error 564 occurred at disk power-on lifetime: 13819 hours (575 days + 19 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 78 00 00 00 40 Error: UNC at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 78 00 00 00 40 00 00:31:37.451 READ FPDMA QUEUED
60 08 70 e0 66 54 40 00 00:31:37.449 READ FPDMA QUEUED
60 08 f0 00 66 54 40 00 00:31:37.420 READ FPDMA QUEUED
ec 00 01 00 00 00 00 00 00:31:37.419 IDENTIFY DEVICE
ef 10 02 00 00 00 a0 00 00:31:37.418 SET FEATURES [Enable SATA feature]
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 13815 -
# 2 Short offline Completed: read failure 00% 13815 2048
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
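In a dump this long, the overall `PASSED` verdict is less telling than a few raw counters: here 152 reallocated sectors and 80 pending ones. A minimal sketch that pulls those out of `smartctl -A` or `smartctl -a` output read from stdin. The function name and the choice of attributes 5, 196, 197 and 198 are my own assumptions about what is worth watching:

```shell
# Read smartctl attribute output on stdin and print NAME=RAW_VALUE for the
# attributes that most directly describe failing media.
smart_key_attrs() {
    awk '$3 ~ /^0x/ && ($1 == 5 || $1 == 196 || $1 == 197 || $1 == 198) { print $2 "=" $10 }'
}

# Usage on a live system:
#   smartctl -A /dev/sda | smart_key_attrs
```

The `$3 ~ /^0x/` guard keeps the filter from matching other numbered lines, such as the selective self-test spans at the end of the report.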
The partitions were like this:
— /boot part. of ~125M
— root part. of ~20G
— /home part. in the whole left space on the disk
I don’t want to lose the data that was on the /dev/sda3 (/home) partition, but the partitions can’t be listed, so I wonder if there is any hope and if the disk can still be used. Maybe somebody here can help me solve this.
Last edited by Delyas (2018-02-12 17:23:33)
EDIT: The members of this forum just helped me fix and repair a nasty hard disk error. I had run file system checks before, but what I never knew was that the default check does not update the bad block inode list.
p.H wrote: e2fsck detects and marks bad blocks only when run with the -c option.
With that one sentence, p.H saved my computer. And the advice that he and L_V gave me in this thread was priceless.
What ultimately worked for me was checking both my / (root) and /home partitions with the non-destructive read-write option, -cc from a Live CD:
Code: Select all
e2fsck -f -y -cc -C0 /dev/sda5
e2fsck -f -y -cc -C0 /dev/sda7
That check identified and repaired the affected inodes. It also wrote over the damaged files. Keep a list of those files. You will have to replace them (as explained below).
Next, I ran the checks again with the read-only option -c:
Code: Select all
e2fsck -f -y -c -C0 /dev/sda5
e2fsck -f -y -c -C0 /dev/sda7
Running the check a second time was an important step because it added a few more blocks to the bad blocks list.
Having repaired the file system, the next step was to repair the affected files:
p.H wrote: Note that e2fsck can remap bad blocks but cannot restore the unreadable contents of the affected files, so these files must be reinstalled from their respective packages.
In my case, I had a fresh install of Debian Buster and a Debian Buster Live CD, so I just copied them from the Live CD:
Code: Select all
mkdir /media/inspiron
mount /dev/sda5 /media/inspiron
cp /usr/bin/$FILE01 /media/inspiron/usr/bin/$FILE01
cp /usr/bin/$FILE02 /media/inspiron/usr/bin/$FILE02
...
umount /dev/sda5
After that, the computer booted like a charm. Importantly, it shut down like a charm too. There were no priority 0 or 1 messages in my journalctl.
Thank you to p.H and L_V for helping me rescue this old machine!
—————————————-
ORIGINAL POST:
After a fresh installation of Debian Buster on an old machine, the partition that holds /home does not unmount at shutdown. The problem seems to be caused by an I/O error. At first glance, smartctl does not show any errors, but a deeper look shows that the disk experienced a few errors on the / (root) partition a few years ago.
If I followed Linux Admins’ «Fixing disk problems» guide would that resolve the issue?
Thanks in advance,
— Soul
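The kernel reports failing sectors as absolute positions on the disk (for example `sector 162964427` in the journal of this post), while filesystem tools count blocks from the start of the partition. A minimal sketch of the usual conversion, assuming 512-byte sectors. The function name is mine and the partition start used in the example is made up, since the real one is not shown in this thread:

```shell
# Convert an absolute disk sector into a filesystem block number.
# $1: failing sector, $2: first sector of the partition,
# $3: filesystem block size in bytes.
sector_to_fs_block() {
    echo $(( ($1 - $2) * 512 / $3 ))
}

# Hypothetical values: partition starting at sector 2048, 4096-byte blocks.
sector_to_fs_block 162964427 2048 4096    # prints 20370297
```

The real partition start comes from `fdisk -l` and the block size from `tune2fs -l`. With the block number in hand, the `icheck` and `ncheck` commands of `debugfs` can name the inode and the file that occupy it.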
Code: Select all
$ journalctl -r -b -1 -p3
-- Logs begin at Sun 2019-05-19 13:22:05 EDT, end at Sun 2019-05-19 15:26:53 EDT. --
May 19 14:51:02 inspiron systemd[1]: Failed unmounting /home.
May 19 14:51:02 inspiron kernel: print_req_error: I/O error, dev sda, sector 162964427
May 19 14:51:02 inspiron kernel: ata1.00: error: { UNC }
May 19 14:51:02 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 14:51:02 inspiron kernel: ata1.00: cmd 60/08:88:c8:a3:b6/00:00:09:00:00/40 tag 17 ncq dma 4096 in
res 41/40:08:cb:a3:b6/00:00:09:00:00/00 Emask 0x409 (media error) <F>
May 19 14:51:02 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 14:51:02 inspiron kernel: ata1.00: irq_stat 0x40000008
May 19 14:51:02 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x20000 SErr 0x0 action 0x0
May 19 14:50:59 inspiron kernel: print_req_error: I/O error, dev sda, sector 162964427
May 19 14:50:59 inspiron kernel: ata1.00: error: { UNC }
May 19 14:50:59 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 14:50:59 inspiron kernel: ata1.00: cmd 60/20:a8:c0:a3:b6/00:00:09:00:00/40 tag 21 ncq dma 16384 in
res 41/40:20:cb:a3:b6/00:00:09:00:00/00 Emask 0x409 (media error) <F>
May 19 14:50:59 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 14:50:59 inspiron kernel: ata1.00: irq_stat 0x40000008
May 19 14:50:59 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x200000 SErr 0x0 action 0x0
May 19 14:50:43 inspiron wpa_supplicant[509]: dbus: wpa_dbus_property_changed: no property SessionLength in object /fi/w1/wpa_supplicant1/Interfaces/1
May 19 14:47:06 inspiron root[7585]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:40:19 inspiron root[7277]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:34:55 inspiron root[7129]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:27:20 inspiron root[6970]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:19:26 inspiron root[6425]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:13:30 inspiron root[6164]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:07:00 inspiron root[4631]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:01:11 inspiron root[3004]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:53:40 inspiron root[2451]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:46:29 inspiron root[2355]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:41:26 inspiron root[2260]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:33:31 inspiron root[1803]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:28:13 inspiron root[1633]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:26:48 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:48 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:c8:f0:00:08/00:00:0c:00:00/40 tag 25 ncq dma 4096 in
res 41/40:08:f6:00:08/00:00:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:98:90:f6:3c/00:00:0a:00:00/40 tag 19 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:90:08:3e:d1/00:00:30:00:00/40 tag 18 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:88:08:2d:8d/00:00:15:00:00/40 tag 17 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:80:c0:3e:59/00:00:09:00:00/40 tag 16 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 61/18:30:18:7f:c5/00:00:2f:00:00/40 tag 6 ncq dma 12288 out
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: WRITE FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:28:c8:6a:71/00:00:15:00:00/40 tag 5 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:48 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x20f0060 SErr 0x0 action 0x0
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 361573512
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 156843584
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 359754432
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/d0:f0:88:2c:8d/00:00:15:00:00/40 tag 30 ncq dma 106496 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/00:e8:40:3e:59/01:00:09:00:00/40 tag 29 ncq dma 131072 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:58:08:3e:d1/00:00:30:00:00/40 tag 11 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:48:f0:00:08/00:00:0c:00:00/40 tag 9 ncq dma 4096 in
res 41/40:08:f6:00:08/00:00:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:18:90:f6:3c/00:00:0a:00:00/40 tag 3 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/40:10:c0:6a:71/00:00:15:00:00/40 tag 2 ncq dma 32768 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 61/08:00:00:70:cc/00:00:31:00:00/40 tag 0 ncq dma 4096 out
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: WRITE FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:45 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x60000a0d SErr 0x0 action 0x0
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 804169080
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 361572440
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851904
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 813441024
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201848320
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/40:e0:78:a5:ee/00:00:2f:00:00/40 tag 28 ncq dma 32768 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/08:c8:30:59:d1/00:00:30:00:00/40 tag 25 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/08:c0:58:60:92/00:00:31:00:00/40 tag 24 ncq dma 4096 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/58:60:58:28:8d/00:00:15:00:00/40 tag 12 ncq dma 45056 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/d8:58:00:04:08/06:00:0c:00:00/40 tag 11 ncq dma 897024 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/00:50:00:00:08/04:00:0c:00:00/40 tag 10 ncq dma 524288 in
res 41/40:00:f6:00:08/00:04:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/40:48:00:20:7c/00:00:30:00:00/40 tag 9 ncq dma 32768 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/00:40:00:f6:07/06:00:0c:00:00/40 tag 8 ncq dma 786432 in
res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:39 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x13001f00 SErr 0x0 action 0x0
May 19 13:22:09 inspiron kernel: mei mei::55213584-9a29-4916-badf-0fb7ed682aeb:01: FW version command failed -5
May 19 13:22:09 inspiron kernel: mei mei::55213584-9a29-4916-badf-0fb7ed682aeb:01: Could not read FW version
May 19 13:22:05 inspiron kernel: ACPI: SPCR: Unexpected SPCR Access Width. Defaulting to byte size
# fdisk -l
Disk /dev/sda: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Disk model: ST9500325AS
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x07f2837e
Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1              63    208844    208782   102M de Dell Utility
/dev/sda2  *       208845  30928844  30720000  14.7G  7 HPFS/NTFS/exFAT
/dev/sda3        30928845 155775023 124846179  59.5G  7 HPFS/NTFS/exFAT
/dev/sda4       155782305 976768064 820985760 391.5G  5 Extended
/dev/sda5  *    155782368 177305599  21523232  10.3G 83 Linux
/dev/sda6       177307648 199903231  22595584  10.8G 82 Linux swap / Solaris
/dev/sda7       199905280 976766975 776861696 370.4G 83 Linux
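To see which partition a failing sector falls in, the partition table above can be checked against the sector numbers from the kernel log. This is only a rough sketch: the boundaries are copied from the fdisk output, the two sample sectors are the UNC addresses that appear in the logs of this post (201851126 from the kernel messages, 162964427 from the SMART error log), and the helper name `locate` is made up for illustration.

```python
# Partition boundaries copied from the fdisk output above (512-byte sectors).
# The extended container /dev/sda4 is left out on purpose so that a sector
# resolves to the logical partition inside it.
PARTITIONS = [
    ("/dev/sda1", 63, 208844),
    ("/dev/sda2", 208845, 30928844),
    ("/dev/sda3", 30928845, 155775023),
    ("/dev/sda5", 155782368, 177305599),
    ("/dev/sda6", 177307648, 199903231),
    ("/dev/sda7", 199905280, 976766975),
]


def locate(sector):
    """Return (partition, offset in sectors from its start), or None."""
    for name, start, end in PARTITIONS:
        if start <= sector <= end:
            return name, sector - start
    return None


if __name__ == "__main__":
    # Sectors reported as UNC in the kernel log and in the SMART error log.
    for lba in (201851126, 162964427):
        print(lba, locate(lba))
```

With these numbers, 201851126 resolves to /dev/sda7 and 162964427 to /dev/sda5, so the unreadable sectors are not confined to a single partition.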
# smartctl -l selftest /dev/sda
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-5-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 0 -
# smartctl -a /dev/sda
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-5-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Momentus 5400.6
Device Model: ST9500325AS
Serial Number: 6VEGMVRP
LU WWN Device Id: 5 000c50 03067dd6f
Firmware Version: D005DEM1
User Capacity: 500,107,862,016 bytes [500 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 5400 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Sun May 19 15:05:07 2019 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 139) minutes.
Conveyance self-test routine
recommended polling time: ( 3) minutes.
SCT capabilities: (0x103f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 101 089 006 Pre-fail Always - 29958806
3 Spin_Up_Time 0x0003 099 099 085 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 091 091 020 Old_age Always - 9917
5 Reallocated_Sector_Ct 0x0033 088 088 036 Pre-fail Always - 246
7 Seek_Error_Rate 0x000f 083 060 030 Pre-fail Always - 207791365
9 Power_On_Hours 0x0032 073 073 000 Old_age Always - 23876
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 094 094 020 Old_age Always - 6861
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 1097
188 Command_Timeout 0x0032 100 096 000 Old_age Always - 3759
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 051 036 045 Old_age Always In_the_past 49 (Min/Max 49/49 #998)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 20
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 78
193 Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 578157
194 Temperature_Celsius 0x0022 049 064 000 Old_age Always - 49 (0 18 0 0 0)
195 Hardware_ECC_Recovered 0x001a 053 045 000 Old_age Always - 29958806
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 4
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 4
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 22868 (153 213 0)
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 3790333358
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 1937597633
254 Free_Fall_Sensor 0x0032 100 100 000 Old_age Always - 0
SMART Error Log Version: 1
ATA Error Count: 987 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 987 occurred at disk power-on lifetime: 23876 hours (994 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 cb a3 b6 09 Error: UNC at LBA = 0x09b6a3cb = 162964427
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 c8 a3 b6 49 00 05:50:04.649 READ FPDMA QUEUED
60 00 28 e0 a3 b6 49 00 05:50:04.617 READ FPDMA QUEUED
60 00 08 c0 a3 b6 49 00 05:50:04.515 READ FPDMA QUEUED
27 00 00 00 00 00 e0 00 05:50:04.513 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 05:50:04.512 IDENTIFY DEVICE
Error 986 occurred at disk power-on lifetime: 23876 hours (994 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 cb a3 b6 09 Error: UNC at LBA = 0x09b6a3cb = 162964427
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 20 c0 a3 b6 49 00 05:50:02.009 READ FPDMA QUEUED
60 00 08 10 50 bb 49 00 05:50:01.961 READ FPDMA QUEUED
ea 00 00 00 00 00 a0 00 05:49:55.547 FLUSH CACHE EXT
61 00 08 a0 33 4a 49 00 05:49:55.547 WRITE FPDMA QUEUED
ea 00 00 00 00 00 a0 00 05:49:55.538 FLUSH CACHE EXT
Error 985 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 f6 00 08 0c Error: UNC at LBA = 0x0c0800f6 = 201851126
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 04:25:50.279 READ FPDMA QUEUED
60 00 08 f0 00 08 4c 00 04:25:50.257 READ FPDMA QUEUED
61 00 08 ff ff ff 4f 00 04:25:50.256 WRITE FPDMA QUEUED
60 00 08 90 f6 3c 4a 00 04:25:50.256 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 04:25:50.256 READ FPDMA QUEUED
Error 984 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 f6 00 08 0c Error: UNC at LBA = 0x0c0800f6 = 201851126
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 40 ff ff ff 4f 00 04:25:47.981 READ FPDMA QUEUED
60 00 80 28 9e 57 49 00 04:25:47.954 READ FPDMA QUEUED
60 00 40 ff ff ff 4f 00 04:25:47.953 READ FPDMA QUEUED
60 00 40 ff ff ff 4f 00 04:25:47.945 READ FPDMA QUEUED
60 00 40 ff ff ff 4f 00 04:25:47.941 READ FPDMA QUEUED
Error 983 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 f6 00 08 0c Error: UNC at LBA = 0x0c0800f6 = 201851126
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 04:25:42.214 READ FPDMA QUEUED
60 00 d8 00 04 08 4c 00 04:25:42.209 READ FPDMA QUEUED
60 00 00 00 00 08 4c 00 04:25:42.207 READ FPDMA QUEUED
60 00 00 00 f6 07 4c 00 04:25:42.207 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 04:25:42.163 READ FPDMA QUEUED
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 0 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
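Even with the overall assessment at PASSED, several raw counters in the attribute table above are non-zero. A small sketch of how those rows could be picked out of `smartctl -a` text follows. The set of "watched" attribute IDs is my own assumption about which counters are usually looked at first for media problems, not something smartctl defines, and the function name `nonzero_watched` is hypothetical. The sample rows are copied from the output above.

```python
import re

# A few rows copied from the attribute table above.
SAMPLE = """\
  5 Reallocated_Sector_Ct 0x0033 088 088 036 Pre-fail Always - 246
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 1097
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 4
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 4
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
"""

# Assumption: attribute IDs worth flagging when their raw value is non-zero.
WATCHED = {5, 187, 197, 198, 199}

# ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-f]{4}\s+\d+\s+\d+\s+\d+"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def nonzero_watched(text):
    """Map attribute name -> raw value for watched rows with raw > 0."""
    found = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m and int(m.group(1)) in WATCHED and int(m.group(3)) > 0:
            found[m.group(2)] = int(m.group(3))
    return found


if __name__ == "__main__":
    for name, raw in nonzero_watched(SAMPLE).items():
        print(f"{name}: {raw}")
```

On the rows above this flags the reallocated, reported-uncorrectable, pending, and offline-uncorrectable counters, and leaves out UDMA_CRC_Error_Count because its raw value is 0.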
My utility (it verifies file hashes) runs fine on a number of Linux servers, but on one of them it regularly crashes with an "Input/output error".
Output of dmesg --level=err:
[ 1.329324] ACPI BIOS Error (bug): Could not resolve symbol [_SB.PCI0.SAT0.SPT0._GTF.DSSP], AE_NOT_FOUND (20200925/psargs-330)
[ 1.329375] ACPI Error: Aborting method _SB.PCI0.SAT0.SPT0._GTF due to previous error (AE_NOT_FOUND) (20200925/psparse-529)
[ 1.336931] ACPI BIOS Error (bug): Could not resolve symbol [_SB.PCI0.SAT0.SPT0._GTF.DSSP], AE_NOT_FOUND (20200925/psargs-330)
[ 1.336980] ACPI Error: Aborting method _SB.PCI0.SAT0.SPT0._GTF due to previous error (AE_NOT_FOUND) (20200925/psparse-529)
[ 3013.266044] ata1.00: exception Emask 0x0 SAct 0x18 SErr 0x0 action 0x0
[ 3013.266054] ata1.00: irq_stat 0x40000008
[ 3013.266059] ata1.00: failed command: READ FPDMA QUEUED
[ 3013.266066] ata1.00: cmd 60/00:18:98:db:b8/01:00:44:00:00/40 tag 3 ncq dma 131072 in
res 41/40:00:60:dc:b8/00:00:44:00:00/40 Emask 0x409 (media error) <F>
[ 3013.266072] ata1.00: status: { DRDY ERR }
[ 3013.266075] ata1.00: error: { UNC }
[ 3013.281961] blk_update_request: I/O error, dev sda, sector 1152965728 op 0x0:(READ) flags 0x80700 phys_seg 2 prio class 0
[ 3956.517696] ata1.00: exception Emask 0x0 SAct 0x18 SErr 0x0 action 0x0
[ 3956.517706] ata1.00: irq_stat 0x40000008
[ 3956.517710] ata1.00: failed command: READ FPDMA QUEUED
[ 3956.517717] ata1.00: cmd 60/00:18:90:99:f6/01:00:4e:00:00/40 tag 3 ncq dma 131072 in
res 41/40:00:78:9a:f6/00:00:4e:00:00/40 Emask 0x409 (media error) <F>
[ 3956.517723] ata1.00: status: { DRDY ERR }
[ 3956.517726] ata1.00: error: { UNC }
[ 3956.533817] blk_update_request: I/O error, dev sda, sector 1324784248 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[ 3961.281636] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[ 3961.281645] ata1.00: irq_stat 0x40000008
[ 3961.281649] ata1.00: failed command: READ FPDMA QUEUED
[ 3961.281657] ata1.00: cmd 60/08:00:78:9a:f6/00:00:4e:00:00/40 tag 0 ncq dma 4096 in
res 41/40:00:78:9a:f6/00:00:4e:00:00/40 Emask 0x409 (media error) <F>
[ 3961.281664] ata1.00: status: { DRDY ERR }
[ 3961.281667] ata1.00: error: { UNC }
[ 3961.297276] blk_update_request: I/O error, dev sda, sector 1324784248 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
What could be the cause?
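As a starting point for reading the log above: `error: { UNC }` together with `(media error)` is the drive itself reporting a sector it could not read, and the `res` line carries the address of that sector. Below is a minimal sketch of decoding it. The field layout (status/error:nsect:lbal:lbam:lbah/hob bytes/device) is my reading of the libata message format, so treat it as an assumption. It does agree with the log, since the decoded values equal the sector numbers in the `blk_update_request` lines. The function name `res_lba` is made up for illustration.

```python
import re

# res SS/ER:NS:L0:L1:L2/HF:HN:L3:L4:L5/DV  (all bytes in hex)
RES = re.compile(
    r"res\s+[0-9a-f]{2}/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/[0-9a-f]{2}"
)


def res_lba(line):
    """Decode the 48-bit LBA from a libata 'res' line, or return None."""
    m = RES.search(line)
    if m is None:
        return None
    # First group: error, sector count, then LBA bits 0-7, 8-15, 16-23.
    _err, _nsect, l0, l1, l2 = (int(b, 16) for b in m.group(1).split(":"))
    # Second group: two "high order" bytes, then LBA bits 24-31, 32-39, 40-47.
    _hf, _hn, l3, l4, l5 = (int(b, 16) for b in m.group(2).split(":"))
    return (l5 << 40) | (l4 << 32) | (l3 << 24) | (l2 << 16) | (l1 << 8) | l0


if __name__ == "__main__":
    for line in (
        "res 41/40:00:60:dc:b8/00:00:44:00:00/40 Emask 0x409 (media error) <F>",
        "res 41/40:00:78:9a:f6/00:00:4e:00:00/40 Emask 0x409 (media error) <F>",
    ):
        print(res_lba(line))
```

The second address shows up twice in the log, once for a 131072-byte read and again for a 4096-byte retry, which suggests the same sector keeps failing rather than a one-off glitch.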



