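The first network interface listed is the default interface:
tcpdump -D
SYN packets only (replace <interface> with a name from the tcpdump -D list):
tcpdump -i <interface> "tcp[tcpflags] & (tcp-syn) != 0"
ACK packets only:
tcpdump -i <interface> "tcp[tcpflags] & (tcp-ack) != 0"
FIN packets only:
tcpdump -i <interface> "tcp[tcpflags] & (tcp-fin) != 0"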
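The three filters above differ only in the flag name, and pcap filter syntax lets several flags be OR-ed inside the parentheses. A minimal sketch of a helper that builds such an expression; the function name tcp_flag_filter and the interface eth0 in the usage comment are my own assumptions, not part of the original notes:

```shell
# Build a pcap filter that matches packets with at least one of the given
# TCP flags set. Arguments are flag names without the "tcp-" prefix
# (syn, ack, fin, rst, push, urg).
tcp_flag_filter() {
    expr=""
    for flag in "$@"; do
        # Join the flag names with "|", e.g. "tcp-syn|tcp-fin".
        expr="${expr:+$expr|}tcp-$flag"
    done
    printf 'tcp[tcpflags] & (%s) != 0\n' "$expr"
}

# Usage (needs root; eth0 is an assumed interface name):
#   tcpdump -i eth0 "$(tcp_flag_filter syn fin)"
```

To see only connection attempts (SYN set, ACK not set), the comparison can be changed by hand to "tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn".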
After my server suddenly produced input/output errors that I could only fix by reinstalling, I ran into the problem that I can check the hardware RAID only to a limited extent.
Found here: https://gist.github.com/fxkraus/595ab82e07cd6f8e057d31bc0bc5e779?permalink_comment_id=2928495#gistcomment-2928495
DIST=$(lsb_release -c | grep "Codename:" | awk '{print $2}')  # jessie, wheezy, stretch or whatever else
wget -O - https://hwraid.le-vert.net/debian/hwraid.le-vert.net.gpg.key | sudo apt-key add -
echo "deb http://hwraid.le-vert.net/debian $DIST main" > /etc/apt/sources.list.d/raidtoolRepo.list
apt-get update
apt-get install megacli
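On newer Debian releases apt-key is deprecated. A possible alternative is to store the repository key in its own keyring file and reference it with the signed-by option. This is only a sketch: the function name hwraid_sources_line and the keyring path are my assumptions, and I have not checked whether the repository offers packages for every codename.

```shell
# Print a sources.list line for the hwraid repository that is bound to a
# dedicated keyring file instead of the global apt-key store.
#   $1 = distribution codename (e.g. the output of "lsb_release -cs")
#   $2 = keyring path (optional; the default below is an assumed location)
hwraid_sources_line() {
    codename=$1
    keyring=${2:-/usr/share/keyrings/hwraid-archive-keyring.gpg}
    printf 'deb [signed-by=%s] http://hwraid.le-vert.net/debian %s main\n' \
        "$keyring" "$codename"
}

# Usage as root (assumes the downloaded key is ASCII-armored):
#   wget -O - https://hwraid.le-vert.net/debian/hwraid.le-vert.net.gpg.key \
#       | gpg --dearmor > /usr/share/keyrings/hwraid-archive-keyring.gpg
#   hwraid_sources_line "$(lsb_release -cs)" > /etc/apt/sources.list.d/raidtoolRepo.list
#   apt-get update && apt-get install megacli
```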
Installing megactl and megamgr unfortunately did not work, but I do not need them at the moment.
Found here: https://forums.servethehome.com/index.php?threads/installing-lsi-megaraid-storage-manager-on-debian-10-omv-5.27676/
Download the file from https://docs.broadcom.com/docs/1.23.02_StorCLI and transfer it to the Linux machine, or use the direct link:
wget https://docs.broadcom.com/docs-and-downloads/raid-controllers/raid-controllers-common-files/1.23.02_StorCLI.zip
Install the required packages:
apt-get install unzip alien
Then install storcli:
unzip 1.23.02_StorCLI.zip
unzip storcli_All_OS.zip
cd storcli_All_OS/Linux
alien --scripts storcli-1.23.02-1.noarch.rpm
dpkg --install storcli_1.23.02-2_all.deb
ln -s /opt/MegaRAID/storcli/storcli64 /usr/local/sbin/storcli
A simple check is to run storcli show:
# storcli show
Status Code = 0
Status = Success
Description = None

Number of Controllers = 1
Host Name = e6
Operating System = Linux5.10.0-20-amd64

System Overview :
===============

----------------------------------------------------------------------------------
Ctl Model                 Ports PDs DGs DNOpt VDs VNOpt BBU  sPR DS EHS ASOs Hlth
----------------------------------------------------------------------------------
  0 LSIMegaRAIDSAS9240-4i     8   2   1     0   1     0 Msng On  -  Y      1 Opt
----------------------------------------------------------------------------------

Ctl=Controller Index|DGs=Drive groups|VDs=Virtual drives|Fld=Failed
PDs=Physical drives|DNOpt=DG NotOptimal|VNOpt=VD NotOptimal|Opt=Optimal
Msng=Missing|Dgd=Degraded|NdAtn=Need Attention|Unkwn=Unknown
sPR=Scheduled Patrol Read|DS=DimmerSwitch|EHS=Emergency Hot Spare
Y=Yes|N=No|ASOs=Advanced Software Options|BBU=Battery backup unit
Hlth=Health|Safe=Safe-mode boot
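For a cron job or a monitoring check, only the Hlth column of the System Overview table is interesting. A small sketch that extracts it from the storcli show output; the function name controller_health is hypothetical, and the parsing assumes the table layout shown above (numeric controller index first, Hlth last):

```shell
# Read "storcli show" output on stdin and print "<controller index> <health>"
# for every controller row of the System Overview table.
# Assumption: a controller row starts with a number and has at least the
# 14 columns of the table above, with Hlth as the last one.
controller_health() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 14 { print $1, $NF }'
}

# Usage: list only controllers that are not "Opt" (optimal):
#   storcli show | controller_health | grep -v ' Opt$'
```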
# megacli -AdpAllInfo -aAll

Adapter #0

==============================================================================
Versions
================
Product Name : LSI MegaRAID SAS 9240-4i
Serial No : SP52905522
FW Package Build: 20.12.1-0150

Mfg. Data
================
Mfg. Date : 07/17/15
Rework Date : 00/00/00
Revision No : 03D
Battery FRU : N/A

Image Versions in Flash:
================
BIOS Version : 4.36.00_4.12.05.00_0x05270000
Preboot CLI Version: 03.02-020:#%00009
WebBIOS Version : 4.0-60-e_49-Rel
NVDATA Version : 3.09.03-0052
FW Version : 2.130.394-2550
Boot Block Version : 2.02.00.00-0001

Pending Images in Flash
================
None

PCI Info
================
Controller Id : 0000
Vendor Id : 1000
Device Id : 0073
SubVendorId : 1000
SubDeviceId : 9241
Host Interface : PCIE
ChipRevision : B2
Link Speed : 0
Number of Frontend Port: 0
Device Interface : PCIE
Number of Backend Port: 8
Port : Address
0      4433221102000000
1      4433221103000000
2      0000000000000000
3      0000000000000000
4      0000000000000000
5      0000000000000000
6      0000000000000000
7      0000000000000000

HW Configuration
================
SAS Address : 500605b009769650
BBU : Absent
Alarm : Absent
NVRAM : Present
Serial Debugger : Present
Memory : Absent
Flash : Present
Memory Size : 0MB
TPM : Absent
On board Expander: Absent
Upgrade Key : Absent
Temperature sensor for ROC : Absent
Temperature sensor for controller : Absent

Settings
================
Current Time : 19:54:28 4/2, 2023
Predictive Fail Poll Interval : 300sec
Interrupt Throttle Active Count : 16
Interrupt Throttle Completion : 50us
Rebuild Rate : 30%
PR Rate : 30%
BGI Rate : 30%
Check Consistency Rate : 30%
Reconstruction Rate : 30%
Cache Flush Interval : 4s
Max Drives to Spinup at One Time : 4
Delay Among Spinup Groups : 2s
Physical Drive Coercion Mode : Disabled
Cluster Mode : Disabled
Alarm : Disabled
Auto Rebuild : Enabled
Battery Warning : Disabled
Ecc Bucket Size : 15
Ecc Bucket Leak Rate : 1440 Minutes
Restore HotSpare on Insertion : Disabled
Expose Enclosure Devices : Enabled
Maintain PD Fail History : Enabled
Host Request Reordering : Enabled
Auto Detect BackPlane Enabled : SGPIO
Load Balance Mode : Auto
Use FDE Only : No
Security Key Assigned : No
Security Key Failed : No
Security Key Not Backedup : No
Default LD PowerSave Policy : Controller Defined
Maximum number of direct attached drives to spin up in 1 min : 0
Auto Enhanced Import : Yes
Any Offline VD Cache Preserved : No
Allow Boot with Preserved Cache : No
Disable Online Controller Reset : No
PFK in NVRAM : No
Use disk activity for locate : No
POST delay : 90 seconds
BIOS Error Handling : Pause on Errors
Current Boot Mode :Normal

Capabilities
================
RAID Level Supported : RAID0, RAID1, RAID5, RAID00, RAID10, RAID50, PRL 11, PRL 11 with spanning, SRL 3 supported, PRL11-RLQ0 DDF layout with no span, PRL11-RLQ0 DDF layout with span
Supported Drives : SAS, SATA

Allowed Mixing:

Mix in Enclosure Allowed
Mix of SAS/SATA of HDD type in VD Allowed
Mix of SAS/SATA of SSD type in VD Allowed
Mix of SSD/HDD in VD Allowed

Status
================
ECC Bucket Count : 0

Limitations
================
Max Arms Per VD : 16
Max Spans Per VD : 8
Max Arrays : 16
Max Number of VDs : 16
Max Parallel Commands : 31
Max SGE Count : 80
Max Data Transfer Size : 8192 sectors
Max Strips PerIO : 20
Max LD per array : 16
Min Strip Size : 8 KB
Max Strip Size : 64 KB
Max Configurable CacheCade Size: 0 GB
Current Size of CacheCade : 0 GB
Current Size of FW Cache : 0 MB

Device Present
================
Virtual Drives : 1
  Degraded : 0
  Offline : 0
Physical Devices : 3
  Disks : 2
  Critical Disks : 0
  Failed Disks : 0

Supported Adapter Operations
================
Rebuild Rate : Yes
CC Rate : Yes
BGI Rate : Yes
Reconstruct Rate : Yes
Patrol Read Rate : Yes
Alarm Control : No
Cluster Support : No
BBU : No
Spanning : Yes
Dedicated Hot Spare : Yes
Revertible Hot Spares : Yes
Foreign Config Import : Yes
Self Diagnostic : Yes
Allow Mixed Redundancy on Array : No
Global Hot Spares : Yes
Deny SCSI Passthrough : No
Deny SMP Passthrough : No
Deny STP Passthrough : No
Support Security : No
Snapshot Enabled : No
Support the OCE without adding drives : Yes
Support PFK : Yes
Support PI : No
Support Boot Time PFK Change : No
Disable Online PFK Change : No
PFK TrailTime Remaining : 0 days 0 hours
Support Shield State : No
Block SSD Write Disk Cache Change: No
Support Online FW Update : Yes

Supported VD Operations
================
Read Policy : No
Write Policy : No
IO Policy : No
Access Policy : Yes
Disk Cache Policy : Yes
Reconstruction : Yes
Deny Locate : No
Deny CC : No
Allow Ctrl Encryption: No
Enable LDBBM : Yes
Support Breakmirror : No
Power Savings : No

Supported PD Operations
================
Force Online : Yes
Force Offline : Yes
Force Rebuild : Yes
Deny Force Failed : No
Deny Force Good/Bad : No
Deny Missing Replace : No
Deny Clear : No
Deny Locate : No
Support Temperature : Yes
NCQ : No
Disable Copyback : No
Enable JBOD : No
Enable Copyback on SMART : No
Enable Copyback to SSD on SMART Error : Yes
Enable SSD Patrol Read : No
PR Correct Unconfigured Areas : Yes
Enable Spin Down of UnConfigured Drives : No
Disable Spin Down of hot spares : Yes
Spin Down time : 30
T10 Power State : No

Error Counters
================
Memory Correctable Errors : 0
Memory Uncorrectable Errors : 0

Cluster Information
================
Cluster Permitted : No
Cluster Active : No

Default Settings
================
Phy Polarity : 0
Phy PolaritySplit : 0
Background Rate : 30
Strip Size : 64kB
Flush Time : 4 seconds
Write Policy : WT
Read Policy : None
Cache When BBU Bad : Disabled
Cached IO : No
SMART Mode : Mode 6
Alarm Disable : No
Coercion Mode : None
ZCR Config : Unknown
Dirty LED Shows Drive Activity : No
BIOS Continue on Error : 3
Spin Down Mode : None
Allowed Device Type : SAS/SATA Mix
Allow Mix in Enclosure : Yes
Allow HDD SAS/SATA Mix in VD : Yes
Allow SSD SAS/SATA Mix in VD : Yes
Allow HDD/SSD Mix in VD : Yes
Allow SATA in Cluster : No
Max Chained Enclosures : 2
Disable Ctrl-R : Yes
Enable Web BIOS : Yes
Direct PD Mapping : No
BIOS Enumerate VDs : Yes
Restore Hot Spare on Insertion : No
Expose Enclosure Devices : Yes
Maintain PD Fail History : Yes
Disable Puncturing : Yes
Zero Based Enclosure Enumeration : No
PreBoot CLI Enabled : Yes
LED Show Drive Activity : No
Cluster Disable : Yes
SAS Disable : No
Auto Detect BackPlane Enable : SGPIO
Use FDE Only : No
Enable Led Header : No
Delay during POST : 0
EnableCrashDump : No
Disable Online Controller Reset : No
EnableLDBBM : Yes
Un-Certified Hard Disk Drives : Allow
Treat Single span R1E as R10 : No
Max LD per array : 16
Power Saving option : All power saving options are disabled
Default spin down time in minutes: 30
Enable JBOD : Yes
TTY Log In Flash : Yes
Auto Enhanced Import : Yes
BreakMirror RAID Support : Yes
Disable Join Mirror : No
Enable Shield State : No
Time taken to detect CME : 60s

Exit Code: 0x00

# storcli /c0 /eall /sall show
Controller = 0
Status = Success
Description = Show Drive Information Succeeded.

Drive Information :
=================

------------------------------------------------------------------------------
EID:Slt DID State DG     Size Intf Med SED PI SeSz Model              Sp Type
------------------------------------------------------------------------------
64:0      5 Onln   0 1.818 TB SATA HDD N   N  512B TOSHIBA DT01ACA200 U  -
64:1      4 Onln   0 1.818 TB SATA HDD N   N  512B TOSHIBA DT01ACA200 U  -
------------------------------------------------------------------------------

EID-Enclosure Device ID|Slt-Slot No.|DID-Device ID|DG-DriveGroup
DHS-Dedicated Hot Spare|UGood-Unconfigured Good|GHS-Global Hotspare
UBad-Unconfigured Bad|Onln-Online|Offln-Offline|Intf-Interface
Med-Media Type|SED-Self Encryptive Drive|PI-Protection Info
SeSz-Sector Size|Sp-Spun|U-Up|D-Down|T-Transition|F-Foreign
UGUnsp-Unsupported|UGShld-UnConfigured shielded|HSPShld-Hotspare shielded
CFShld-Configured shielded|Cpybck-CopyBack|CBShld-Copyback Shielded

# smartctl -a -d megaraid,4 /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-20-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 3.5" DT01ACA... Desktop HDD
Device Model:     TOSHIBA DT01ACA200
Serial Number:    45O51V1AS
LU WWN Device Id: 5 000039 fe2c24cc3
Firmware Version: MX4OABB0
User Capacity:    2.000.398.934.016 bytes [2,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Apr 2 21:57:03 2023 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: ATA return descriptor not supported by controller firmware
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (14726) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 246) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   140   140   054    Pre-fail  Offline      -       68
  3 Spin_Up_Time            0x0007   128   128   024    Pre-fail  Always       -       296 (Average 297)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       95
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   128   128   020    Pre-fail  Offline      -       31
  9 Power_On_Hours          0x0012   091   091   000    Old_age   Always       -       66492
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       95
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       116
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       116
194 Temperature_Celsius     0x0002   187   187   000    Old_age   Always       -       32 (Min/Max 22/53)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
# smartctl -a -d megaraid,5 /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-20-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 3.5" DT01ACA... Desktop HDD
Device Model:     TOSHIBA DT01ACA200
Serial Number:    45O51V2AS
LU WWN Device Id: 5 000039 fe2c24cc4
Firmware Version: MX4OABB0
User Capacity:    2.000.398.934.016 bytes [2,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Apr 2 21:57:09 2023 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: ATA return descriptor not supported by controller firmware
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (15300) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   138   138   054    Pre-fail  Offline      -       74
  3 Spin_Up_Time            0x0007   128   128   024    Pre-fail  Always       -       295 (Average 296)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       95
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   126   126   020    Pre-fail  Offline      -       32
  9 Power_On_Hours          0x0012   091   091   000    Old_age   Always       -       66492
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       95
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       112
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       112
194 Temperature_Celsius     0x0002   181   181   000    Old_age   Always       -       33 (Min/Max 21/52)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
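The number after "megaraid," in the smartctl calls above is the DID column from storcli /c0 /eall /sall show (5 and 4 here). Rather than copying the IDs by hand, they could be extracted from that table. A rough sketch; the function name megaraid_ids is made up, and the parsing assumes that drive rows start with an EID:Slt pair such as 64:0:

```shell
# Read "storcli /c0 /eall /sall show" output on stdin and print the
# device ID (DID column) of every physical drive, one per line.
# Assumption: drive rows are the only lines whose first field looks like
# "<enclosure>:<slot>", e.g. "64:0".
megaraid_ids() {
    awk '$1 ~ /^[0-9]+:[0-9]+$/ { print $2 }'
}

# Usage (as root): quick health check of every drive behind the controller.
#   storcli /c0 /eall /sall show | megaraid_ids | while read -r id; do
#       smartctl -H -d megaraid,"$id" /dev/sda
#   done
```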
Unfortunately, it turned out that one of the hard disks had been slowly falling apart.
The status of the software RAID shows the failure of disk /dev/sdb:
# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sda4[0] sdb4[1](F)
      1839090112 blocks super 1.2 [2/1] [U_]
      bitmap: 9/14 pages [36KB], 65536KB chunk

md1 : active raid1 sda2[0] sdb2[1](F)
      523712 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda1[0] sdb1[1](F)
      16760832 blocks super 1.2 [2/1] [U_]

md2 : active raid1 sda3[0] sdb3[1](F)
      1073610752 blocks super 1.2 [2/1] [U_]
      bitmap: 6/8 pages [24KB], 65536KB chunk

unused devices: <none>
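In this output, an underscore inside the status brackets ([U_] instead of [UU]) marks a missing or failed member. To notice this earlier next time, a small check could list the affected arrays. The following is only a sketch with a hypothetical function name, degraded_arrays, and it relies on the /proc/mdstat layout shown above:

```shell
# Read /proc/mdstat on stdin and print the name of every md array whose
# member status contains "_" (for example "[U_]" or "[_U]").
degraded_arrays() {
    awk '
        /^md[0-9]+ :/       { name = $1 }     # header line: remember the array name
        /\[[U_]*_[U_]*\]/   { print name }    # status line with a missing member
    '
}

# Usage:
#   degraded_arrays < /proc/mdstat
```

An empty result means that every array has all of its members.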
All devices show errors:
root@h1 ~ # mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Tue May 7 20:42:25 2019
        Raid Level : raid1
        Array Size : 16760832 (15.98 GiB 17.16 GB)
     Used Dev Size : 16760832 (15.98 GiB 17.16 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Fri Mar 4 11:00:59 2022
             State : clean, degraded
    Active Devices : 1
   Working Devices : 1
    Failed Devices : 1
     Spare Devices : 0

Consistency Policy : resync

              Name : rescue:0
              UUID : 54e5acea:e5928e65:f6d6a669:cf1fb9d2
            Events : 531

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       -       0        0        1      removed

       1       8       17        -      faulty   /dev/sdb1
# mdadm --detail /dev/md1
/dev/md1:
           Version : 1.2
     Creation Time : Tue May 7 20:42:25 2019
        Raid Level : raid1
        Array Size : 523712 (511.44 MiB 536.28 MB)
     Used Dev Size : 523712 (511.44 MiB 536.28 MB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Fri Mar 4 06:37:29 2022
             State : clean, degraded
    Active Devices : 1
   Working Devices : 1
    Failed Devices : 1
     Spare Devices : 0

Consistency Policy : resync

              Name : rescue:1
              UUID : 8eb2e7c6:f87a8a3f:8f82d581:1f346131
            Events : 323

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       -       0        0        1      removed

       1       8       18        -      faulty   /dev/sdb2
# mdadm --detail /dev/md2
/dev/md2:
           Version : 1.2
     Creation Time : Tue May 7 20:42:26 2019
        Raid Level : raid1
        Array Size : 1073610752 (1023.88 GiB 1099.38 GB)
     Used Dev Size : 1073610752 (1023.88 GiB 1099.38 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Fri Mar 4 16:12:02 2022
             State : clean, degraded
    Active Devices : 1
   Working Devices : 1
    Failed Devices : 1
     Spare Devices : 0

Consistency Policy : bitmap

              Name : rescue:2
              UUID : 3af3850c:507630d7:28a96c3a:84ec1f66
            Events : 1849768

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       -       0        0        1      removed

       1       8       19        -      faulty   /dev/sdb3
# mdadm --detail /dev/md3
/dev/md3:
           Version : 1.2
     Creation Time : Tue May 7 20:42:28 2019
        Raid Level : raid1
        Array Size : 1839090112 (1753.89 GiB 1883.23 GB)
     Used Dev Size : 1839090112 (1753.89 GiB 1883.23 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Fri Mar 4 16:11:14 2022
             State : clean, degraded
    Active Devices : 1
   Working Devices : 1
    Failed Devices : 1
     Spare Devices : 0

Consistency Policy : bitmap

              Name : rescue:3
              UUID : 145fb3fe:45b60dcb:2d4d904a:5df4003a
            Events : 674039

    Number   Major   Minor   RaidDevice State
       0       8        4        0      active sync   /dev/sda4
       -       0        0        1      removed

       1       8       20        -      faulty   /dev/sdb4
Now remove the disk from the RAID arrays:
# mdadm /dev/md0 -r /dev/sdb1
mdadm: hot removed /dev/sdb1 from /dev/md0
# mdadm /dev/md1 -r /dev/sdb2
mdadm: hot removed /dev/sdb2 from /dev/md1
# mdadm /dev/md2 -r /dev/sdb3
mdadm: hot removed /dev/sdb3 from /dev/md2
# mdadm /dev/md3 -r /dev/sdb4
mdadm: hot removed /dev/sdb4 from /dev/md3
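The removal worked directly here because the kernel had already marked every sdb partition as faulty. If a member is still active, mdadm expects it to be marked as failed first. A sketch that only prints the commands for review instead of running them; the function name md_remove_cmds is hypothetical, and the mapping "partition N belongs to /dev/md(N-1)" is an assumption that matches this server's layout only:

```shell
# Print (do not execute) the mdadm commands that mark each partition of a
# disk as failed and then remove it from its array.
#   $1    = disk, e.g. /dev/sdb
#   $2... = partition numbers, e.g. 1 2 3 4
# Assumption: partition N is a member of /dev/md(N-1), as on this server.
md_remove_cmds() {
    disk=$1
    shift
    for part in "$@"; do
        md=/dev/md$((part - 1))
        echo "mdadm $md --fail ${disk}${part}"
        echo "mdadm $md --remove ${disk}${part}"
    done
}

# Usage: review the output first, then pipe it to a root shell if it is right.
#   md_remove_cmds /dev/sdb 1 2 3 4
```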
A quick check of which partition table type is in use:
root@h1 ~ # parted -l
Model: ATA ST33000651AS (scsi)
Disk /dev/sda: 3001GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name  Flags
 5      1049kB  2097kB  1049kB                     bios_grub
 1      2097kB  17.2GB  17.2GB                     raid
 2      17.2GB  17.7GB  537MB                      raid
 3      17.7GB  1117GB  1100GB                     raid
 4      1117GB  3001GB  1883GB                     raid

Model: Linux Software RAID Array (md)
Disk /dev/md2: 1099GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system  Flags
 1      0.00B  1099GB  1099GB  ext4

Model: Linux Software RAID Array (md)
Disk /dev/md0: 17.2GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system     Flags
 1      0.00B  17.2GB  17.2GB  linux-swap(v1)

Model: Linux Software RAID Array (md)
Disk /dev/md3: 1883GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system  Flags
 1      0.00B  1883GB  1883GB  ext4

Model: Linux Software RAID Array (md)
Disk /dev/md1: 536MB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End    Size   File system  Flags
 1      0.00B  536MB  536MB  ext3
Finding the serial number of the defective disk:
chris@h1:~$ /sbin/udevadm info --query=property --name=sdb | grep ID_SERIAL
ID_SERIAL=WDC_WD3000FYYZ-01UL1B2_WD-WMC1F0E4CNYD
ID_SERIAL_SHORT=WD-WMC1F0E4CNYD
This no longer works via hdparm:
# hdparm -i /dev/sdb | grep SerialNo
 HDIO_DRIVE_CMD(identify) failed: Input/output error
 HDIO_GET_IDENTITY failed: No message of desired type
Nothing could be read out via smartctl anymore …
Now the replacement request to Hetzner …
Whoa! The disk was replaced within 30 minutes! 😳
Transfer the partition table to the new disk:
# sgdisk --backup=sda_parttable_gpt.bak /dev/sda
The operation has completed successfully.
# sgdisk --load-backup=sda_parttable_gpt.bak /dev/sdb
Creating new GPT entries.
Warning! Current disk size doesn't match that of the backup!
Adjusting sizes to match, but subsequent problems are possible!
The operation has completed successfully.
and generate new GUIDs for the new disk (sgdisk -G randomizes the disk GUID and all partition GUIDs):
# sgdisk -G /dev/sdb
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you
run partprobe(8) or kpartx(8)
The operation has completed successfully.
First, add one partition to the software RAID:
# mdadm /dev/md3 -a /dev/sdb4
mdadm: added /dev/sdb4
and watch it synchronize:
# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sdb4[2] sda4[0]
      1839090112 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  1.3% (23962624/1839090112) finish=466.6min speed=64820K/sec
      bitmap: 9/14 pages [36KB], 65536KB chunk

md2 : active raid1 sda3[0]
      1073610752 blocks super 1.2 [2/1] [U_]
      bitmap: 7/8 pages [28KB], 65536KB chunk

md1 : active raid1 sda2[0]
      523712 blocks super 1.2 [2/1] [U_]

md0 : active (auto-read-only) raid1 sda1[0]
      16760832 blocks super 1.2 [2/1] [U_]

unused devices: <none>
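The recovery line carries the two values worth watching: the percentage done and the estimated time to finish. A sketch that pulls both out of /proc/mdstat; the function name resync_progress is hypothetical, and the parsing assumes the "recovery = …% … finish=…" layout shown above:

```shell
# Read /proc/mdstat on stdin and print "<array> <percent> <time left>" for
# every array that is currently recovering or resyncing.
# Arrays that are only queued ("resync=DELAYED") are skipped, because that
# line has no space in front of the "=".
resync_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        / (recovery|resync) =/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/)       { pct = $i; sub(/^=/, "", pct) }
                if ($i ~ /^finish=/) { eta = substr($i, 8) }
            }
            print name, pct, eta
        }
    '
}

# Usage: refresh once a minute until the rebuild is done.
#   while resync_progress < /proc/mdstat | grep -q .; do
#       resync_progress < /proc/mdstat
#       sleep 60
#   done
```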
This will take a while …
The other partitions were added as well:
root@h1 ~ # mdadm /dev/md2 -a /dev/sdb3
mdadm: added /dev/sdb3
root@h1 ~ # mdadm /dev/md1 -a /dev/sdb2
mdadm: added /dev/sdb2
root@h1 ~ # mdadm /dev/md0 -a /dev/sdb1
mdadm: added /dev/sdb1
root@h1 ~ # cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sdb4[2] sda4[0]
      1839090112 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  2.1% (39605888/1839090112) finish=427.5min speed=70147K/sec
      bitmap: 9/14 pages [36KB], 65536KB chunk

md2 : active raid1 sdb3[2] sda3[0]
      1073610752 blocks super 1.2 [2/1] [U_]
        resync=DELAYED
      bitmap: 7/8 pages [28KB], 65536KB chunk

md1 : active raid1 sdb2[2] sda2[0]
      523712 blocks super 1.2 [2/1] [U_]
        resync=DELAYED

md0 : active raid1 sdb1[2] sda1[0]
      16760832 blocks super 1.2 [2/1] [U_]
        resync=DELAYED

unused devices: <none>
By the next morning the disk was synchronized:
$ cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sdb4[2] sda4[0]
      1839090112 blocks super 1.2 [2/2] [UU]
      bitmap: 1/14 pages [4KB], 65536KB chunk

md2 : active raid1 sdb3[2] sda3[0]
      1073610752 blocks super 1.2 [2/2] [UU]
      bitmap: 3/8 pages [12KB], 65536KB chunk

md1 : active raid1 sdb2[2] sda2[0]
      523712 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sdb1[2] sda1[0]
      16760832 blocks super 1.2 [2/2] [UU]

unused devices: <none>
Sources:
Hetzner: Festplattenaustausch im Software-RAID (replacing a hard disk in a software RAID)
Blog Dominic Pratt: Software-RAID-Reparatur bei Hetzner (software RAID repair at Hetzner)