- I was copying a pretty big file to another place.
- While waiting for the copy to complete, I noticed other IO activity start up and slow mine down badly; the target file stopped growing.
- After some time, I saw this error message from the OS (AIX).
- IO was STILL very busy.
cp: /backup/ORCL/USER01.dbf: The media surface is damaged.
Oh, you can't be serious! If it's a hardware failure, then I think I'm doomed. That's the only backup file I have nearby.
According to the SA, two of the three disks in the RAID 5 array happened to fail at the same time. RAID 5 tolerates only one failed disk, so the array cannot rebuild itself.
The lesson I learned from this incident is that we should regularly back up key files to separate locations if we have little confidence in the storage. Alternatively, we should ask for stronger RAID redundancy, say RAID 51, to reduce the risk at its source.
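
To make the "separate places" idea concrete, here is a minimal Python sketch, not what I actually ran back then. It copies one file into several directories and verifies each copy with a SHA-256 checksum, so a damaged copy is noticed at backup time rather than at restore time. The function names `sha256_of` and `backup_to_many` are made up for illustration, and the script assumes the destination directories really live on physically separate storage, which it cannot check for you.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_to_many(source, destination_dirs):
    """Copy `source` into each directory and verify every copy by checksum.

    Returns a dict mapping each target path to True (copied and verified)
    or False (copy failed or checksum mismatch).
    """
    source = Path(source)
    expected = sha256_of(source)
    results = {}
    for dest_dir in destination_dirs:
        target = Path(dest_dir) / source.name
        try:
            Path(dest_dir).mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)
            results[str(target)] = sha256_of(target) == expected
        except OSError:
            # An I/O error on one destination should not stop the others.
            results[str(target)] = False
    return results
```

A call might look like `backup_to_many("/backup/ORCL/USER01.dbf", ["/backup2/ORCL", "/mnt/nas/ORCL"])`, where the two destination paths are hypothetical. Any entry that comes back `False` means that copy should not be trusted.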