10.5: Beware using Time Machine with a failing disk

Feb 06, '08 07:30:04AM

Contributed by: denty

I noticed some strange log messages and beachball hangs recently while using my MacBook Pro. The log message read MaBi kernel[0]: disk0s2: 0xe0030005 (UNDEFINED). I came to the conclusion that the internal disk was failing (see this blog post and associated comments for some discussion on the error message).

I thought the logical course of action was to attach my Time Machine backup disk, do one last backup, then get the drive replaced and do a restore. It turns out this was not a good idea: it resulted in the wholesale trashing of the Time Machine backup. Only because I caught the failure early, before much was lost (and noticed what Time Machine was doing), was I still able to take a "normal" rsync-style backup from which to restore tomorrow when the replacement arrives.

I don't profess to have investigated the matter thoroughly enough to be certain. However, what appears to happen is that Time Machine attempts to run a backup which terminates due to an I/O error like this:

Jan 28 22:41:02 guinness kernel[0]: disk0s2: 0xe0030005 (UNDEFINED).
Jan 28 22:41:03 guinness kernel[0]: 
Jan 28 22:41:02 guinness /System/Library/CoreServices/backupd[2963]: Error: (-36) copying /Users/denty/Pictures/From camera 01:11:2007/DSCF0181.RAF to (null)

But, bizarrely, Time Machine does not post an alert to the logged-in user: the error above is never seen unless you happen to look in Console.app. Worse, it looks as if Time Machine itself thinks the backup completed successfully. It seems to assume that any files not yet copied, including later files that might have been fine, have "gone away."
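
Since the error only shows up in the log, one way to catch it is to scan the log text yourself. The following is a minimal sketch, not a tested tool: the two patterns are built only from the log lines quoted above, the function name find_backup_trouble is made up for illustration, and other kinds of failure may well be logged in a different form.

```python
import re

# Patterns modeled on the two log lines quoted in this hint (an assumption:
# other disk or backupd failures may be worded differently).
DISK_ERROR = re.compile(
    r"kernel\[\d+\]: (disk\d+s\d+): (0x[0-9a-fA-F]+) \(UNDEFINED\)"
)
BACKUP_ERROR = re.compile(
    r"backupd\[\d+\]: Error: \((-?\d+)\) copying (.+) to "
)


def find_backup_trouble(lines):
    """Return (kind, detail) tuples for lines that look like disk or backupd errors."""
    hits = []
    for line in lines:
        match = DISK_ERROR.search(line)
        if match:
            hits.append(("disk", f"{match.group(1)} {match.group(2)}"))
            continue
        match = BACKUP_ERROR.search(line)
        if match:
            hits.append(("backupd", f"error {match.group(1)} on {match.group(2)}"))
    return hits
```

To use it, pass in the lines of whatever log file Console.app is showing you (on my machine that would be the system log, but the path is for you to check), and treat any hit as a sign that the last backup may not be as complete as Time Machine claims.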

When I went to look in what was left of my Time Machine backup, I found my whole Desktop was present and correct, but only a small portion of my Pictures library was there: presumably it only contained those files copied before the error caused the backup to stop.

I then thought that moving the bad files somewhere Time Machine doesn't back up would at least let me complete a full backup, but that only made things worse. This time, because the previous backup was missing so many files, Time Machine decided it needed a full 40 GB of extra space, and reclaimed that by removing some of the old backups. Of course, the backup then failed on a different bad file, but by this time so much had been removed from my backup that it was virtually worthless. It was at this point that I decided to quickly take the rsync backup option.

The moral is to be very careful using Time Machine if you have a questionable internal drive. If you do, the best thing to do is to turn off Time Machine's automatic backups and find some other way of preserving any recently changed files you need.
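
As one possible "other way" of preserving recently changed files, here is a rough sketch that copies anything modified in the last few days to another disk and simply notes, rather than stops on, any file that can't be read. This is not what I actually ran (I used rsync); the function name copy_recent_files and the seven-day default are made up for illustration, and the destination is assumed to be outside the source folder.

```python
import os
import shutil
import time


def copy_recent_files(src_root, dst_root, max_age_days=7):
    """Copy files modified within max_age_days; return (copied, failed) lists."""
    cutoff = time.time() - max_age_days * 86400
    copied, failed = [], []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            try:
                if os.stat(src).st_mtime < cutoff:
                    continue  # not changed recently, skip it
                # Recreate the same relative folder layout under dst_root.
                dst = os.path.join(dst_root, os.path.relpath(src, src_root))
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)
                copied.append(src)
            except OSError as err:
                # An unreadable file on a failing disk is recorded, not fatal,
                # so later files that are fine still get copied.
                failed.append((src, str(err)))
    return copied, failed
```

The point of returning the failed list is exactly the feedback Time Machine didn't give me: you can see which files were lost to I/O errors instead of assuming the copy was complete.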

I'm going to file a bug with Apple, as I think Time Machine should provide better feedback. But perhaps this is something others in a similar position should be careful of. Hopefully this is something that'll be fixed in 10.5.2.

Mac OS X Hints
http://hints.macworld.com/article.php?story=20080129063748614