10.5: Organize data files for networked Time Machine use

May 23, '08 07:30:02AM

Contributed by: Anonymous

Time Machine is great, and being able to use Time Machine on a network volume is just amazing, but if you don't take care with how your data is organized, backups can quickly eat a lot of CPU time and network bandwidth. To understand why, you need to understand how Time Machine works. The first backup copies everything except a few exclusions covered in previous tips -- it's a simple full backup, in other words.

Then come the incremental backups. The main mechanism used to get consistent 'snapshots' of a volume over time is as follows. Starting at the root directory and working recursively, the system checks whether each directory has changed (files added or deleted), then:

This is not a big issue on a local drive, because hard links are really quick to create, but over a network it can become a huge bottleneck if there are thousands of hard links to create. That's because while one big file can be relatively quick to copy, creating a large number of little files or hard links is time-consuming.
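To get a feel for the cost described above, here is a minimal sketch that creates a batch of hard links to one small file. The function name make_links and the test paths are made up for illustration; this only mimics the hard-linking step and is not how Time Machine itself is implemented.

```shell
# make_links DIR COUNT: create COUNT hard links to one small file inside DIR.
# Hypothetical helper for timing experiments; not part of Time Machine.
make_links() {
    dir=$1
    count=$2
    mkdir -p "$dir" || return 1
    echo "sample" > "$dir/original"
    i=1
    while [ "$i" -le "$count" ]; do
        ln "$dir/original" "$dir/link_$i" || return 1
        i=$((i + 1))
    done
    # The second column of ls -l is the link count:
    # COUNT links plus the original name
    ls -l "$dir/original" | awk '{print $2}'
}
```

In bash you could run something like `time make_links /tmp/tm_test 2000` and then the same command against a folder on your network volume (for example a hypothetical /Volumes/Backup/tm_test) to compare. Some network file systems may refuse hard links entirely, in which case the function stops at the first failed ln.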

When can this become an issue, and what can you do to help prevent it? Read on for those answers...

Here are some relatively common cases that will cause Time Machine to work hard when backing up to a network volume:

Those are the examples that came to mind, but depending on the applications you use and how they organize their data, you may have more reasons to worry. Thanks to Unix, there is a simple way to list the big directories on your system, meaning directories that contain lots of files -- these are the directories that may cause issues when using Time Machine over the network, depending on how they're modified. Open a Terminal window and type:

sudo find / -type d -size +35000c

This command will ask for your administrator password (sudo wants your own account's password, not a separate root password), and display every directory that contains approximately 1,000 or more files. Take a look at the output, and safely ignore those directories whose content does not change regularly (such as application bundles, documentation folders, etc.). For the rest of them, however, be aware that they may cause delays in your networked Time Machine backups.
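The find command above estimates the number of files from the size of the directory itself, which depends on the file system. If you would rather count entries directly, something like the following sketch may help. The function name big_dirs is invented for this example, and the count will be off for file names that contain newlines.

```shell
# big_dirs ROOT MIN: print "COUNT PATH" for each directory under ROOT
# that directly contains at least MIN entries, largest first.
# Hypothetical helper; much slower than the find one-liner because it
# runs ls once per directory.
big_dirs() {
    root=$1
    min=$2
    find "$root" -type d 2>/dev/null | while IFS= read -r dir; do
        # ls -A includes hidden entries but skips . and ..
        n=$(ls -A "$dir" 2>/dev/null | wc -l | tr -d ' ')
        if [ "$n" -ge "$min" ]; then
            printf '%s %s\n' "$n" "$dir"
        fi
    done | sort -rn
}
```

For example, `big_dirs ~/Library 1000` would list the directories in your Library folder holding 1,000 or more items. Starting from your home folder instead of / also avoids the need for sudo.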

So what's the solution? Subfolders. Move static data into subfolders. For example, take your inbox, create mailboxes by year, and put every message from prior years into a year-named subfolder. This subfolder will never change again, and will be backed up only once. Take your Documents » iChats/ directory and move every pre-10.5 file in it into a Backup folder. (In 10.5, iChats are already sorted into date-based folders, so you can leave those alone.)
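For folders of plain files (logs, old transcripts, downloads and the like), the move can be scripted. This is only a sketch under a few assumptions: the function name archive_before and the archive-before-YEAR folder name are invented here, files are sorted by modification date, and only the top level of the folder is touched. Don't point it at data an application manages itself, such as Mail's message store; reorganize those from inside the application.

```shell
# archive_before DIR YEAR: move regular files in DIR that were last
# modified no later than the start of YEAR into DIR/archive-before-YEAR.
# Hypothetical helper; the folder name is just a convention for this example.
archive_before() {
    dir=$1
    year=$2
    dest="$dir/archive-before-$year"
    ref=$(mktemp) || return 1
    # Stamp the reference file with midnight, January 1 of YEAR
    touch -t "${year}01010000" "$ref"
    mkdir -p "$dest"
    # ! -newer matches files modified at or before the reference time;
    # -maxdepth 1 keeps find from descending into subfolders
    find "$dir" -maxdepth 1 -type f ! -newer "$ref" -exec mv {} "$dest"/ \;
    rm -f "$ref"
}
```

Running `archive_before ~/Documents/OldLogs 2008` on a (hypothetical) OldLogs folder would leave only files changed in 2008 or later at the top level, so the archive subfolder stays static from one backup to the next.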

In simple terms, make static those directories containing lots of files -- don't allow directories with thousands of files in them to change every day if you can help it. I think this tip may be useful for a lot of Mac users who keep their data for years when changing machines, and don't like to clean or archive their data.


Mac OS X Hints
http://hints.macworld.com/article.php?story=20080522021747160