Detect folder changes locally before sync or mirror
Mar 18, '05 09:26:00AM • Contributed by: MartySells
Tools like rsync and lftp do a great job of mirroring directories between systems and only copying files that have been updated. On non-LAN networks, however, they can still be slow, since they have to query the remote system over the network. lftp, in particular, is slow.
This tip provides a mechanism for testing locally if anything has changed below a directory, and if so, invoking the command of choice to do an update. This is much better if you want to look for changes every minute or every five minutes, since it's nearly instantaneous.
Installation / Configuration:
Download the script and make it executable with the following commands:
$ cd /path/to/desired/save/location
$ curl -O 'http://marty.feebleandfrail.org/macosxhints/modtest/modtest'
$ chmod a+rx modtest
Here's some information on the command and its various options:
modtest -d DIRECTORY [options]
modtest -rebuild [options]
Options
-------
-f|-file FILENAME    File to read/store MD5s (default: ~/.modtest-data)
-l|-log FILENAME     File to write log to (default: ~/.modtest-log)
-d|-dir DIRECTORY    Directory to examine for changes
-c|-cmd COMMAND      Command to run if changes found
-rebuild             Rebuild MD5 file to reduce size
-noupdate            Do not update MD5 data
-q|-quiet            Quiet output (no debug messages)
-s|-silent           Silent output (no messages at all)
-stdout              Output to stdout
-nostdout            No output to stdout
Version: Fri Mar 4 13:15:22 EST 2005
The -file and -log options default to two files in your home directory; you can modify the script if you want them somewhere else.
Testing:
Try it out with something like modtest -d ~/Sites/. Go ahead and run that a few times and notice that for invocations other than the first, the MD5s match and no update was necessary. Modify a file below ~/Sites/ (touch ~/Sites/index.html), run modtest again, and this time, the log will show that an update is necessary since the MD5s don't match. You can look at the data file named ~/.modtest-data and see that it contains tab-separated lines consisting of a directory name and an MD5, with the newest entries at the bottom.
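To see what modtest last recorded for a single directory, the data file can be filtered by hand. The following is only a sketch based on the layout described above (tab-separated directory name and MD5, newest entries last). latest_md5 is a hypothetical helper, not part of modtest, and it assumes the directory is stored in the data file exactly as you type it here.

```shell
#!/bin/sh
# latest_md5 DIRECTORY [DATAFILE]
# Hypothetical helper: print the most recent MD5 recorded for DIRECTORY.
# Assumes each line is "directory<TAB>md5" with the newest entries last.
latest_md5() {
    awk -F '\t' -v dir="$1" '
        $1 == dir { md5 = $2 }            # later lines overwrite earlier ones
        END { if (md5 != "") print md5 }
    ' "${2:-$HOME/.modtest-data}"
}
```

If the lookup prints nothing, check how the directory is actually written in the data file (for example with or without a trailing slash) and pass it in that form.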
Example Without Using -cmd:
modtest exit codes are as follows:
0 no changes detected or data file rebuilt OK
1 changes detected
2 error
3 help message
So a simple script to update a web site using lftp would look like this:
#!/bin/sh
modtest -nostdout -d /Users/msells/public_html/
case $? in
    0) ;; # No changes
    1)
        echo "Changes detected."
        cd /Users/msells/public_html/
        lftp -u user,pass 10.0.0.1/www/public_html/ <<END
mirror -R --parallel=4 --use-cache
END
        ;;
esac
Notice that we use the -nostdout option so that modtest doesn't output anything, since we're only interested in the exit code. We could have used -silent instead, but that would also prevent modtest from recording anything in its log file.
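The script above acts only on exit codes 0 and 1. A slightly more defensive variant could also report the error and help codes from the table. This is a sketch: report_status is a hypothetical helper name, and the messages are purely illustrative.

```shell
#!/bin/sh
# report_status CODE
# Hypothetical helper: map a modtest exit code to a message. Returns
# non-zero for anything other than "no changes" or "changes detected".
report_status() {
    case "$1" in
        0) echo "no changes" ;;
        1) echo "changes detected" ;;
        2) echo "modtest reported an error" >&2; return 2 ;;
        3) echo "modtest printed its help message" >&2; return 3 ;;
        *) echo "unexpected exit code: $1" >&2; return 1 ;;
    esac
}

# Intended use (commented out because it needs modtest on the PATH):
# modtest -nostdout -d /Users/msells/public_html/
# report_status $?
```

Sending the error cases to stderr means that cron will mail them to you, while the normal cases stay on stdout.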
Example Using -cmd:
Since conditionally running a command when changes are detected, as in the example above, is the most likely case, modtest supports this directly with the -c (or -cmd) option. Here's a working example using rsync that I have in /usr/local/bin/wwwupdate. Note that the two long modtest lines have been broken with line breaks for easier reading. Each should be entered as one long line:
#!/bin/sh
#while [ 1 ] ; do
if ! test -t 0 ; then XOPTS='-q' ; fi
modtest -d /barn/mira/feeble_www/public_html/ $XOPTS -c
'(time rsync --delete -a -e ssh /barn/mira/feeble_www/
public_html/ marty@www.host.com:feeble/www/public_html/) 2>&1'
modtest -d /barn/mira/feeble_marty/public_html/ $XOPTS -c
'(time rsync --delete -a -e ssh /barn/mira/feeble_marty/
public_html/ marty@www.host.com:feeble/marty/public_html/) 2>&1'
#sleep 10
#done
Getting the -c parameter right might take some fiddling; you can use something like this for testing purposes:
-c 'echo "Changes at " `date` >> /tmp/changelog'
Calling from cron:
Once you're comfortable with modtest, you'll probably want to run it from a cron job. In particular, it's nice to have a script that runs from cron that you can also call by hand. So modtest determines whether STDOUT is a terminal, and will only show log messages on STDOUT if it is, but will always put them in the log file. The above wwwupdate script example, for instance, can be called from the command line, or from cron. When called from cron, it adds -q to the options passed to modtest to reduce the log volume.
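The terminal check that wwwupdate uses to decide on -q can be isolated as shown here. This is a sketch: quiet_opts is a hypothetical helper, and the five-minute crontab schedule in the comment is only an assumed example, not part of the original setup.

```shell
#!/bin/sh
# quiet_opts: hypothetical helper that prints -q when standard input is
# not a terminal (as when run from cron) and nothing when run by hand.
quiet_opts() {
    if test -t 0; then
        echo ""
    else
        echo "-q"
    fi
}

XOPTS=$(quiet_opts)
# modtest -d /some/directory/ $XOPTS -c 'some command'

# Example crontab entry (assumed schedule: every five minutes):
# */5 * * * * /usr/local/bin/wwwupdate
```

You can simulate the cron case by hand by redirecting standard input, for example quiet_opts </dev/null.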
If cron isn't frequent enough for you (i.e. you want to check more often than every minute), then uncomment the while, sleep, and done lines in the wwwupdate script above.
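For checks more frequent than cron allows, the loop amounts to something like the following. This is a sketch: poll is a hypothetical helper, and unlike the commented-out while loop in wwwupdate it stops after a fixed number of runs, which makes it easier to try by hand.

```shell
#!/bin/sh
# poll COUNT INTERVAL COMMAND [ARGS...]
# Hypothetical helper: run COMMAND COUNT times, sleeping INTERVAL seconds
# between runs (no sleep after the final run).
poll() {
    count=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example: run wwwupdate six times, ten seconds apart (about one minute).
# poll 6 10 /usr/local/bin/wwwupdate
```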
Cautions! and Notes:
- Since the MD5s are built from the output of find directory_name -ls, there's a small chance that if the size and date of a modified file don't change, then modtest won't see it as a change. Unlikely, but possible.
- modtest has been used on OS X and Linux, and should run OK on cygwin as well.
- modtest appends a line to its data file every time it finds that something in a directory has changed. After, say, a million changes, the data file could get huge, so you can reduce it with modtest -rebuild. The -rebuild option hasn't received much testing for bugs or file locking, but it does keep a backup copy of the original data file. You probably shouldn't rebuild it while other instances of modtest are using the same data file, since they might collide!
- Performance-wise, modtest mostly depends on the number of files in the directory, rather than their size, since it's really just calling find.
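The detection idea described in the notes above (hashing the output of find directory_name -ls and comparing it with the previous hash) can be sketched as follows. This is not modtest's actual code: dir_fingerprint and check_dir are hypothetical names, and for simplicity the state file here holds a single hash and is overwritten on each run, whereas modtest appends to its data file.

```shell
#!/bin/sh
# dir_fingerprint DIRECTORY
# Hypothetical helper: print an MD5 of the `find DIRECTORY -ls` listing.
dir_fingerprint() {
    # Use md5sum (Linux) if available, otherwise md5 (OS X / BSD).
    if command -v md5sum >/dev/null 2>&1; then
        find "$1" -ls | md5sum | cut -d' ' -f1
    else
        find "$1" -ls | md5
    fi
}

# check_dir DIRECTORY STATEFILE
# Hypothetical helper: returns 0 when the listing is unchanged since the
# last call and 1 when it differs (including the very first call).
# STATEFILE must live outside DIRECTORY, or every run looks like a change.
check_dir() {
    new=$(dir_fingerprint "$1")
    old=$(cat "$2" 2>/dev/null || true)
    printf '%s\n' "$new" > "$2"
    [ "$new" = "$old" ]
}
```

Because only the listing is hashed, never the file contents, the cost grows with the number of files rather than their size, and an edit that leaves both size and date untouched goes unnoticed.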
