Detect folder changes locally before sync or mirror

Mar 18, '05 09:26:00AM

Contributed by: MartySells

Tools like rsync and lftp do a great job of mirroring directories between systems and only copying files that have been updated. On non-LAN networks, however, they can still be slow, since they have to query the remote system over the network. lftp, in particular, is slow.

This tip provides a mechanism for testing locally if anything has changed below a directory, and if so, invoking the command of choice to do an update. This is much better if you want to look for changes every minute or every five minutes, since it's nearly instantaneous.

Installation / Configuration:
Download the script and make it executable with the following commands:

$ cd /path/to/desired/save/location
$ curl -O 'http://marty.feebleandfrail.org/macosxhints/modtest/modtest'
$ chmod a+rx modtest

Here's some information on the command and its various options:

modtest -d DIRECTORY [options] 
modtest -rebuild [options]
  
  Options
  -------
  -f|-file FILENAME     File to read/store MD5s (default: ~/.modtest-data)
  -l|-log  FILENAME     File to write log to (default: ~/.modtest-log)
  -d|-dir  DIRECTORY    Directory to examine for changes
  -c|-cmd  COMMAND      Command to run if changes found
  
  -rebuild              Rebuild MD5 file to reduce size
  -noupdate             Do not update MD5 data
  
  -q|-quiet             Quiet output (no debug messages)
  -s|-silent            Silent output (no messages at all)
  -stdout               Output to stdout
  -nostdout             No output to stdout
  
Version: Fri Mar  4 13:15:22 EST 2005
The -file and -log options default to two files in your home directory; you can modify the script if you want them somewhere else.

Testing:
Try it out with something like modtest -d ~/Sites/. Go ahead and run that a few times and notice that for invocations other than the first, the MD5s match and no update was necessary. Modify a file below ~/Sites/ (touch ~/Sites/index.html), run modtest again, and this time, the log will show that an update is necessary since the MD5s don't match. You can look at the data file named ~/.modtest-data and see that it contains tab-separated lines consisting of a directory name and an MD5, with the newest entries at the bottom.
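
The behavior described above can be approximated in a few lines of sh. This is only a minimal sketch and not the actual modtest source: the names fingerprint, last_recorded, check_dir and the MODTEST_SKETCH_DATA variable are hypothetical, and the POSIX cksum utility stands in for MD5 so the sketch runs where the MD5 command has a different name. It assumes the checksum is taken over a recursive long listing, which is one way a touch could register as a change.

```shell
#!/bin/sh
# Sketch only: hypothetical names, with cksum standing in for MD5.
DATA="${MODTEST_SKETCH_DATA:-$HOME/.modtest-sketch-data}"

# Print a checksum of a recursive long listing of directory $1.
fingerprint() {
    ls -lR "$1" | cksum | awk '{ print $1 "-" $2 }'
}

# Print the newest recorded checksum for directory $1 (empty if none).
# Later lines win, matching "newest entries at the bottom".
last_recorded() {
    [ -f "$DATA" ] || return 0
    awk -F '\t' -v dir="$1" '$1 == dir { sum = $2 } END { print sum }' "$DATA"
}

# Return 0 if directory $1 is unchanged, 1 if it changed or was never seen.
# On a change, append a new "directory<TAB>checksum" line to the data file.
check_dir() {
    new=$(fingerprint "$1")
    old=$(last_recorded "$1")
    [ "$new" = "$old" ] && return 0
    printf '%s\t%s\n' "$1" "$new" >> "$DATA"
    return 1
}
```

Because a long listing shows modification times only to the minute, this sketch can miss two edits made within the same minute that leave the file size unchanged.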

Example Without Using -cmd:
modtest exit codes are as follows:
  0   no changes detected or data file rebuilt OK
  1   changes detected
  2   error
  3   help message
So a simple script to update a web site using lftp would look like this:
#!/bin/sh
modtest -nostdout -d /Users/msells/public_html/
case $? in
  0) ;;  # No changes
  1)
    echo Changes detected.
    cd /Users/msells/public_html/
    lftp -u user,pass 10.0.0.1/www/public_html/ <<END
mirror -R --parallel=4 --use-cache
END
    ;;
esac
Notice that we use the -nostdout option so that modtest doesn't output anything, since we're only interested in the exit code. We could have used -silent, which would also prevent modtest from recording things in its log file.
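
The script above acts only on exit codes 0 and 1. As a rough sketch of handling every documented exit code in one place, the helper below maps a status to a yes/no decision. The name should_update and the wording of the messages are hypothetical and not part of modtest.

```shell
#!/bin/sh
# Hypothetical helper: interpret a modtest exit code passed as $1.
# Returns 0 when an update should run, 1 otherwise.
should_update() {
    case "$1" in
        0) return 1 ;;   # no changes detected
        1) return 0 ;;   # changes detected
        2) echo "modtest reported an error" >&2; return 1 ;;
        3) echo "modtest printed its help message; check the arguments" >&2; return 1 ;;
        *) echo "unexpected modtest exit code: $1" >&2; return 1 ;;
    esac
}

# Intended use (commented out so the sketch runs without modtest installed):
#   modtest -nostdout -d /Users/msells/public_html/
#   if should_update "$?"; then
#       echo Changes detected.
#   fi
```

Writing the diagnostics to stderr keeps them visible in cron mail even when stdout is discarded.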

Example Using -cmd:
Since conditionally running a command when changes are detected is the most likely use case, modtest supports it directly with the -c (or -cmd) option. Here's a working example using rsync that I have in /usr/local/bin/wwwupdate. Each modtest command is long, so it is split after -c with a trailing backslash; the quoted command itself must stay on one line:
#!/bin/sh
#while [ 1 ] ; do
# Not run from a terminal (e.g. called from cron): pass -q to reduce log volume
if ! test -t 0 ; then XOPTS='-q' ; fi

modtest -d /barn/mira/feeble_www/public_html/ $XOPTS -c \
  '(time rsync --delete -a -e ssh /barn/mira/feeble_www/public_html/ marty@www.host.com:feeble/www/public_html/) 2>&1'
modtest -d /barn/mira/feeble_marty/public_html/ $XOPTS -c \
  '(time rsync --delete -a -e ssh /barn/mira/feeble_marty/public_html/ marty@www.host.com:feeble/marty/public_html/) 2>&1'

#sleep 10
#done
Getting the -c parameter right might take some fiddling; you can use something like this for testing purposes:
-c 'echo "Changes at " `date` >> /tmp/changelog'
Calling from cron:
Once you're comfortable with modtest, you'll probably want to run it from a cron job. It's handy to have a script that runs from cron but that you can also call by hand, so modtest checks whether STDOUT is a terminal: it shows log messages on STDOUT only when it is, but always writes them to the log file. The wwwupdate script above, for instance, can be called from the command line or from cron. When called from cron, it adds -q to the options passed to modtest to reduce the log volume.
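
The terminal check described here can be sketched with the standard test -t operator. The function log_msg and the LOGFILE variable below are hypothetical and not taken from modtest; they only illustrate the idea of always logging to a file while printing to stdout only for interactive runs. Note that the wwwupdate script tests file descriptor 0 (stdin), whereas this sketch tests file descriptor 1 (stdout) to match the description of modtest itself.

```shell
#!/bin/sh
# Sketch only: log_msg and LOGFILE are hypothetical, not taken from modtest.
LOGFILE="${LOGFILE:-$HOME/.modtest-sketch-log}"

# Always append the message to the log file; echo it to stdout only when
# stdout (file descriptor 1) is attached to a terminal.
log_msg() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOGFILE"
    if [ -t 1 ]; then
        printf '%s\n' "$*"
    fi
    return 0
}
```

Run by hand, log_msg "MD5s match" would print the message and log it; under cron, or with stdout redirected to a file or pipe, it would only log it.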

If cron isn't frequent enough for you (that is, you want to check more often than once a minute), uncomment the while/do, sleep, and done lines in the wwwupdate script above.

Cautions! and Notes: I could have done this in Perl, but wanted to work on my /bin/sh skills...

Mac OS X Hints
http://hints.macworld.com/article.php?story=20050313201539163