Ottawa PC Users' Group (OPCUG)
 

Exploring Linux – Part 16

by Alan German

OPCUG members will be well aware of my penchant for all things backup, and will likely have noted my recent discovery of QuickShadow, a Windows utility that offers real-time file backup from a source to a target disk drive. Having found this to be a really useful package for continually backing up all of my data files when working in Windows, I turned my mind to seeing whether the same sort of functionality was available in the Linux world. After all, why should Windows users have all the fun?

My web searches initially located two sorts of products. Firstly, there was a motley assortment of unfinished, and seemingly abandoned, open-source applications that did not look at all promising. Secondly, there were several server-based solutions that appeared to be overly complicated to implement. However, I persevered, using several different search terms, and scanning a number of different web sites, until I came across a reference to the “inotify” feature that is present in recent Linux kernels. Inotify monitors filesystem events and, in particular, flags when files are written to disk or deleted. A little more searching located a package that combines inotify's file event monitoring with the rsync file synchronization utility in order to provide the real-time file backup capability that I was seeking.

The software, named inosync, is actually a Python script, effectively provided as open-source code by the author, Benedikt Böhm from Germany (http://bb.xnull.de/).

The bad news was that I had absolutely no knowledge of the Python scripting language. The good news was that Benedikt had provided a set of files that, with just a little simple tweaking, could be customized for any end-user's system. Furthermore, inosync is available in Ubuntu's software repositories, so it can be downloaded and installed from the Ubuntu Software Centre, and recent versions of Ubuntu come with Python pre-installed so that the scripts can be used immediately.

The main script file is inosync itself (inosync.py), located in the /usr/bin/ file folder; however, there is usually no need to modify this file. Normal customization of the program is handled through a second script file, the example for which is given as sample_config.py. The minimum changes that need to be made to this latter file are to specify the disk or file folder to be monitored as the source of any file updates, and the disk or file folder to be used as the target where the backup is to be maintained. In my case, I changed the script file name to inosync_config.py and modified two lines in the source code to specify my source and target disks as:

# directory that should be watched for changes
wpath = "/media/Data_Disk/"

# common remote path
rpath = "/media/USBDATADISK/D/"

Lines beginning with the # character are comments. The wpath variable specifies the data partition of my hard drive as the file source, and the rpath variable specifies an external USB drive as the target for the backup. (The /D folder on this latter disk is used for compatibility with QuickShadow in Windows, which stores its backup on the same USB drive but insists on locating the file directory in the “D” folder, since Data_Disk is actually Drive D: under Windows.)
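Because the configuration file is ordinary Python, its two essential settings can be sanity-checked before a backup is started. The following is only a minimal sketch and is not part of inosync: the helper name check_config is hypothetical, and it assumes nothing more than the wpath and rpath variables described above. It runs a configuration file and reports whether each variable is a string naming an existing directory.

```python
import os
import runpy
import tempfile


def check_config(config_file):
    """Return a list of problems found in an inosync-style config file.

    Hypothetical helper for illustration only; inosync does not provide it.
    """
    # run_path executes the file, so only check configuration files you trust.
    settings = runpy.run_path(config_file)
    problems = []
    for name in ("wpath", "rpath"):
        value = settings.get(name)
        if not isinstance(value, str):
            problems.append(name + " is missing or is not a string")
        elif not os.path.isdir(value):
            problems.append(name + " is not an existing directory: " + value)
    return problems


if __name__ == "__main__":
    # Demonstration with a throwaway config file; the paths are made up.
    with tempfile.TemporaryDirectory() as folder:
        config_file = os.path.join(folder, "inosync_config.py")
        with open(config_file, "w") as handle:
            handle.write("wpath = %r\n" % folder)
            handle.write("rpath = %r\n" % os.path.join(folder, "not_there"))
        for problem in check_config(config_file):
            print(problem)
```

In the demonstration, wpath points at a folder that exists and rpath at one that does not, so only the rpath problem is reported.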

The command line to call inosync, and initiate the backup process is:

inosync -c ./inosync/inosync_config.py -d -v

where the -c switch indicates the name and location of the configuration file to be used, the -d switch tells inosync to “daemonize” (i.e. run in the background), and the -v switch causes debugging information to be printed as necessary.

Now, I find that this command is way too intricate to be entered manually, so I use a simple bash script file instead.

#!/bin/bash
echo "Mirror DataDisk to 4GB USB"
# Check that Data_Disk is mounted
if grep '/media/Data_Disk' /etc/mtab > /dev/null 2>&1; then
    # Check that the 4GB USB memory stick is available
    if grep '/media/USBDATADISK' /etc/mtab > /dev/null 2>&1; then
        inosync -c ./inosync/inosync_config.py -d -v
    else
        echo " "
        echo "Insert the 4GB USB memory stick and try again"
        echo " "
    fi
else
    echo " "
    echo "Mount Data_Disk and try again"
    echo " "
fi
echo "Shell command complete"
# Wait for the user to press Enter before exiting
read

The above file checks that the data partition is mounted, and that the USB drive is inserted into a port, before it calls inosync and sets the backup in progress. If one of the two disks is unavailable, the script issues an error message and quits. Otherwise, the script completes and inosync runs in the background, silently copying any new or updated files that are written to the source drive onto the target disk, and deleting from the target any files that are removed from the source, all while maintaining the source disk's directory structure.
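For readers more comfortable in Python than bash, the same ideas can be sketched as two small functions. This is an illustration under stated assumptions, not inosync's own code: the function names are hypothetical, and the rsync options shown (--archive and --delete) are simply standard rsync options that produce the mirroring behaviour described above; the options inosync actually passes to rsync may differ.

```python
import os


def missing_mounts(paths):
    """Return those paths that are not currently mount points."""
    return [path for path in paths if not os.path.ismount(path)]


def mirror_command(source, target, excludes=()):
    """Build an rsync command line that mirrors source onto target.

    Hypothetical helper: it only builds the argument list, it runs nothing.
    """
    # --archive preserves the directory structure and file attributes;
    # --delete removes files from the target that have gone from the source.
    command = ["rsync", "--archive", "--delete"]
    command += ["--exclude=" + pattern for pattern in excludes]
    # A trailing slash on the source copies its contents, not the folder itself.
    command += [source.rstrip("/") + "/", target]
    return command


if __name__ == "__main__":
    absent = missing_mounts(["/media/Data_Disk", "/media/USBDATADISK"])
    if absent:
        print("Mount these and try again: " + ", ".join(absent))
    else:
        print(" ".join(mirror_command("/media/Data_Disk/", "/media/USBDATADISK/D/")))
```

Building the command as a list, rather than as a single string, avoids quoting problems if a path ever contains spaces.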

As indicated, the “silent” backup process would normally be relatively transparent to the user. However, it is fascinating to open a Nautilus window to the source disk, and a second instance of Nautilus for the backup drive, and watch in more-or-less real time as changes on the source are reflected on the target drive. Files appear – or disappear – in the second window as if by magic! But don't be concerned if the changes you make are not shown immediately on startup. Depending on how big your source disk/folder/files are, rsync requires some time to conduct the initial file synchronization. Once this is done, changes are shown in mere seconds (and the time delay is one of the options that can be configured).

The program's documentation is not very extensive (perhaps not surprising given that the main Python script is just two pages of code) but is adequate for the task. However, a little further tweaking of the exemplar code is sometimes necessary. For example, since I was initially unaware of the time required for rsync to first synchronize the two drives, I was concerned that the program seemed to be hung because the changes I was making didn't seem to be reflected in the target window. In consequence, I tried to set up a log file to see what was actually happening. Using the format shown in the exemplar configuration file, I modified the entry logfile = /var/log/inosync.log to logfile = /home/toaster/inosync.log, hoping to create a log file in my machine's home folder; however, I couldn't get this to work until I placed the file name in quotes as logfile = "/home/toaster/inosync.log". As I indicated earlier, I don't know anything about Python scripting, but I suspect that the above is an oversight in the documentation since, as noted earlier, the wpath and rpath variables both use quotes for the file names.
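The need for quotes follows from the configuration file being Python source: an unquoted path is not a valid Python expression, whereas a quoted path is a string. The short check below illustrates this using Python's built-in compile() function. The helper name config_line_is_valid is hypothetical and is not part of inosync.

```python
def config_line_is_valid(line):
    """Return True if a single configuration line is valid Python source."""
    try:
        # compile() parses the line without running it.
        compile(line, "<config>", "exec")
    except SyntaxError:
        return False
    return True


# The unquoted path cannot be parsed as Python.
print(config_line_is_valid('logfile = /home/toaster/inosync.log'))    # False
# The quoted path is an ordinary string assignment.
print(config_line_is_valid('logfile = "/home/toaster/inosync.log"'))  # True
```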

But, to be fair to the program's author, he is very accessible (by E-mail), and more than willing to fix things, as I found out when I ran across another issue while trying to configure file exclusions. The configuration file has an “exclude list for rsync” that essentially lets you specify files that are to be excluded from the synchronization process. I wanted to have two files excluded, both of which were hidden SyncToy database files, one on the source disk and the other on the target drive. These files are used by Microsoft's SyncToy file synchronization program to determine which modified files and folders need to be processed. Each file is named SyncToy{long_text_string}.dat where the text strings are different for the two files. Now these files pose a problem to file mirroring software in regular operation. The file on the source disk does not exist on the target drive so it is copied to the target. Similarly, the original database file on the target disk is not present on the source drive so it is deleted from the target! The result is that the two SyncToy database files are now identical on the two disks and SyncToy chokes when it is run.

Since I wanted to use SyncToy to verify that inosync was working correctly, I needed the two files to be excluded from the mirroring exercise. However, although I specified the two files correctly in the configuration file, the SyncToy file on the target was always replaced by the file from the source when inosync was initiated. I found a work-around for this problem by hard-coding the two excludes in the main inosync.py file. But I also sent an E-mail to Benedikt asking what I was doing wrong (remember – I don't really know how to code in Python!). The reply arrived the next day – there was a typographical error in the original code that was preventing the script from executing the exclude command properly. Furthermore, he provided a link for me to download a brand-new release (Version 0.2.3) of the package – coded that day – that fixed the problem. Now, that's service!
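To see how an exclude list keeps such files out of a mirror, the sketch below tests file names against shell-style patterns. It is an approximation under stated assumptions: Python's fnmatch module stands in for rsync's own pattern matching (the two agree for a simple file-name pattern like this one, but not in every case), the file names are made up, and the use of a wildcard pattern is an assumption, since excludes can equally be listed by exact name.

```python
import fnmatch


def is_excluded(file_name, patterns):
    """Return True if file_name matches any of the exclude patterns."""
    return any(fnmatch.fnmatchcase(file_name, pattern) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical names standing in for real files on the two disks.
    exclude_patterns = ["SyncToy*.dat"]
    for name in ["SyncToy_source_id.dat", "SyncToy_target_id.dat", "report.odt"]:
        action = "skip" if is_excluded(name, exclude_patterns) else "mirror"
        print(name + ": " + action)
```

With a single wildcard pattern, both database files are skipped regardless of the long text string in their names, while ordinary data files are still mirrored.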

So, now I have the best of both worlds. When running Windows, I use QuickShadow to keep a complete backup of my hard disk data partition on my USB drive and, when running Linux (which is more usual), inosync maintains the backup of the same data files to the same USB drive. A marriage made in heaven! Thanks, Benedikt!


Bottom Line:

inosync (Open Source)
Benedikt Böhm
http://github.com/hollow/inosync

Originally published: January, 2011



 

The opinions expressed in these reviews
do not necessarily represent the views of the
Ottawa PC Users' Group or its members.