Exploring Linux Part 16
by Alan German
OPCUG
members will be well aware of my penchant for all things
backup, and will likely have noted my recent discovery of
QuickShadow, a Windows utility that offers real-time
file backup from a source to a target disk drive. Having
found this to be a really useful package for continually
backing up all of my data files when working in Windows,
I turned my mind to see if the same sort of functionality
was available in the Linux world. After all, why should
Windows users have all the fun?
My web searches initially located two sorts of products.
Firstly, there was a motley assortment of unfinished, and
seemingly-abandoned, open-source applications that did
not look at all promising. Secondly, there were several
server-based solutions that appeared to be overly
complicated to implement. However, I persevered, using
several different search terms, and scanning a number of
different web sites, until I came across a reference to
the inotify feature that is present in recent
Linux kernels. Inotify monitors disk activity and, in
particular, flags when files are written to disk or
deleted. A little more searching located a package that
combines inotify's file event monitoring with the rsync
file synchronization utility in order to provide the
real-time file backup capability that I was seeking.
The software, named inosync, is actually a Python script,
effectively provided as open-source code by its author,
Benedikt Böhm from Germany (http://bb.xnull.de/).
The bad news was that I had absolutely no knowledge of
the Python scripting language. The good news was that
Benedikt had provided a set of files that, with just a
little simple tweaking, could be customized for any
end-user's system. Furthermore, not only is inosync
available in Ubuntu's software repositories, and so can
be downloaded and installed from the Ubuntu Software
Centre, but recent versions of Ubuntu also come with
Python pre-installed so that the scripts can be used
immediately.
The main script file is inosync itself (inosync.py),
located in the /usr/bin/ file folder; however, there is
usually no need to modify this file. Normal customization
of the program is handled through a second script file,
the example for which is given as sample_config.py. The
minimum changes that need to be made to this latter file
are to specify the disk or file folder to be monitored as
the source of any file updates, and the disk or file
folder to be used as the target where the backup is to be
maintained. In my case, I changed the script file name to
inosync_config.py and modified two lines in the source
code to specify my source and target disks as:
# directory that should be watched for changes
wpath = "/media/Data_Disk/"
# common remote path
rpath = "/media/USBDATADISK/D/"
Lines beginning with the # character are comments. The
wpath variable specifies the data partition of my hard
drive as the file source, and the rpath variable
specifies an external USB drive as the target for the
backup. (The /D folder on this latter disk is used for
compatibility with QuickShadow in Windows that stores its
backup to the same USB drive, but insists on locating the
file directory in the D folder since
Data_Disk is actually Drive D: under Windows.)
The command line to call inosync, and initiate the backup
process is:
inosync -c ./inosync/inosync_config.py -d -v
where the -c switch indicates the name and location of the
configuration file to be used, the -d switch tells inosync
to daemonize (i.e. run in the background),
and the -v switch causes debugging information to be
printed as necessary.
Now, I find that this command is way too intricate to be
entered manually, so I use a simple bash script file
instead.
#!/bin/bash
echo "Mirror DataDisk to 4GB USB"
# Check that Data_Disk is mounted
if grep '/media/Data_Disk' /etc/mtab > /dev/null 2>&1; then
    # Check that the 4GB USB memory stick is available
    if grep '/media/USBDATADISK' /etc/mtab > /dev/null 2>&1; then
        inosync -c ./inosync/inosync_config.py -d -v
    else
        echo " "
        echo "Insert the 4GB USB memory stick and try again"
        echo " "
    fi
else
    echo " "
    echo "Mount Data_Disk and try again"
    echo " "
fi
echo "Shell command complete"
read
The above file checks that the data partition is mounted,
and that the USB drive is inserted into a port, before it
calls inosync and sets the backup in progress. If one of
the two disks is unavailable, the script issues an error
message and quits. Otherwise, the script completes and
inosync runs in the background, silently copying any new
or updated files that are written to the source drive
onto the target disk, and deleting from the target any
files that are removed from the source, all while
maintaining the source disk's directory structure.
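For readers who would rather stay in Python, the same pre-flight checks can be sketched as below. This is only an illustration and not part of inosync: the function names (is_mounted, check_drives) are hypothetical, and the sample mtab text is made up for the example. Unlike the grep test, which matches the path anywhere on a line, this version compares the mount-point field exactly. It does not handle mount points containing spaces.

```python
def _normalize(path):
    """Strip trailing slashes so '/media/x/' and '/media/x' compare equal."""
    return path.rstrip("/") or "/"


def is_mounted(mount_point, mtab_text):
    """Return True if mount_point is listed as a mount point in mtab-style text."""
    wanted = _normalize(mount_point)
    for line in mtab_text.splitlines():
        fields = line.split()
        # In each mtab line, the second field is the mount point.
        if len(fields) >= 2 and _normalize(fields[1]) == wanted:
            return True
    return False


def check_drives(mtab_text,
                 source="/media/Data_Disk",
                 target="/media/USBDATADISK"):
    """Return a list of error messages; an empty list means both drives are present."""
    if not is_mounted(source, mtab_text):
        return ["Mount Data_Disk and try again"]
    if not is_mounted(target, mtab_text):
        return ["Insert the 4GB USB memory stick and try again"]
    return []


# Made-up mtab contents, for illustration only.
sample_mtab = (
    "/dev/sda5 /media/Data_Disk ext4 rw 0 0\n"
    "/dev/sdb1 /media/USBDATADISK vfat rw 0 0\n"
)
print(check_drives(sample_mtab))
```

On a real system, the text passed in would be the contents of /etc/mtab, and inosync would only be started when the returned list is empty.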
As indicated, the silent backup process would
normally be relatively transparent to the user. However,
it is fascinating to open a Nautilus window to the source
disk, and a second instance of Nautilus for the backup
drive, and watch in more-or-less real time as changes on
the source are reflected on the target drive. Files
appear or disappear in the second window as
if by magic! But, don't be concerned if the changes you
make are not shown immediately on startup. Depending on
how big your source disk/folder/files are, rsync requires
some time to conduct the initial file synchronization.
But, once this is done, changes are shown in mere seconds
(and the time delay is one of the options that can be
configured).
The program's documentation is not very extensive
(perhaps not surprising given that the main Python script
is just two pages of code) but is adequate for the task.
However, a little further tweaking of the exemplar code
is sometimes necessary. For example, since I was
initially unaware of the time required for rsync to first
synchronize the two drives, I was concerned that the
program seemed to be hung because the changes I was
making didn't seem to be reflected in the target window.
In consequence, I tried to set up a log file to see what
was actually happening. Using the format shown in the
exemplar configuration file, I modified the entry logfile
= /var/log/inosync.log to logfile =
/home/toaster/inosync.log, hoping to create a log file in
my machine's home folder; however, I couldn't get this to
work until I placed the file name in quotes as logfile =
"/home/toaster/inosync.log". As I indicated
earlier, I don't know anything about Python scripting,
but I suspect that the above is an oversight in the
documentation since the wpath and
rpath variables both use quotes for the file names.
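The reason the quotes matter is that the configuration file is itself Python code, so a bare path is not a valid value. The short check below is a sketch of my own (the helper name parses_as_python is hypothetical, and this is not how inosync loads its configuration); it simply asks Python whether each version of the line is syntactically valid.

```python
def parses_as_python(source_line):
    """Return True if the line is syntactically valid Python."""
    try:
        compile(source_line, "<config>", "exec")
        return True
    except SyntaxError:
        return False


unquoted = 'logfile = /home/toaster/inosync.log'
quoted = 'logfile = "/home/toaster/inosync.log"'

print(parses_as_python(unquoted))  # False: a bare path is not a Python expression
print(parses_as_python(quoted))    # True: a quoted path is an ordinary string
```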
But, to be fair to the program's author, he is very
accessible (by E-mail), and more than willing to fix
things, as I found out when I ran across another issue
while trying to configure file exclusions. The
configuration file has an exclude list for
rsync that essentially lets you specify files that
are to be excluded from the synchronization process. I
wanted to have two files excluded, both of which were
hidden SyncToy database files, one on the source disk and
the other on the target drive. These files are used by
Microsoft's SyncToy file synchronization program to
determine which modified files and folders need to be
processed. Each file is named
SyncToy{long_text_string}.dat where the text strings are
different for the two files. Now these files pose a
problem to file mirroring software in regular operation.
The file on the source disk does not exist on the target
drive so it is copied to the target. Similarly, the
original database file on the target disk is not present
on the source drive so it is deleted from the target! The
result is that the two SyncToy database files are now
identical on the two disks and SyncToy chokes when it is
run.
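To make the exclusion idea concrete, here is a minimal sketch of how a mirroring rsync command with excludes might be assembled. The function name build_rsync_command and the pattern "SyncToy*.dat" are my own assumptions for illustration, and the flags shown (-a, --delete, --exclude) are standard rsync options rather than necessarily the ones inosync passes. The code only builds the argument list; it does not run rsync.

```python
def build_rsync_command(source, target, excludes=()):
    """Assemble an rsync argument list that mirrors source onto target."""
    # -a preserves the directory structure and file attributes;
    # --delete removes files from the target that no longer exist on the source.
    command = ["rsync", "-a", "--delete"]
    for pattern in excludes:
        # Each pattern is skipped during the transfer.
        command.append("--exclude=" + pattern)
    command.extend([source, target])
    return command


command = build_rsync_command(
    "/media/Data_Disk/",
    "/media/USBDATADISK/D/",
    excludes=["SyncToy*.dat"],  # illustrative pattern, not taken from inosync
)
print(" ".join(command))
```

With rsync itself, files matched by an exclude pattern are also protected from --delete on the target unless --delete-excluded is given, which is exactly the behaviour needed to leave each disk's own SyncToy database file alone.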
Since I wanted to use SyncToy to verify that inosync was
working correctly, I needed the two files to be excluded
from the mirroring exercise. However, although I
specified the two files correctly in the configuration
file, the SyncToy file on the target was always replaced
by the file from the source when inosync was initiated. I
found a work-around for this problem by hard-coding the
two excludes in the main inosync.py file. But, I also
sent an E-mail to Benedikt asking what I was doing wrong
(remember I don't really know how to code in
Python!). The reply arrived the next day: there was
a typographical error in the original code that was
preventing the script from executing the exclude command
properly. Furthermore, he provided a link for me to
download a brand-new release (Version 0.2.3) of the
package, coded that day, that fixed the
problem. Now, that's service!
So, now I have the best of both worlds. When running
Windows, I use QuickShadow to keep a complete backup of
my hard disk data partition on my USB drive and, when
running Linux (which is more usual), inosync maintains
the backup of the same data files to the same USB drive.
A marriage made in heaven! Thanks, Benedikt!
Bottom Line:
inosync (Open Source)
Benedikt Böhm
http://github.com/hollow/inosync
Originally published: January, 2011
The opinions expressed in these reviews
do not necessarily represent the views of the
Ottawa PC Users' Group or its members.