Ottawa PC Users' Group, Inc.
 Product Review 


Exploring Linux - Part 9
by Alan German

I now have a fully-functional, production machine running Ubuntu Linux and so, in this latest in the series of articles looking at various aspects of Linux, it's time to turn our attention to maintenance issues and, in particular, to backup systems. Those who have read some of my previous reviews of Windows-based utilities will know of my fondness for file synchronization programs. So, my initial foray into backup mechanisms for Linux has followed this path. We will take a look at one open-source file synchronization utility that will provide us with a simple backup system, and learn a little about script files in the process.

I have a spare 256 MB compact flash memory card from an old digital camera kicking around. And, I am currently working on a project that involves receiving lots of word processing files by E-mail, compiling these into a single document, sending this out for review, and receiving feedback from multiple reviewers. It's important to maintain a backup of all of the files associated with this project in order that nothing gets lost in the shuffle. So, my initial goal is to mirror my working document directory onto the compact flash memory card. This needs to be done in a simple fashion that will allow me to easily and quickly make backups of the file structure as it is modified.

Linux has a great little utility – rsync – that will do precisely what I require. To quote from the program's documentation (man rsync): “Rsync copies files either to or from a remote host, or locally on the current host.” Rather than using the powerful communication capabilities of the program, we will merely transfer files between two disks on the local computer, in which case rsync serves as an enhanced copy command.

Rsync is a command-line program, so we will need to run it in a terminal window (Applications – Accessories – Terminal). The primary format of the command used is:

rsync options source destination

There is an almost mind-numbing array of options (see rsync's manual for full details), but we will use just three of them: -a (archive), -v (verbose) and --delete. Our source directory will be specified as /mnt/windows_data/carsp/cmrsc_18/ while the flash memory card is seen by the filesystem as /media/disk. Thus, the command string to be entered in the terminal window is:

rsync -av --delete /mnt/windows_data/carsp/cmrsc_18/ /media/disk/cmrsc_18

Archive mode copies sub-folders recursively and retains file attributes, such as ownership, permissions and modification times, when the files are transferred. Switching on verbose mode means that a complete list of the names of the files that are transferred as a result of the command is displayed. The delete option (note the double leading dashes required here) causes any files that are present on the destination drive but not on the source to be erased. The result is that any new or modified files are transferred from the source to the destination, any unchanged files that are present on both are left alone, and any files that have been deleted from the source are also deleted from the destination. Thus, running this command produces a “mirror” of the source directory on the destination drive.
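Because the delete option erases files on the destination, it can be reassuring to preview the result first. rsync's -n (--dry-run) option lists what would be transferred or deleted without actually changing anything. The following is only a sketch: the scratch directories and file names are made up for illustration and are not part of the backup set-up described here.

```shell
#!/bin/bash
# Hypothetical scratch directories, so that nothing real is at risk.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dest"
echo "keep me" > "$work/src/report.txt"
echo "stale"   > "$work/dest/old_draft.txt"

# -n (--dry-run): report what would be copied or deleted, change nothing.
rsync -avn --delete "$work/src/" "$work/dest"

# The destination is exactly as it was before the dry run.
ls "$work/dest"
```

Once the listing looks right, dropping the n from the options performs the transfer for real.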

There are a couple of important “tricks” to note in the specifications for the source and destination drives. Firstly, there is a trailing slash on the cmrsc_18 sub-folder specified as being the source. This tells rsync to copy all of the files (and sub-folders) from this folder on the source to the specified destination. But, most importantly, it says: don't create a cmrsc_18 folder on the destination. If we were to omit the trailing slash, a folder within a folder (i.e. .../cmrsc_18/cmrsc_18) would be created on the destination drive, which is not what we want. The other trick with the command is that, on the first time around, the cmrsc_18 folder is created on the destination drive if it doesn't already exist. On subsequent runs of the command, the files (and folders) in the destination folder are modified to precisely match those in the source folder.

If the above seems to be a complex explanation of the nuances of the command structure, try a few simple tests. Use two temporary directories, temp (source) and temp2 (destination), with just a couple of small text or image files in temp, and see what happens when you include the trailing slash, or leave it out. Try also adding, editing and deleting files in the source. Run the command and check that all your changes are indeed mirrored in the destination folder.
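The experiment suggested above might look something like the following sketch. The temp and temp2 names come from the text; the scratch location, the file a.txt and the extra destination temp3 are assumptions added purely for comparison.

```shell
#!/bin/bash
# Scratch area; temp and temp2 follow the names suggested above.
work=$(mktemp -d)
mkdir -p "$work/temp"
echo "hello" > "$work/temp/a.txt"

# With the trailing slash: the contents of temp land directly in temp2.
rsync -a "$work/temp/" "$work/temp2"

# Without the trailing slash: temp itself is created inside temp3.
# (temp3 is a second, hypothetical destination used only for comparison.)
rsync -a "$work/temp" "$work/temp3"

# List the resulting files to compare the two layouts.
find "$work/temp2" "$work/temp3" -type f

# Delete a file from the source and mirror again; --delete removes it
# from the destination as well.
rm "$work/temp/a.txt"
rsync -a --delete "$work/temp/" "$work/temp2"
```

Comparing the two paths listed by find shows the folder-within-a-folder effect of omitting the slash.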

Given the directory path that I wish to be specified, the command line is rather long. And, perhaps I don't want to be bothered re-typing such a long command every time I wish to run the backup process. We should recall that previously-used commands are stored in a buffer, and can be accessed by scrolling through the command history (using the up-arrow). So, that's one way to reuse the command line. Another is to store the command in a special kind of file and run the command by typing the name of the file. Does anyone remember batch files in DOS? Welcome to the Linux world of shell scripting!

In particular, we are going to create a script file (.sh) that we will run in bash (the Bourne Again Shell). While a simple script will accomplish our desired task, it's worth noting that bash scripting is a very powerful programming technique. Because of this, many books and on-line tutorials are available to show you how to make effective use of the system.

But, for our present purposes, we need only limited knowledge of bash scripts. In fact, our script consists of just three lines of code which we type into gedit (Applications – Accessories – Text editor) and save as the file mirror_cmrsc18.sh in our home folder.

#!/bin/bash
echo "Mirror cmrsc_18 to compact flash memory card"
rsync -av --delete /mnt/windows_data/carsp/cmrsc_18/ /media/disk/cmrsc_18

The first line indicates that the file is a shell script and identifies the location of bash as being in the /bin folder. (You can check this location on your system by issuing the command “which bash”.)

The second line (quite wordy by Unix programming “standards”) serves to document the purpose of the script by displaying a message on the display screen prior to activation of rsync.

The third line is merely the command line that we would have typed manually in a terminal window to run our selected backup process.

Note that the file name chosen for the script (mirror_cmrsc18.sh) is also rather long. Perhaps a shorter name would be more desirable. It would certainly be easier to type. But, in this case, it does tell us (well, it tells me!) what the script will do.

Now that we have our script, we need to know how to run it. Firstly, in a terminal window, we navigate to the home folder and issue the command chmod +x mirror_cmrsc18.sh to give ourselves permission to execute the file. Alternatively, pull up the Nautilus file manager (Places), right click on the file mirror_cmrsc18.sh, select Properties – Permissions, and check the box marked Execute – Allow executing file as a program. Note that if you try to run the script file without taking this step, bash will return the error message: Permission denied.

Now, we simply run the script file with the command ./mirror_cmrsc18.sh. Note the use of a dot and a slash (./) in front of the script file's name. The dot indicates that we wish to use the current directory, and the slash is a separator between the directory name and the filename. The requirement for this seemingly obscure format is a security feature in Linux. The current directory is not automatically on the path and so we must specify that the script file is to be run from the current directory by prefacing the file name with “./”. Note also that it is necessary to include the .sh file extension in the command.
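The permission and path behaviour described above can be tried safely with a throwaway script. In this sketch, hello.sh and the scratch directory are hypothetical stand-ins for the real backup script and the home folder.

```shell
#!/bin/bash
# Hypothetical stand-in script (hello.sh) in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
printf '#!/bin/bash\necho "script ran"\n' > hello.sh

ls -l hello.sh      # a newly created file normally has no execute (x) bits
./hello.sh          # so this attempt is normally refused: Permission denied
chmod +x hello.sh   # grant execute permission
./hello.sh          # the script now runs from the current directory
hello.sh            # without ./ this normally fails, unless "." is on the PATH
```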

But, that's it. The rsync program runs, transfers files between the two disks, and lists the files processed. Our files are safe. We have backup!

Now, let's make our bash script just a little more sophisticated by checking that the memory card is available before we try to run rsync. A few extra commands inserted into the script file will do the trick:

#!/bin/bash
echo "Mirror cmrsc_18 to compact flash memory card"
# Check that the memory card is available
if [ -d /media/disk/cmrsc_18 ]; then
   rsync -av --delete /mnt/windows_data/carsp/cmrsc_18/ /media/disk/cmrsc_18
else
   echo "Insert memory card and try again"
fi

The third line commences with the # symbol that defines this line as a comment. The workhorse statements are the “if-then-else” sequence in the last five lines. These either run rsync, or display a warning message and let the script finish gracefully. The test
-d /media/disk/cmrsc_18
checks if the directory is present on the removable drive. If it is, we can proceed to back up our files. If not, we need to insert the flash memory card. Note that, because the test looks for the cmrsc_18 folder itself, it assumes that this folder already exists on the card from an earlier backup.
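One possible variation, offered only as a sketch, is to test for the card's mount directory rather than the backup folder, so that a first backup to a fresh card is not refused. This assumes that /media/disk exists only while the card is inserted, which may not hold on every system. The function name mirror_to_card and its three arguments are hypothetical.

```shell
#!/bin/bash
# Sketch of a reusable check. The function name and arguments are
# hypothetical; the paths in the example call are the ones used above.
mirror_to_card() {
    local src=$1 card=$2 name=$3
    # Test the card's mount directory rather than the backup folder,
    # so that the very first backup to a fresh card is not refused.
    # Assumption: the mount directory exists only while the card is inserted.
    if [ -d "$card" ]; then
        rsync -av --delete "$src" "$card/$name"
    else
        echo "Insert memory card and try again" >&2
        return 1
    fi
}

mirror_to_card /mnt/windows_data/carsp/cmrsc_18/ /media/disk cmrsc_18 \
    || echo "Backup not performed"
```

Returning a non-zero status when the card is missing lets a calling script, or the shell's && and || operators, react to the failure.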

So, if you yearn for the days of DOS batch files, bash scripts in Linux can bring back a whole world of enjoyment (and/or frustration!). As noted previously, bash scripting is a very powerful tool. Check out the wide variety of available commands, and the multitude of programming techniques, that can be used for all sorts of different purposes. Lots of on-line assistance is available to you, including pre-built scripts that you can readily customize for your specific applications.


Bottom Line:

rsync (Open Source)
Andrew Tridgell and Paul Mackerras
http://en.wikipedia.org/wiki/Rsync

Introduction to Bash Scripting
http://www.linuxconfig.org/Bash_scripting_Tutorial



Copyright and Usage
Ottawa Personal Computer Users' Group (OPCUG), Inc.
3 Thatcher Street, Ottawa, ON  K2G 1S6

The opinions expressed in these reviews do not necessarily
represent the views of the OPCUG or its members.
