Sunday, December 30, 2012

A shell recipe for backups with logs and history

I wrote a shell script for a cron job that grabs backups of some remote files. It has a few nice features:

  • Output from the backup commands is logged, with timestamps.
  • cron will send me email if one of the commands fails.
  • The history of each backup is saved in Git. Nothing sucks more than corrupting an important file and then syncing that corruption to your one and only backup.

Here's how it works.

#!/bin/bash -e

cd /home/keegan/backups
log="$(pwd)"/log

exec 3>&2 > >(ts >> "$log") 2>&1

You may have seen exec used to tail-call a command, but here we use it differently. When no command is given, exec applies file redirections to the current shell process.
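
As a minimal sketch of this second use of exec, the snippet below redirects the standard output of a subshell partway through its run. The file name demo_log is a throwaway created for the illustration and is not part of the backup script.

```shell
# Sketch: exec with no command changes the redirections of the current shell.
# demo_log is a hypothetical temporary file used only for this illustration.
demo_log="$(mktemp)"

(
    # From here on, everything this subshell writes to stdout goes to the file.
    exec > "$demo_log"
    echo 'first line'
    echo 'second line'
)

# The parent shell's stdout was never touched, so cat still prints normally.
cat "$demo_log"
```

The subshell is only there to keep the redirection from leaking into the rest of the example; the backup script applies it to the whole script on purpose.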

We apply timestamps by redirecting output through ts (from moreutils), and append that to the log file. I would write exec | ts >> $log, except that pipe syntax is not supported with exec.

Instead we use process substitution. >(cmd) expands to the name of a file, whose contents will be sent to the specified command. This file name is a fine target for normal file output redirection with >. (It might name a temporary named pipe created by the shell, or a special file under /dev/fd/.)
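
To see process substitution on its own, here is a small sketch. It uses tr as a stand-in for ts so that nothing beyond the standard utilities is needed, and the variable names are made up for the example. The exact file name printed depends on the system.

```shell
# Sketch: >(cmd) expands to a file name, typically under /dev/fd/ on Linux.
fd_name="$(echo >(true))"
echo "process substitution expanded to: $fd_name"

# Redirecting into that name feeds the command's standard input.
# The command substitution waits until tr has finished writing.
shouted="$(echo 'hello' > >(tr 'a-z' 'A-Z'))"
echo "$shouted"
```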

We also redirect standard error to the same place with 2>&1. But first we open the original standard error as file descriptor 3, using 3>&2.
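
The order of the redirections matters, since they are applied left to right. The sketch below, assuming a throwaway log file from mktemp, captures whatever reaches the "original" standard error so the effect is visible.

```shell
# Sketch: keep a copy of the original stderr on fd 3 before redirecting.
# capture_log is a hypothetical temporary file for this illustration.
capture_log="$(mktemp)"

original_stderr="$(
    {
        # fd 3 becomes a copy of stderr first; only then are fds 1 and 2
        # pointed at the log file.
        exec 3>&2 > "$capture_log" 2>&1
        echo 'to stdout'        # lands in the log
        echo 'to stderr' >&2    # also lands in the log
        echo 'to fd 3' >&3      # reaches the original stderr
    } 2>&1
)"

echo "log file:        $(tr '\n' ';' < "$capture_log")"
echo "original stderr: $original_stderr"
```

Had 3>&2 come after 2>&1, file descriptor 3 would point at the log as well, and there would be no way left to reach cron.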

function handle_error {
    echo 'Error occurred while running backup' >&3
    tail "$log" >&3
    exit 1
}
trap handle_error ERR

Since we specified bash -e in the first line of the script, Bash will exit as soon as any command fails. We use trap to register a function that gets called when this happens. The function writes a message and the tail of the log file to the script's original standard error, which we saved as file descriptor 3. cron will capture that and send mail to the system administrator.
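
The interaction between -e and the ERR trap can be tried in isolation. This sketch runs a tiny script in a child Bash; the handler name on_error and the variables are invented for the example.

```shell
# Sketch: with -e, the ERR trap runs before the shell exits on a failure.
trap_output="$(
    bash -e -c '
        on_error() { echo "handler ran"; exit 1; }
        trap on_error ERR
        echo "before"
        false
        echo "never reached"
    '
)" || trap_status=$?

echo "$trap_output"
echo "exit status: $trap_status"
```

The child prints "before", then false fails, the handler runs, and the last echo is never executed.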

Now we come to the actual backup commands.

cd foo
git pull

cd ../bar
rsync -v otherhost:bar/baz .
git commit --allow-empty -a -m '[AUTO] backup'
git repack -da

foo is a backup of a Git repo, so we just update a clone of that repo. If you want to be absolutely sure to preserve all commits, you can configure the backup repo to disable automatic garbage collection and keep infinite reflog.
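
One way to set that up is with Git's gc.* configuration keys. The helper below is only a sketch: the function name is made up, and it assumes a Git new enough to support the -C option.

```shell
# Sketch: make a clone hold on to everything it has ever fetched.
# preserve_everything is a hypothetical helper name; $1 is the repo path.
preserve_everything() {
    git -C "$1" config gc.auto 0                         # no automatic gc
    git -C "$1" config gc.reflogExpire never             # keep reflog entries
    git -C "$1" config gc.reflogExpireUnreachable never  # even unreachable ones
}

# Usage (path taken from the script above):
#   preserve_everything /home/keegan/backups/foo
```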

bar is a local-only Git repo storing history of a file synced from another machine. Semantically, Git stores each version of a file as a separate blob object. If the files you're backing up are reasonably large, this can waste a lot of space quickly. But Git supports "packed" storage, where the objects in a repo are compressed together. By repacking the repo after every commit, we can save a ton of space.
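
To check how much space the packed repo actually takes, git count-objects -v reports the pack size in KiB. The wrapper below is a sketch with an invented name, again assuming git -C is available.

```shell
# Sketch: report the size of a repository's packfiles in KiB.
# pack_size_kib is a hypothetical helper name, not a Git command.
pack_size_kib() {
    git -C "$1" count-objects -v | awk '/^size-pack:/ { print $2 }'
}

# Usage (path taken from the script above):
#   pack_size_kib /home/keegan/backups/bar
```

Comparing this number with the size of the synced file times the number of commits gives a rough idea of what repacking saves.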

Monday, December 17, 2012

Hex-editing Linux kernel modules to support new hardware

This is an old trick but a fun one. The ThinkPad X1 Carbon has no built-in Ethernet port. Instead it comes with a USB to Ethernet adapter. The adapter uses the ASIX AX88772 chip, which Linux has supported since time immemorial. But support for the particular adapter shipped by Lenovo was only added in Linux 3.7.

This was a problem for me, since I wanted to use a Debian installer with a 3.2 kernel. I could set up a build environment for that particular kernel and recompile the module. But this seemed like an annoying yak to shave when I just wanted to get the machine working.

The patch to support the Lenovo adapter just adds a new USB device ID to an existing driver:

 }, {
+       // Lenovo U2L100P 10/100
+       USB_DEVICE (0x17ef, 0x7203),
+       .driver_info = (unsigned long) &ax88772_info,
+}, {
        // ASIX AX88772B 10/100
        USB_DEVICE (0x0b95, 0x772b),
        .driver_info = (unsigned long) &ax88772_info,

As a quick-and-dirty solution, we can edit the compiled kernel module asix.ko, changing that existing device ID (0x0b95, 0x772b) to the Lenovo one (0x17ef, 0x7203). Since x86 CPUs are little-endian, this involves changing the bytes

95 0b 2b 77

to

ef 17 03 72
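
The byte order is easy to get backwards, so here is a small sketch that derives both sequences from the IDs in the patch. The helper name le16 is made up for the example.

```shell
# Sketch: print a 16-bit value as two hex bytes, low byte first.
le16() {
    printf '%02x %02x' $(( $1 & 0xff )) $(( ($1 >> 8) & 0xff ))
}

# Vendor ID followed by product ID, as in the USB_DEVICE entries above.
old_bytes="$(le16 0x0b95) $(le16 0x772b)"
new_bytes="$(le16 0x17ef) $(le16 0x7203)"
echo "$old_bytes -> $new_bytes"   # prints: 95 0b 2b 77 -> ef 17 03 72
```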

I wanted to do this within the Debian installer without rebooting. Busybox sed does not support hex escapes, but printf does:

sed "$(printf 's/\x95\x0b\x2b\x77/\xef\x17\x03\x72/')" \
    /lib/modules/$(uname -r)/kernel/drivers/net/usb/asix.ko \
    > /tmp/asix.ko

(It's worth checking that none of those bytes have untoward meanings as ASCII characters in a regular expression. As it happens, sed does not recognize + (aka \x2b) as repetition unless preceded by a backslash.)
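
It is also worth confirming that the edit changed exactly the four bytes intended before loading anything. A sketch using cmp -l, which lists differing bytes one per line, follows. The helper name and the stand-in files are invented for the example, and the printf escapes are written in octal only to keep the snippet portable.

```shell
# Sketch: count how many bytes differ between two files of equal length.
# count_changed_bytes is a hypothetical helper name.
count_changed_bytes() {
    echo $(( $(cmp -l "$1" "$2" | wc -l) ))
}

# Stand-in data for illustration only. Real use would compare the stock
# asix.ko against /tmp/asix.ko.
orig_demo="$(mktemp)"
patched_demo="$(mktemp)"
printf 'AAAA\225\013\053\167ZZZZ' > "$orig_demo"     # 95 0b 2b 77
printf 'AAAA\357\027\003\162ZZZZ' > "$patched_demo"  # ef 17 03 72

changed="$(count_changed_bytes "$orig_demo" "$patched_demo")"
echo "bytes changed: $changed"
```

A result of 0 would mean the pattern never matched, and anything other than 4 would mean the substitution hit more or less than one device ID.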

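Then I loaded the patched module along with its dependencies. A simple way is to let modprobe load the stock module, which pulls in everything it depends on, then remove only that module and insert the patched copy in its place: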

modprobe asix
rmmod asix
insmod /tmp/asix.ko

And that was enough for me to complete the install over Ethernet. Of course, once everything is set up, it would be better to compile a properly-patched kernel using make-kpkg. I haven't got around to it yet because wireless is working great. :)