10 February 2007

Checksum verification of large downloads

When you download software, the vendor often provides a checksum or a digital signature. If you download the software and then compute the checksum (or verify the signature), you're reading through the download twice: once as it arrives and is written to disk, and again when it is read back for verification. If the download is large (like a Linux kernel source archive or an ISO image), that second pass can take a long time. Here's a way to do both at once.

If the vendor provides an MD5 checksum, try this:

wget -O - http://www.example.com/large_file.tar.bz2 |\
tee huge.tar.bz2 | md5sum

The -O - option tells wget to write the download to standard output, rather than to a file. Piping that to tee writes the download to a local file (huge.tar.bz2) and also passes it along on standard output, which is piped to md5sum. The checksum is printed to the screen, where you can compare it with the value the vendor published.
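If you'd rather not compare the two checksums by eye, the comparison can be scripted. The following is only a sketch: save_and_check_md5 is a hypothetical helper name, not part of wget or md5sum, and the expected checksum is whatever value the vendor published.

```shell
# Hypothetical helper (not a standard command): save standard input to a
# file and compare its MD5 checksum with an expected value, in one pass.
#   $1 = file to write, $2 = expected MD5 checksum (hex)
save_and_check_md5() {
    outfile=$1
    expected=$2
    # md5sum prints "<checksum>  -" for standard input; keep the first field.
    actual=$(tee "$outfile" | md5sum | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $outfile"
    else
        echo "MISMATCH: $outfile (got $actual, expected $expected)" >&2
        return 1
    fi
}

# Usage (placeholder URL; substitute the vendor's checksum for the last argument):
# wget -O - http://www.example.com/large_file.tar.bz2 |
#     save_and_check_md5 huge.tar.bz2 0123456789abcdef0123456789abcdef
```

The function returns a non-zero status on a mismatch, so it can be used in a script with && or if.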

You can do the same trick for an SHA-1 checksum (or any other digest supported by openssl):

wget -O - http://www.example.com/large_file.tar.bz2 |\
tee huge.tar.bz2 | openssl dgst -sha1
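One wrinkle: when openssl dgst reads from standard input, it prefixes the digest with a label such as "(stdin)= ", and the exact form of that label varies between openssl versions. If you want just the hex digest (to paste into a comparison, say), take the last field of the output. A sketch, with save_and_digest as a hypothetical helper name:

```shell
# Hypothetical helper (not a standard command): save standard input to a
# file and print only the hex digest.
#   $1 = file to write
#   $2 = digest name understood by "openssl dgst" (sha1, sha256, ...)
save_and_digest() {
    # openssl prints something like "(stdin)= <digest>"; the digest is
    # always the last whitespace-separated field.
    tee "$1" | openssl dgst "-$2" | awk '{ print $NF }'
}

# Usage (placeholder URL):
# wget -O - http://www.example.com/large_file.tar.bz2 |
#     save_and_digest huge.tar.bz2 sha1
```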

If the vendor provides a detached signature, you can do a similar trick. As an example, let's use the bzip2-compressed 2.6.0 patch file for the Linux kernel and the corresponding signature file. First grab the signature file, then the patch file:

wget http://www.kernel.org/pub/linux/kernel/v2.6/patch-2.6.0.bz2.sign

wget -O - http://www.kernel.org/pub/linux/kernel/v2.6/patch-2.6.0.bz2 |\
tee patch-2.6.0.bz2 |\
gpg --keyserver pgp.mit.edu \
--keyserver-options auto-key-retrieve \
--verify patch-2.6.0.bz2.sign -

In this case, you're piping the download into gpg, telling it to verify the data coming in on standard input (the '-' at the end) against the detached signature file. The --keyserver and --keyserver-options items tell gpg to fetch and import the key if necessary (this example uses pgp.mit.edu as the keyserver, but there are lots: type 'keyserver' into a search engine).
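The exit status of a pipeline is that of its last command, and gpg --verify exits with a non-zero status when the signature doesn't check out, so a script can act on the result. Here is a sketch that throws the file away when verification fails. save_if_verified is a hypothetical helper name, and it assumes the verifier command reads all of its standard input, as gpg does here.

```shell
# Hypothetical helper (not a standard command): save standard input to a
# file, feed the same bytes to a verifier command, and delete the file if
# the verifier reports failure.
#   $1 = file to write; remaining arguments = verifier command line
save_if_verified() {
    outfile=$1
    shift
    # The pipeline's exit status is the verifier's exit status.
    if tee "$outfile" | "$@"; then
        echo "verified: $outfile"
    else
        rm -f "$outfile"
        echo "verification failed, removed $outfile" >&2
        return 1
    fi
}

# Usage with the kernel patch example (signature file already downloaded):
# wget -O - http://www.kernel.org/pub/linux/kernel/v2.6/patch-2.6.0.bz2 |
#     save_if_verified patch-2.6.0.bz2 gpg --verify patch-2.6.0.bz2.sign -
```

Deleting the file on failure is a design choice; you may prefer to keep it around for inspection and just report the error.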
