16 June 2007

xpdf

Yesterday at work I got stuck retyping a handful of PDF files. They were nothing too fancy: black text in a garden-variety font on a white background, with a logo (image) in the upper-right corner. Nothing I couldn't do in a word processor, but I didn't want to retype all that junk (I have no idea what became of the original documents--I just had the PDFs).

xpdf was a big help to me. xpdf is an open-source PDF viewer, and it comes with several command-line programs. Yesterday I was able to use two of these programs to help me recreate the documents: pdfimages extracted the logo image (in ppm format) from one of the PDFs, and pdftotext converted each PDF file to text. So I was able to use oowriter (the word processor of openoffice.org) to create the new documents, and I could just copy-and-paste from the text files generated by pdftotext. Still tedious and annoying, but better than typing from scratch (less error-prone, too).
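
For anyone who wants to try the same thing, here is a minimal sketch of those two steps as a shell function. It is not necessarily the exact set of commands used that day: the extract_pdf name and the output file names are made up for illustration, the -layout option is an optional extra, and it assumes pdftotext and pdfimages are already on the PATH.

```shell
#!/bin/sh
# Sketch only: extract_pdf is a hypothetical helper, not part of xpdf.
# Assumes pdftotext and pdfimages are installed and on the PATH.
extract_pdf() {
    pdf=$1
    # Strip a trailing .pdf to get a base name for the output files.
    base=${pdf%.pdf}
    # Dump the text; -layout tries to keep the original page layout.
    pdftotext -layout "$pdf" "$base.txt" || return 1
    # Extract embedded images; "$base" is the root of the image file names.
    pdfimages "$pdf" "$base"
}
```

Calling extract_pdf report.pdf (a hypothetical file) should leave report.txt plus one image file per embedded image in the same directory, ready for copy-and-paste into a word processor.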

CentOS 5 doesn't ship xpdf. This experience showed me how much I depend on that package (I use xpdf to view PDFs all the time). So I spent some time this morning building xpdf on CentOS 5 from source, and I came up with an SRPM. I sent it to the CentOS Extras site--maybe they'll add it to their package list. If you want the spec file, leave a comment.

8 comments:

Anonymous said...

Hi there. I'm stuck at work with a vanilla (although complete) install of centos 5/x86_64. I shunned rpm-ish distros a long time ago and use debian these days, and though I'm not new to compiling my own stuff, xpdf is a bitch when it comes to compile-time library dependencies (I don't even know which src packages to download apart from openmotif). I also don't have local root, so yum installing is not an option.

It'd be awesome if you could post a link to the package you built (binary/src) or give me instructions for how to do it. Just a list of packages I need to download and build (and the configure-time commands) would be fine too. My attempts at building xpdf always end with the configure step complaining that it'll build everything but xpdf itself (i.e., only pdftotext, etc.)!

thanks in advance,
marq

mbrisby said...

Sorry for taking so long to approve this comment and to reply (I was out of town for a couple of days).

Since you don't have root and since you're on x86_64, an RPM won't help much. But the error message you mention sounds hauntingly familiar. As I look at the spec file, it looks like the magic trick is the following configure-time option:

./configure --with-freetype2-includes=/usr/include/freetype2
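
Since root isn't available, one way to use that option is to install under your home directory. This is only a sketch: the build_xpdf name and the $HOME/xpdf prefix are made up for illustration, --prefix is the usual autoconf option rather than anything taken from the spec file, and it assumes the FreeType headers really are in /usr/include/freetype2.

```shell
#!/bin/sh
# Sketch only: build_xpdf is a hypothetical helper name.
# Run it from the top of the unpacked xpdf source tree.
build_xpdf() {
    # Install prefix; defaults to a directory under $HOME (an assumption),
    # so no root access is needed for "make install".
    prefix=${1:-$HOME/xpdf}
    ./configure --prefix="$prefix" \
        --with-freetype2-includes=/usr/include/freetype2 || return 1
    make || return 1
    make install
}
```

After a successful build_xpdf, the binaries should end up under the bin directory of the chosen prefix, which you can add to your PATH.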

Hope that'll be helpful.

Anonymous said...

Hi, could you post your .spec file somewhere? It would help me a lot... thanks

mbrisby said...

To the reader who asked for the spec file a couple of days ago: a subsequent post, in which I discuss xpdf's replacement by the poppler package, may save you some trouble:

http://mbrisby.blogspot.com/2007/06/centos-5-follow-up-ii.html

And you can also download binary packages from the xpdf site:

http://foolabs.com/xpdf/download.html

Failing those, I've posted my spec file at the following URL:

http://mbrisby.org/software/rpms/xpdf.spec.txt

mbrisby said...

BTW, the patch file referenced in my spec file is available at the tug.org mirror listed on the xpdf downloads page:

ftp://tug.org/xpdf/

Unknown said...

xpdf is gone from CentOS 5. Install poppler:
yum install poppler poppler-utils
as root.

Poppler, a PDF rendering library, is a fork of the xpdf PDF...

Unknown said...

You might be interested in http://computingfunnyfacts.blogspot.com/2008/11/xpdf-in-centos.html

mbrisby said...

@Ribalba: awesome, thanks.