XPDF Library

This is a library I extracted from the xpdf sources, version 4.03. The package doesn't contain xpdf itself, just a makefile you can use to create a binary tar file. Usage: running 'make test' builds pdftopng from the xpdf sources against the new library.

When linking with the library, you need to provide some extra libs used by XPDF.

I could make this into a single-file project if I really wanted to. That would be fun. libpoppler does the same job as this library, but it sucks much more. Specifically, it hides the whole xpdf part from the user and brings in GLIB/CAIRO/QT5. You end up with unnecessary dependencies and missing functionality.

An Example Project

This is a tool to extract the table of contents from a PDF file. You can download the package here. I'm going to present only the parts that are likely to be common to all programs using the library.

There is no global include file which contains everything. The following are the absolute minimum to include:

#include <parseargs.h>
#include <GList.h>
#include <GlobalParams.h>
#include <PDFDoc.h>
#include <config.h>
#include <TextString.h>
parseargs.h parses command line arguments. Since it's available, why not just use it? GlobalParams.h declares an extern variable called globalParams. The user must initialize this object. It controls parameters global to all PDF files used in this invocation of the library. By default it is initialized from .xpdfrc, but a command line switch (-cfg in the list below) can point it at a different configuration file.

Here is a sample list of options:

static int firstPage = 1;
static int lastPage = 0;
static char ownerPassword[33] = "";
static char userPassword[33] = "";
static GBool quiet = gFalse;
static char cfgFileName[256] = "";
static GBool printVersion = gFalse;
static GBool printHelp = gFalse;
static GBool itextFormat = gFalse;

static ArgDesc argDesc[] =
{
  { "-q", argFlag, &quiet, 0,
    "don't print any messages or errors" },
  { "-opw", argString, ownerPassword, sizeof(ownerPassword),
    "owner password (for encrypted files)" },
  { "-upw", argString, userPassword, sizeof(userPassword),
    "user password (for encrypted files)" },
  { "-cfg", argString, cfgFileName, sizeof(cfgFileName),
    "configuration file to be used instead of .xpdfrc" },
  { "-f", argInt, &firstPage, 0, "first page to process" },
  { "-l", argInt, &lastPage, 0, "last page to process" },
  { "-h", argFlag, &printHelp, 0, "print usage information" },
  { "-help", argFlag, &printHelp, 0, "print usage information" },
  { "--help", argFlag, &printHelp, 0, "print usage information" },
  { "-v", argFlag, &printVersion, 0, "print version info" },
  { "-i", argFlag, &itextFormat, 0, "use itext output format" },
  { NULL }
};
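To show what a table like argDesc buys you, here is a minimal sketch of a table-driven parser. It is not xpdf's parseargs implementation: SketchArgDesc, SketchArgKind and sketchParseArgs are names I made up for illustration, and the behaviour is simplified (no "--" handling, no usage printing). The idea it shows is the same, though. Each row ties a switch to a variable, recognized switches are removed from argv, and whatever is left over is positional.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Simplified stand-ins for illustration; these are NOT xpdf's parseargs types.
enum SketchArgKind { sketchFlag, sketchInt, sketchString };

struct SketchArgDesc {
  const char *name;    // switch text, e.g. "-f"; NULL terminates the table
  SketchArgKind kind;  // how to interpret the switch
  void *target;        // variable that receives the value
  int size;            // buffer size for strings, otherwise unused
};

// Removes recognized switches from argv (compacting it in place) and
// returns false if a value-taking switch is missing its value.
bool sketchParseArgs(const SketchArgDesc *table, int *argc, char **argv) {
  assert(table && argc && argv);
  int out = 1;  // next free slot for a positional argument
  bool ok = true;
  for (int i = 1; i < *argc; ++i) {
    const SketchArgDesc *d = table;
    while (d->name && std::strcmp(d->name, argv[i]) != 0) ++d;
    if (!d->name) {  // not a known switch: keep it as positional
      argv[out++] = argv[i];
      continue;
    }
    if (d->kind == sketchFlag) {
      *static_cast<bool *>(d->target) = true;
      continue;
    }
    if (i + 1 >= *argc) {  // value-taking switch at the end of the line
      ok = false;
      break;
    }
    const char *value = argv[++i];
    if (d->kind == sketchInt) {
      *static_cast<int *>(d->target) = std::atoi(value);
    } else if (d->size > 0) {
      // Copy at most size-1 characters and always terminate the buffer.
      char *buf = static_cast<char *>(d->target);
      std::strncpy(buf, value, d->size - 1);
      buf[d->size - 1] = '\0';
    }
  }
  *argc = out;
  return ok;
}
```

After a successful call, argc counts only the program name plus the positional arguments. That is why a main function built on this kind of parser can simply check argc!=2 to insist on exactly one file name.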
Here is a typical main function. All tools distributed with xpdf (pdftopng etc.) follow pretty much the same recipe:
int main(int argc,char **argv)
{
  GBool ok;
  int exitCode;
  char *fileName;
  PDFDoc *doc;
  GString *ownerPW, *userPW;

  exitCode= 99;
  fixCommandLine(&argc, &argv);  
  ok= parseArgs(argDesc, &argc, argv);
fixCommandLine doesn't do anything if the platform isn't Windows. On Windows, as far as I can tell, it fetches the Unicode command line and converts the arguments to UTF-8.
  if (!ok || argc!=2 || printHelp || printVersion)
  {
    fprintf(stderr,"pdfTOC version %s based on xpdf version %s\n", 
         selfVersion, xpdfVersion);
    fprintf(stderr,"XPDF Copyright Info:\n%s\n", xpdfCopyright);
    fprintf(stderr,"pdfTOC Copyright Info:\n%s\n", selfCopyright);
    if (!printVersion)
       printUsage("pdfTOC", "<PDF-File>", argDesc);
    goto err0;
  }
  fileName= argv[1];
xpdfVersion and xpdfCopyright are macros. It's a good idea to print those. selfVersion and selfCopyright are the tool's own strings (not shown here). OK, now on to the initialization:
  globalParams= new GlobalParams(cfgFileName);
  globalParams->setupBaseFonts(NULL);
  if (quiet) globalParams->setErrQuiet(quiet);
We didn't declare globalParams anywhere; the library declares and defines it.

Opening a file is pretty straightforward. Beyond globalParams, there is no library initialization or anything. It's just a little bit of work to haul around passwords.

  ownerPW= ownerPassword[0] ? new GString(ownerPassword) : NULL;
  userPW= userPassword[0] ? new GString(userPassword) : NULL;
  doc= new PDFDoc(fileName, ownerPW, userPW);
  if (ownerPW) delete ownerPW;
  if (userPW) delete userPW;
  exitCode= 1;
  if (!doc->isOk()) goto err1;
Programs in the distribution use the Linux-kernel style of error exits: goto labels at the end of the function that clean up in reverse order of setup. I think it's good. Now, the doc object is ready for use.
  // do your thing with doc here

  exitCode= 0;
err1:
  delete doc;
  delete globalParams;

err0:
  Object::memCheck(stderr); 
  gMemReport(stderr);
  return exitCode;
}
The last part about memory checks isn't mandatory, but it's present in all xpdf programs. So I leave it in, just in case we have messed something up.
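If the goto-based error exits in main look unfamiliar, here is the same pattern in isolation. This is only a sketch: sumDigits is a made-up function that has nothing to do with xpdf, and the heap buffer stands in for any resource (like doc above) that has to be released on every path once it has been acquired.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical example function, not part of xpdf.
// Adds up the decimal digits in text. Returns 0 on success, 1 if the
// working buffer cannot be allocated, 2 if text contains a non-digit.
int sumDigits(const char *text, int *sum) {
  // Everything is declared up front without initializers, so no goto
  // jumps over an initialization (C++ rejects that).
  int exitCode;
  char *copy;
  std::size_t len, i;
  int total;

  assert(text && sum);

  exitCode = 1;
  len = std::strlen(text);
  copy = static_cast<char *>(std::malloc(len + 1));  // the "resource"
  if (!copy) goto err0;  // nothing acquired yet, nothing to release
  std::memcpy(copy, text, len + 1);

  exitCode = 2;
  total = 0;
  for (i = 0; i < len; ++i) {
    if (!std::isdigit(static_cast<unsigned char>(copy[i]))) goto err1;
    total += copy[i] - '0';
  }
  *sum = total;

  exitCode = 0;
err1:
  std::free(copy);  // reached on success and on the late failure
err0:
  return exitCode;
}
```

The exit code is set before each risky step, so a jump to a label already carries the right code. The labels appear in reverse order of acquisition, which means a later failure falls through all the cleanup an earlier failure would have skipped.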

Output Devices

Output devices are subclasses of the OutputDev class defined in OutputDev.h. The base class declares a bunch of virtual methods for graphics operations, and each device overrides the ones it cares about. The most useful one is SplashOutputDev. This device does all the painting and outputs an image.

There are other output devices designed for other purposes. For instance, there is one output device which doesn't do any graphics at all, but writes the images embedded in a PDF to disk.
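To make the idea concrete, here is a toy version of the pattern. None of these names come from xpdf: SketchOutputDev, ImageListDev and renderSamplePage are stand-ins I invented, and the real OutputDev declares many more virtual methods with different signatures. The sketch only shows the shape: a base class with do-nothing virtual hooks, a device that overrides just the hooks it needs, and an interpreter that calls the hooks as it walks a page.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified stand-in for an OutputDev-style base
// class. Every hook defaults to doing nothing.
class SketchOutputDev {
public:
  virtual ~SketchOutputDev() {}
  virtual void startPage(int /*pageNum*/) {}
  virtual void strokePath() {}
  virtual void fillPath() {}
  virtual void drawImage(int /*width*/, int /*height*/) {}
};

// A device that ignores all painting and only records the images it is
// handed, in the spirit of the image-saving device mentioned above.
class ImageListDev : public SketchOutputDev {
public:
  struct Entry { int page, width, height; };

  void startPage(int pageNum) override { page_ = pageNum; }

  void drawImage(int width, int height) override {
    assert(width > 0 && height > 0);
    images.push_back(Entry{page_, width, height});
  }

  std::vector<Entry> images;  // one entry per drawImage call

private:
  int page_ = 0;  // page currently being interpreted
};

// Stand-in for the interpreter: it walks a (made-up) page and calls back
// into whatever device it was given.
void renderSamplePage(SketchOutputDev &dev, int pageNum) {
  dev.startPage(pageNum);
  dev.fillPath();
  dev.drawImage(640, 480);
  dev.strokePath();
  dev.drawImage(32, 32);
}
```

Passing an ImageListDev to renderSamplePage leaves two entries in images and silently drops the path operations. A painting device would override the path hooks instead, without the interpreter having to change.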