Verily I say unto thee, that Frank Cox spake thusly:
> On Sat, 21 Apr 2007 22:31:51 +0100
> "Keith G. Robertson-Turner" <fedora-gmane.00003@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> Is there any command I can use to extract the text from these PDF
>> documents in a batch? I have a couple of thousand documents that need
>> converting.
>
> man pdftohtml

~]$ man pdftohtml
No manual entry for pdftohtml

~]$ sudo yum install pdf2html
Nothing to do

I downloaded the tarball from SourceForge, but the build fails with:

make[1]: Entering directory `/home/kgr/Desktop/pdftohtml-0.39/src'
g++ -g -O2 -DHAVE_CONFIG_H -DHAVE_DIRENT_H=1 -I.. -DHAVE_REWINDDIR=1 -DHAVE_POPEN=1 -I.. -I../goo -I../xpdf -I../fofi -I../splash -I -I/usr/X11R6/include -c HtmlOutputDev.cc
HtmlLinks.h:22: error: extra qualification ‘HtmlLink::’ on member ‘isEqualDest’
make[1]: *** [HtmlOutputDev.o] Error 1
make[1]: Leaving directory `/home/kgr/Desktop/pdftohtml-0.39/src'
make: *** [all] Error 2

Anything else I could try?

--
K.
http://slated.org

.----
| I found [Vista] to be a dangerously unstable operating system,
| which has caused me to lose data ... unfortunately this product
| is unfit for any user. - [H]ardOCP, <http://tinyurl.com/3bpfs2>
`----

Fedora Core release 5 (Bordeaux) on sky, running kernel 2.6.20-1.2312.fc5
 00:23:45 up 4 days, 21:55, 3 users, load average: 0.69, 1.86, 1.32