Re: Convert PDF to Text?

On Sunday 2007-04-22 00:31:51 Keith G. Robertson-Turner wrote:
> I have some PDF documents that are photocopied text documents (embedded
> image, rather than text glyphs). When I open these with Evince, I am
> able to copy and paste the actual text. At first I thought this was some
> kind of OCR process, but then I realised it's actually the document
> itself, which has the original text embedded in it (OCRed and embedded
> during the original scan).
>
> Is there any command I can use to extract the text from these PDF
> documents in a batch? I have a couple of thousand documents that need
> converting.
>
> Just curious, since if Evince can obviously do it (manually) then the
> necessary library components (at least) must be installed (FC6).
>

KWord from KOffice can open and edit .pdf files (quite well), so you should
be able to save them as plain text. I guess that with DCOP you could write a
script to do this for multiple files.
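
For a couple of thousand files, a route that avoids the GUI may be simpler.
Below is a minimal sketch, assuming the pdftotext command-line tool (part of
poppler-utils or xpdf) is installed and on the PATH. The function names and
the input/output directory layout are made up for illustration, and I have
not tried this on your documents:

```python
import subprocess
import sys
from pathlib import Path


def output_path(pdf_path, out_dir):
    """Map some/dir/input.pdf to <out_dir>/input.txt."""
    return Path(out_dir) / (Path(pdf_path).stem + ".txt")


def build_command(pdf_path, out_dir):
    """Build the pdftotext command line for one file.

    Assumes the usual 'pdftotext PDF-file text-file' calling convention.
    """
    return ["pdftotext", str(pdf_path), str(output_path(pdf_path, out_dir))]


def convert_all(src_dir, out_dir, runner=subprocess.run):
    """Convert every *.pdf directly inside src_dir.

    Returns the list of input files for which the command reported failure,
    so they can be inspected by hand afterwards.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        result = runner(build_command(pdf, out_dir))
        if result.returncode != 0:
            failed.append(pdf)
    return failed


# Hypothetical usage: python pdf2txt.py /path/to/pdfs /path/to/txt
if __name__ == "__main__" and len(sys.argv) == 3:
    for pdf in convert_all(sys.argv[1], sys.argv[2]):
        print("failed:", pdf)
```

It only looks at files directly inside the source directory and carries on
past failures, which seemed the safer choice for a large batch.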

-- 
Regards,
  Doncho

