Re: png2txt -

Craig White wrote:
On Mon, 2008-06-30 at 16:26 +0100, Paul Smith wrote:
On Sat, Jun 28, 2008 at 5:32 PM, Bob Goodwin USA
<bobgoodwin@xxxxxxxxxxxx> wrote:
fred smith wrote:
Is there an F8 application that will convert a .png copy of a text list
to a text file?
----
png is a picture file and there is no text.

If you want OCR (optical character recognition - software that scans a
picture for recognizable text and saves the recognized text to a file),
I would suggest tesseract.
Thanks, I will look at that.

I believe that Tesseract only understands TIFF files, so you will need
to convert the PNG before you can OCR it.


Yes, I discovered that requirement but now I am stumped by -

  The command line is:
  tesseract <image.tif> <output> [-l langid]

I thought "-l enUS" might work but no go there.

There's no man page, only a README and that doesn't tell me about the langid
other than it wants it.  Without it I get very strange looking text.
Unfortunately, the OCR programs working in Linux are not very good
yet. In case you have access to Acrobat Professional, use it instead;
the results are usually excellent.
----
I've never used Acrobat Professional for OCR but I have gotten excellent
results from tesseract on Linux.

OP should check out...

http://www.groklaw.net/article.php?story=20061210115516438&query=tesseract

http://www.linuxjournal.com/article/9676

I do something similar (not OCR, but working with scanned text) using the netpbm package. First I convert the original format to a greyscale image (aka pgm), then convert that to a bilevel image (aka black and white) with "pgmtopbm -thr", setting the transition value as needed with the -val option. Those images are then easily converted to tif or whatever you need; in my case, jbig images for best compression.
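
For anyone wondering what the thresholding step amounts to, here is a rough Python sketch of the idea, not netpbm's actual code. The function names, the sample pixel values and the ">=" comparison at the cutoff are all assumptions made for illustration; pgmtopbm may treat the boundary differently. The output is a plain (ASCII "P1") PBM, where 1 means black and 0 means white.

```python
def threshold_to_bilevel(rows, maxval=255, value=0.5):
    """Map greyscale samples to PBM bits: 1 = black, 0 = white.

    `value` is a fraction of `maxval`; samples at or above the cutoff
    are treated as white (assumed boundary rule, for illustration only).
    """
    cutoff = value * maxval
    return [[0 if sample >= cutoff else 1 for sample in row] for row in rows]


def to_plain_pbm(bits):
    """Serialize rows of bits as a plain (ASCII, "P1") PBM image."""
    height = len(bits)
    width = len(bits[0]) if bits else 0
    lines = ["P1", f"{width} {height}"]
    lines += [" ".join(str(b) for b in row) for row in bits]
    return "\n".join(lines) + "\n"


# Hypothetical 3x2 greyscale scan: light paper with a few dark ink pixels.
grey = [
    [250, 240, 30],
    [245, 20, 25],
]
bits = threshold_to_bilevel(grey, maxval=255, value=0.5)
print(to_plain_pbm(bits), end="")
```

Raising or lowering "value" moves the cutoff, which is the knob you end up adjusting per scan when the paper is grey or the ink is faint.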

--
Bill Davidsen <davidsen@xxxxxxx>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

--
fedora-list mailing list
fedora-list@xxxxxxxxxx
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
