Fedora Users — Re: OCR in Fedora?

Gustav Degreef wrote:

On Mon, Jul 21, 2008 at 3:54 PM, Valent Turkovic
<valent.turkovic@xxxxxxxxx> wrote:

On Mon, Jul 21, 2008 at 12:13 PM, Paul Smith <phhs80@xxxxxxxxx> wrote:

2008/7/21 joachim.backes@xxxxxxxxxxxxxx <joachim.backes@xxxxxxxxxxxxxx>:

Does anybody do OCR using software available in Fedora? Which ones do
you use? How do you use them?
I saw an article about OCRopus [1] and what a great app it is, but there
is no ocropus in Fedora currently.

[1]
http://arstechnica.com/news.ars/post/20071024-hands-on-with-googles-ocropus-open-source-scanning-software.html

I use gocr-0.45-2.fc9.i386

I think it comes from the fedora repo.

Tesseract is better:

yum install tesseract

Paul

--
fedora-list mailing list
fedora-list@xxxxxxxxxx
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list

Hi Joachim and Paul,
do gocr and tesseract have GUIs? How are you using them? Do you get
formatted text or just a plain text file? Do gocr and tesseract recognise
columns? Is it possible to get a formatted OpenOffice Writer document that
matches the original scanned page?

I read the article about OCRopus that I posted the link to, and it seems
that it uses tesseract but is somehow improved.

Cheers,
Valent.


I've used both gocr and tesseract on the same text.  gocr has a gui,


Sorry, but gocr has no GUI. I think gocr-gui does:

http://www.openbsd.org/4.2_packages/i386/gocr-gui-0.44.tgz-long.html

tesseract is command line only.  I've used both tools on various TIFF
files.  There is a good writeup on the net (I forget where) on using
tesseract on Ubuntu.  I got much better text recognition with
tesseract than with gocr from the same original scanned text.  Never
tried ocropus.
gustav
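
Since both tools are driven from the command line, here is a minimal
sketch of how they might be scripted from Python. The helper names
(tesseract_command, gocr_command, run_tesseract) are made up for
illustration, and the file names are placeholders. It assumes the usual
invocations "tesseract <image> <outputbase> -l <lang>" and
"gocr -i <image> -o <outfile>"; check the man pages of the versions you
have installed, since options can differ between releases.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, output_base, lang="eng"):
    """Build the argument list: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", str(image), str(output_base), "-l", lang]


def gocr_command(image, output_file):
    """Build the argument list: gocr -i <image> -o <output_file>."""
    return ["gocr", "-i", str(image), "-o", str(output_file)]


def run_tesseract(image, output_base="page", lang="eng"):
    """Run tesseract on one image and return the recognized plain text.

    Assumption: tesseract writes its result to <output_base>.txt.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found; try: yum install tesseract")
    subprocess.run(tesseract_command(image, output_base, lang), check=True)
    # tesseract appends the .txt suffix to the output base name itself
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

With a scanned page saved as scan.tif (a placeholder name), calling
run_tesseract("scan.tif") would return the recognized text as a plain
string. Neither command produces a formatted Writer document; both write
plain text in this form.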



--
Joachim Backes <joachim.backes@xxxxxxxxxxxxxx>

