On Mon, Jul 21, 2008 at 3:54 PM, Valent Turkovic <valent.turkovic@xxxxxxxxx> wrote:
> On Mon, Jul 21, 2008 at 12:13 PM, Paul Smith <phhs80@xxxxxxxxx> wrote:
>> 2008/7/21 joachim.backes@xxxxxxxxxxxxxx <joachim.backes@xxxxxxxxxxxxxx>:
>>>> Does anybody do OCR using software available in Fedora? Which ones do
>>>> you use? How do you use them?
>>>> I saw an article about OCRopus [1] and how great an app it is, but there
>>>> is no ocropus in Fedora currently.
>>>>
>>>> [1]
>>>> http://arstechnica.com/news.ars/post/20071024-hands-on-with-googles-ocropus-open-source-scanning-software.html
>>>
>>> I use gocr-0.45-2.fc9.i386
>>>
>>> I think it comes from the Fedora repo.
>>
>> Tesseract is better:
>>
>> yum install tesseract
>>
>> Paul
>>
>> --
>> fedora-list mailing list
>> fedora-list@xxxxxxxxxx
>> To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
>>
>
> Hi Joachim and Paul,
> do gocr and tesseract have GUIs? How are you using them? Do you get
> formatted text or just a plain text file? Do gocr and tesseract recognise
> columns? Is it possible to get a formatted OpenOffice Writer document that
> matches the original scanned page?
>
> I read the article I posted the link to about OCRopus, and it seems that
> it uses tesseract but is somehow improved.
>
> Cheers,
> Valent.

I've used both gocr and tesseract on the same text. gocr has a GUI;
tesseract is command line only. I've used both tools on various TIFF
files. There is a good writeup on the net (I forget where) on using
tesseract on Ubuntu. On the same scanned original, tesseract gave me
much better text recognition than gocr. I have never tried ocropus.

gustav
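
For anyone wondering what "command line only" looks like in practice, here is a minimal sketch in Python that wraps the tesseract CLI. It assumes the usual invocation form "tesseract IMAGE OUTPUTBASE [-l LANG]", which writes the recognised text to OUTPUTBASE.txt. The function names and any file names are hypothetical and not from this thread, so treat it as an illustration rather than the way anyone here actually ran it.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, language=None):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE [-l LANG].

    Hypothetical helper: it only assembles the command and runs nothing.
    """
    command = ["tesseract", str(image_path), str(output_base)]
    if language:
        # -l selects the recognition language, e.g. "eng"
        command += ["-l", language]
    return command


def ocr_to_text(image_path, output_base, language=None):
    """Run tesseract on one image and return the recognised plain text.

    Assumes tesseract is installed and on PATH (yum install tesseract).
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH")
    subprocess.run(
        build_tesseract_command(image_path, output_base, language),
        check=True,  # raise CalledProcessError if tesseract exits non-zero
    )
    # tesseract appends ".txt" to the output base name itself
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

With a scanned page saved as, say, scan.tif (a made-up name), calling ocr_to_text("scan.tif", "scan") would leave the result in scan.txt and return its contents as a string. Used this way, the output is a plain text file.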