Free OCR Tool

November 14th, 2006 by wd5gnr

When HP exited the optical character recognition business in 1995, its Tesseract OCR engine was released to UNLV as open source. In January, developers (including some from Google) decided Tesseract was stable enough to “re-release” as an open source project.

Don’t expect too much. You’ll need to compile the source code yourself (if you use Cygwin, be prepared to copy /usr/include/limits.h to /usr/include/linux/limits.h or fix the source). What you get is a command-line tool that reads single-column TIFF files. But its accuracy is much better than that of most of the cheap OCR tools out there.
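If you would rather script the tool than type commands by hand, here is a minimal sketch of a Python wrapper. It assumes the compiled binary is on your PATH as tesseract and accepts the form tesseract <image> <output base>, writing the recognized text to <output base>.txt. The function names and the default output name are made up for illustration and are not part of Tesseract.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base):
    """Return the argument list for a basic Tesseract run.

    Assumption: the binary is invoked as `tesseract <image> <output base>`
    and writes its text to `<output base>.txt`.
    """
    return ["tesseract", str(image_path), str(output_base)]


def ocr_tiff(image_path, output_base="ocr_output"):
    """Run Tesseract on one single-column TIFF and return the text."""
    # Fail early with a clear message if the binary was never built/installed.
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract executable not found on PATH")

    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_tesseract_command(image_path, output_base), check=True)

    # Read back the text file the tool is assumed to have written.
    text_file = Path(f"{output_base}.txt")
    return text_file.read_text(encoding="utf-8", errors="replace")
```

Calling ocr_tiff("page.tif") with a scanned page of your own (page.tif is a placeholder name) should return the recognized text as a string. Multi-column pages would need to be split into single-column images first.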

Try it here.
