gImageReader Manual
gImageReader is a lightweight front-end to tesseract-ocr written in Python using the GTK+ bindings.
Main features:
- Allows the user to select the part of the image they want to be recognized or directly recognize the entire image.
- Supports PDF documents.
- Allows the user to acquire images from scanning devices.
- Recognized text is displayed directly next to the image.
- Basic editing of output text, including search/replace and removing line breaks on selected text.
- Spellcheck is enabled for the selected language in the output text field if the corresponding dictionary is installed.
- User is prompted to install missing spellcheck languages (requires PackageKit or apt-file).
- Easily switch between multiple open files.
- Attempts to automatically detect all necessary programs; if that fails, it shows a configuration prompt to the user (see Configuration for details).
- Dependencies: gImageReader depends on the following third-party programs:
  - tesseract-ocr: this is the OCR engine which gImageReader uses for performing text recognition.
- Installation:
  - Linux: If you installed the program from a deb or rpm package, the dependencies were installed automatically. If you installed the program from source, you can install them through the distribution's package manager.
  - Windows: You can download the programs from the following links: tesseract.
- Path configurations: gImageReader attempts to autodetect the necessary paths. If it does not succeed, you must enter them manually. For more information, hover the mouse over the help icons next to the input fields.
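The path autodetection can be pictured roughly as follows. This is only a sketch of one way to look up a program via the PATH environment variable plus a few extra directories. The function name find_program and the directory list COMMON_DIRS are hypothetical and are not taken from gImageReader's source.

```python
import os
import shutil

# Hypothetical fallback directories, chosen only for illustration.
if os.name == "nt":
    COMMON_DIRS = [r"C:\Program Files\Tesseract-OCR"]
else:
    COMMON_DIRS = ["/usr/bin", "/usr/local/bin"]


def find_program(name, extra_dirs=COMMON_DIRS):
    """Return the full path to the program `name`, or None if it is not found."""
    # First look the program up via the PATH environment variable.
    found = shutil.which(name)
    if found:
        return found
    # Then try each fallback directory in turn.
    for directory in extra_dirs:
        found = shutil.which(name, path=directory)
        if found:
            return found
    return None


if __name__ == "__main__":
    print(find_program("tesseract"))
```

If the function returns None, the path would have to be entered manually in the configuration dialog.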
Main points:
- Supported file formats are currently JPEG, GIF, PNG, TIFF and PDF. For best results, the image resolution should usually be between 200 and 300 dpi for normal, 10-12 pt text.
- Images can be acquired from scanners by choosing the Acquire Image button. For this functionality to be available, python-imaging-sane (Linux) or the Python TWAIN bindings (Windows) must be installed.
- The program attempts to automatically find all required third-party programs by searching common paths as well as the PATH environment variable. These paths, as well as other options, can be customized via File→Configure.
- The language drop-down menu in the main toolbar sets the language for subsequent recognitions as well as the spellcheck language for the output text area.
- To help with reformatting the text, the program offers find/replace as well as a "strip line breaks" function (found in the toolbar above the output text area) which automatically removes line breaks according to some criteria. By default, all line breaks are removed except those preceded by a dot. The criteria can be customized using the menu next to the button.
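The default "strip line breaks" rule (remove every line break except those preceded by a dot) can be illustrated with a short regular expression. This is a minimal sketch, not the program's actual implementation: the function name strip_line_breaks is hypothetical, and replacing each removed line break with a single space is an assumption.

```python
import re


def strip_line_breaks(text):
    """Join wrapped lines, keeping line breaks that directly follow a dot."""
    # (?<!\.) is a negative lookbehind: the newline only matches when the
    # character before it is not a dot. Each match becomes a space (assumption).
    return re.sub(r"(?<!\.)\n", " ", text)


if __name__ == "__main__":
    sample = "This is a\nwrapped line.\nNew sentence\nhere."
    print(strip_line_breaks(sample))
```

With this rule, line breaks inside a sentence are joined, while a break after a sentence-ending dot is kept as a paragraph boundary.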
- Some tesseract languages are not detected
Languages are searched based on a list of known ones (unfortunately the tesseract language data files do not provide enough information for automating the procedure). If you wish to use a language that is not included in the default list, you must add a corresponding entry under File→Configure→Languages.
- Where is the program configuration stored?
The configuration is stored under $HOME/.config/gimagereader on UNIX type platforms and under %APPDATA%\gimagereader on Windows.
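For scripts that need to locate this directory (for example, to back it up), the two locations above can be computed as in the sketch below. The function name config_dir and its parameters are hypothetical. Only the two paths themselves come from this manual.

```python
import ntpath
import os
import posixpath


def config_dir(environ=None, platform=None):
    """Return the gImageReader configuration directory for the given platform."""
    environ = os.environ if environ is None else environ
    platform = os.name if platform is None else platform
    if platform == "nt":
        # Windows: %APPDATA%\gimagereader
        return ntpath.join(environ["APPDATA"], "gimagereader")
    # UNIX type platforms: $HOME/.config/gimagereader
    return posixpath.join(environ["HOME"], ".config", "gimagereader")


if __name__ == "__main__":
    print(config_dir())
```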
- Spellcheck does not work!
If you are using Windows, GTKSpell is now bundled with the program. If you are using your own set of GTK libraries, you must compile it yourself. On UNIX type platforms, GTKSpell should be easy to install through your distribution's package management system, but the necessary dictionaries might not be installed along with it.
- The program fails to recognize my image!
Tesseract sometimes fails quite ungracefully when attempting to perform OCR on an image. There are usually two types of failure: a bad file format, or a crash in the recognition process itself. For the first type, gImageReader tries its best to pass the image to tesseract in the exact format it accepts (i.e. uncompressed TIFF); if you encounter such an error, please contact me with the misbehaving image. For the second type, there is nothing gImageReader can do; you can retry after varying the image in some way.
- Linux distributions using PackageKit: the program should automatically offer to install missing spelling dictionaries.
- Debian-based Linux distributions not using PackageKit: if you have apt-file installed, the program should automatically offer to install missing spelling dictionaries. Otherwise, use synaptic to install myspell / hunspell dictionaries.
- Other Linux distributions: use the local package manager to install myspell / hunspell spelling dictionaries.
- Windows: download the desired spelling dictionary from http://wiki.services.openoffice.org/wiki/Dictionaries, and extract the *.dic and *.aff files to Start→All Programs→gImageReader→Spelling dictionaries.
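On Windows, the manual extraction step can also be scripted. The sketch below assumes that the downloaded dictionary is a zip-compatible archive, which should be checked for the file you actually downloaded. The function name extract_dictionary and the target_dir parameter are hypothetical. target_dir stands for the folder that the "Spelling dictionaries" start menu entry opens, whose actual path is not given in this manual.

```python
import os
import zipfile


def extract_dictionary(archive_path, target_dir):
    """Copy the *.dic and *.aff files of a zip archive directly into target_dir."""
    extracted = []
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        for member in archive.namelist():
            if member.lower().endswith((".dic", ".aff")):
                # Drop any sub-folder so the file lands directly in target_dir.
                filename = os.path.basename(member)
                destination = os.path.join(target_dir, filename)
                with archive.open(member) as src, open(destination, "wb") as dst:
                    dst.write(src.read())
                extracted.append(filename)
    return sorted(extracted)
```

The function returns the names of the files it copied, so an empty list indicates that the archive contained no dictionary files.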
Some troubleshooting notes when running the program on Windows:
- Nothing happens: Have a look at C:\Program Files\gimagereader\gimagereader.exe.log (or similar if you installed somewhere else); it might contain some valuable information. The typical problem is that GTK is not installed on the system and the bundled GTK was not selected for installation along with gImageReader.
- Some icons are missing: if you used your own GTK installation, check that you have the gnome icon theme installed and configured in the etc/gtk-2.0/gtkrc file.
- Scanning fails: the Python TWAIN module (or maybe TWAIN itself?) sometimes behaves quite oddly, up to the point where it manages to crash the Python interpreter itself. Usually the most robust way to acquire images is to use the WIA drivers (as opposed to the TWAIN drivers); the devices are usually labeled accordingly in the device list. While I did my best to implement TWAIN support according to the provided documentation, I am still looking forward to improving its robustness in the future. Any hints are welcome!
For contributions of any kind, bug reports etc. please contact me at manisandro@gmail.com. I'd especially appreciate translations - here are the main steps for creating a translation:
- Create a new translation: edit localize.sh and append the new language code to the LANGS variable
- Update the translations: run "./localize.sh update" (without quotes)
- Edit the po files in po/
- Test the translation: run "./localize.sh compile" (without quotes) and run bin/gimagereader
- Submit the translation: please send the po file to manisandro@gmail.com, thanks!
Copyright ©2009-2011 Sandro Mani, revision: Fri, 31 Dec 2010 22:37:04 +0100