Scanning an image and converting it to text is relatively straightforward in Linux, provided you have the right software installed. I plumped for Tesseract, as it was reputedly the best command-line OCR program, but I also wanted a graphical user interface, so I used gImageReader as a front-end to Tesseract.
Here’s how to install both of them for Optical Character Recognition in Ubuntu.
Firstly, install tesseract (and the associated language files if needed):
sudo apt-get install tesseract-ocr
Install a language file (e.g. -eng, -deu, -fra, -ita, -nld, -por, -spa, …):
sudo apt-get install tesseract-ocr-eng
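If you need several languages, the package names all follow the same tesseract-ocr-<code> pattern shown above. Here is a minimal sketch that builds the package names from a list of three-letter codes and prints the install command for you to review before running it. The helper name lang_packages is my own invention for illustration and is not part of Tesseract or apt.

```shell
# lang_packages CODE...: print one Tesseract language package name per code.
# (Hypothetical helper; it only formats names, it installs nothing.)
lang_packages() {
    for code in "$@"; do
        echo "tesseract-ocr-$code"
    done
}

# Left unquoted on purpose so the names end up on one line, separated by spaces.
echo sudo apt-get install $(lang_packages eng deu fra)
```

Copy the printed line into your terminal once you are happy with the list of languages.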
Next, install gImageReader as a frontend to tesseract.
Add the application repository:
sudo add-apt-repository ppa:sandromani/gimagereader
Update the repository sources:
sudo apt-get update
Install the application:
sudo apt-get install gimagereader
Now you should be ready to go. gImageReader can be found in the Graphics section of your applications menu. Happy Character Recognising!
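If you would rather skip the GUI, Tesseract can also be run straight from a terminal: you give it an image, an output base name and a language with -l. The sketch below wraps that call in a small function with a couple of sanity checks. The function name ocr_to_text, the default language and the example file scan.png are assumptions for illustration only.

```shell
# ocr_to_text IMAGE [LANG]: run Tesseract on IMAGE, defaulting to English.
# (Hypothetical wrapper; only the tesseract call itself is the real tool.)
ocr_to_text() {
    image=$1
    lang=${2:-eng}

    # Bail out early if the image does not exist.
    if [ ! -f "$image" ]; then
        echo "no such image: $image" >&2
        return 1
    fi
    # Bail out if Tesseract has not been installed yet.
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract is not installed" >&2
        return 2
    fi

    # ${image%.*} strips the final extension; Tesseract adds .txt itself.
    tesseract "$image" "${image%.*}" -l "$lang"
}

# Example: ocr_to_text scan.png deu   (would write scan.txt next to scan.png)
```

The language you pass must match a language file you installed earlier, otherwise Tesseract will complain that it cannot load it.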
Published on Sat 23 April 2016 by Gary Hall in Linux with tag(s): ubuntu tesseract gimagreader ocr