YAGF – Front End of Tesseract OCR in Linux Open Source. YAGF is a graphical interface for the cuneiform and tesseract text recognition tools on Linux. This guide shows how to install YAGF on Linux and briefly reviews it.
With YAGF you can scan images through XSane, import pages from PDF documents, preprocess images, and recognize text with cuneiform or tesseract, all from a single command centre. YAGF also makes it easy to scan and recognize several images in sequence.
YAGF – Front End of Tesseract OCR in Linux Open Source
OCR stands for Optical Character Recognition, and YAGF most likely stands for Yet Another Graphical Front-end.
If you have been running Tesseract OCR from the command line in a terminal, you can now use it through YAGF's GUI window instead. Most people probably just want a simple utility that can scan documents and extract their text, and Tesseract OCR with YAGF does exactly that.
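For comparison, this is roughly what the command-line route looks like. It is only a minimal sketch: the helper names output_base and ocr_image and the file name scan.png are made up for illustration, and it assumes the tesseract binary and the English language data are already installed.

```shell
#!/bin/sh
# Strip the final extension so scan.png gives the output base "scan".
output_base() {
    printf '%s\n' "${1%.*}"
}

# Recognize one image with Tesseract; the text lands in <base>.txt.
# The second argument is the language code and defaults to English.
ocr_image() {
    image=$1
    lang=${2:-eng}
    tesseract "$image" "$(output_base "$image")" -l "$lang"
}

# Example call (scan.png is a hypothetical file name):
#   ocr_image scan.png eng && cat scan.txt
```

Tesseract takes the image, an output base name and an optional language, then writes the recognized text to a .txt file next to it. YAGF wraps this kind of call behind buttons and a language selector.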
YAGF is a graphical front-end for cuneiform and tesseract OCR tools. With YAGF you can open already scanned image files or obtain new images via XSane (scanning results are automatically passed to YAGF).
Once you have a scanned image you can prepare it for recognition, select particular image areas for recognition, set the recognition language and so on. Recognized text is displayed in an editor window where it can be corrected, saved to disk or copied to the clipboard.
How to Install YAGF in Linux Mint and Ubuntu
YAGF is available in the Linux Mint and Ubuntu repositories, so you can install it from a terminal by typing:
sudo apt-get install yagf
Wait until the installation has completed. You can then open YAGF by clicking Start/Menu >> Office >> YAGF
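If YAGF starts but cannot scan or recognize anything, one of the tools it drives may be missing. The following is a small sketch for checking that. The helper name check_tool is made up, and it assumes the binaries carry the same names as the programs mentioned in this article (yagf, tesseract, cuneiform, xsane).

```shell
#!/bin/sh
# Report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# YAGF itself plus the tools it relies on for scanning and recognition.
for tool in yagf tesseract cuneiform xsane; do
    check_tool "$tool"
done
```

Anything reported as missing can usually be installed with apt-get in the same way as YAGF itself. Running tesseract --list-langs afterwards shows which recognition languages are installed.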
YAGF Review – GUI Version of OCR Software in Linux Using Tesseract
Newer versions of YAGF can work with PDF files. In some cases the files might be protected so that you cannot copy text, or useful information might be embedded inside images included in the PDF documents.
You can try online conversion tools, but perhaps YAGF can offer similar, if not better results. As a test file, I grabbed my own Linux kernel crash book, which comes with an interesting assortment of formatted text, plain-text paragraphs, as well as screenshots.
YAGF handled the 182-page document well, so this is an encouraging sign, because it means it can probably work with large data sets.
The output is, well, not as good as one might hope for. Plain text, which can just be copied and pasted, is fine. But YAGF did not handle code/command blocks and images that well. I can understand that images might pose some problem, but text boxes really shouldn’t be a challenge.
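To see how much of a result comes from the PDF's own text layer and how much from OCR, the two cases can be tried separately on the command line. This is a hedged sketch, not a description of what YAGF does internally: it assumes pdftotext and pdftoppm from poppler-utils are installed alongside tesseract, and the function names and the 300 dpi setting are my own choices.

```shell
#!/bin/sh
# Case 1: the PDF has a text layer, so no OCR is needed.
pdf_text_layer() {
    pdftotext -layout "$1" "${1%.pdf}.txt"
}

# Case 2: the text lives in images. Render each page to PNG at 300 dpi,
# then run Tesseract on every page image and join the results.
pdf_ocr_pages() {
    pdf=$1
    lang=${2:-eng}
    workdir=$(mktemp -d) || return 1
    pdftoppm -r 300 -png "$pdf" "$workdir/page" || return 1
    for img in "$workdir"/page-*.png; do
        [ -e "$img" ] || continue
        tesseract "$img" "${img%.png}" -l "$lang"
    done
    cat "$workdir"/page-*.txt > "${pdf%.pdf}.ocr.txt"
}

# Example calls (book.pdf is a hypothetical file name):
#   pdf_text_layer book.pdf
#   pdf_ocr_pages book.pdf eng
```

Comparing the two output files for the same document gives a rough idea of where the recognition step loses code blocks or screenshot text.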