Good free OCR with a GUI for correcting errors? (for Windows)

9

I have used SimpleOCR, which has a nice GUI for correcting errors. Unfortunately, it makes a lot of errors! (And it suffers from other bugs and limitations.)

Tesseract, on the other hand, is more accurate but has no GUI at all.

My question is: is there a free OCR program for Windows that has both a good GUI and a low error rate? I want it to highlight suspect words (based on OCR uncertainty, not just spell checking) and show the original word (bitmap) while I am editing the OCR'd word, similar to what SimpleOCR does.

Open source would be best, followed by freeware, with trial / demo / crippleware a distant last.

    
by Hugh Allen 16.05.2010 / 01:34

3 answers

2

Have you tried gImageReader, a front end for Tesseract?
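
If no GUI turns out to fit, the "highlight suspect words by OCR uncertainty" part of the question can be approximated from Tesseract's own output. The following is only a minimal sketch, assuming a Tesseract build recent enough to ship the `tsv` output config (something like `tesseract page.png out tsv`, where `page.png` and `out` are placeholder names). The function name `suspect_words` and the threshold of 60 are illustrative choices, not anything Tesseract prescribes.

```python
import csv
import io


def suspect_words(tsv_text, threshold=60.0):
    """Return (text, conf, (left, top, width, height)) for low-confidence words.

    tsv_text is the content of a Tesseract TSV file; the threshold is an
    arbitrary cut-off chosen for illustration.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    flagged = []
    for row in reader:
        # Level 5 rows describe single words; other levels are page/block/line.
        if row["level"] != "5":
            continue
        text = (row["text"] or "").strip()
        conf = float(row["conf"])
        # Skip empty entries and rows without a usable confidence value.
        if not text or conf < 0:
            continue
        if conf < threshold:
            box = tuple(int(row[k]) for k in ("left", "top", "width", "height"))
            flagged.append((text, conf, box))
    return flagged
```

The bounding box returned for each flagged word could then be used to crop the matching region of the scanned image and show it next to the recognized text while editing.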

    
answered 16.08.2010 / 09:11
2

OCRopus:

The software is partly based on Tesseract, the best open source OCR engine available for now. While the project is expected to be released at the end of next year and will be used for Google's book scanning project, the team has some interesting applications in mind:

  • a web service interface
  • PDF, camera, and screen OCR
  • integration with desktop search tools: Beagle, Spotlight, Google Desktop

OCRopus(tm) is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-lingual capabilities.

The OCRopus engine is based on two research projects: a high-performance handwriting recognizer developed in the mid-90's and deployed by the US Census bureau, and novel high-performance layout analysis methods.

OCRopus development is sponsored by Google and is initially intended for high-throughput, high-volume document conversion efforts. We expect that it will also be an excellent OCR system for many other applications.

Links:

  • OCRopus
  • Compiling on Windows

GOCR

GOCR is an OCR (Optical Character Recognition) program, developed under the GNU Public License. It converts scanned images of text back to text files. Joerg Schulenburg started the program, and now leads a team of developers. GOCR can be used with different front-ends, which makes it very easy to port to different OSes and architectures. It can open many different image formats, and its quality have been improving in a daily basis.


answered 27.09.2010 / 11:29
0

There is also TopOCR (a.k.a. SnapReader), which includes a post-processing spell checker for 11 languages:

SnapReader can be used to make your own searchable notes from almost any document image. Or you can use it as an authoring tool and create your own editable content using your scanner or camera and save the results as HTML or PDF. SnapReader can also transform text into very high quality audio using Audrey. So not only can you use your scanner or camera to capture documents, you can now also use your portable music player or smartphone to "read" them.

    
answered 13.10.2010 / 13:47
