Dissertation (Master's), Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Elétrica, 2008.; This work proposes an architecture for a system that processes and automatically recognizes the text of paleographic documents, using an OCR (Optical Character Recognition) engine based on artificial neural networks. The proposed system is intended to operate in the context of transcribing the text of documents in paleographic scripts from the 16th to the 19th century. These documents, from colonial Brazil, were digitized from the printed originals held in the Arquivo Ultramarino de Lisboa, one of the achievements of the Projeto Resgate of the Brazilian Ministry of Culture. The architecture of the proposed system includes modules for segmenting the digitized document images, for analyzing the segments with OCR in an attempt to recognize the text, for training the OCR while building a dictionary of recognized words, and for storing the text transcribed from the document images.
To evaluate this architecture, a software prototype was developed that allows the user to manually segment a document image, train a simple OCR, and use that OCR to extract some text information from the digitized paleographic document. It is concluded that the proposed architecture is functional...
Feature extraction is one of the fundamental problems of character
recognition. The performance of a character recognition system depends on
proper feature extraction and correct classifier selection. In this article, a
rapid feature extraction method named Celled Projection (CP) is proposed; it
computes the projection of each section formed by partitioning an
image. The recognition performance of the proposed method is compared with
other widely used feature extraction methods that are intensively studied for
many different scripts in the literature. The experiments have been conducted
using Bangla handwritten numerals along with three different well-known
classifiers, which demonstrate comparable results, including 94.12% recognition
accuracy using celled projection.; Comment: 5 pages, 1 figure
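The abstract above describes Celled Projection only as computing the projection of each section of a partitioned image. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the image is split into horizontal strips and that each strip is projected onto the horizontal axis by summing its columns; the function name `celled_projection` and the `cells` parameter are hypothetical.

```python
import numpy as np

def celled_projection(image, cells=4):
    """Concatenate the column-wise projection of each horizontal strip.

    Assumption: the partition is into `cells` horizontal strips; the
    abstract does not specify the actual partitioning scheme.
    """
    image = np.asarray(image, dtype=float)
    # Split the rows into `cells` roughly equal strips.
    strips = np.array_split(image, cells, axis=0)
    # Project each strip by summing foreground pixels down every column.
    return np.concatenate([strip.sum(axis=0) for strip in strips])

# A 4x4 diagonal stroke split into two strips of two rows each.
features = celled_projection(np.eye(4), cells=2)
print(features)
```

The resulting vector has `cells * width` entries and could be passed to any standard classifier.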
Text image super-resolution is a challenging yet open research problem in the
computer vision community. In particular, low-resolution images hamper the
performance of typical optical character recognition (OCR) systems. In this
article, we summarize our entry to the ICDAR2015 Competition on Text Image
Super-Resolution. Experiments are based on the provided ICDAR2015 TextSR
dataset and the released Tesseract-OCR 3.02 system. We report that our winning
entry, a text image super-resolution framework, largely improved the OCR
performance with low-resolution images used as input, reaching an OCR accuracy
score of 77.19%, which is comparable with that obtained using the original
high-resolution images (78.80%).; Comment: 5 pages, 8 figures
Optical Character Recognition deals with the recognition and classification of
characters from an image. For the recognition to be accurate, certain
topological and geometrical properties are calculated, based on which a
character is classified and recognized. Humans likewise perceive
characters by their overall shape and by features such as strokes, curves,
protrusions, and enclosures. These properties, called features, are
extracted from the image by means of spatial pixel-based calculation. A
collection of such features, called a feature vector, helps define a character
uniquely by means of an Artificial Neural Network that uses these feature
vectors.; Comment: Signal & Image Processing : An International Journal (SIPIJ) Vol.3,
No.5, October 2012
A novel approach for recognition of handwritten compound Bangla characters,
along with the basic characters of the Bangla alphabet, is presented here.
Compared to Roman scripts such as English, one of the major stumbling blocks in
Optical Character Recognition (OCR) of handwritten Bangla script is the large
number of complex-shaped character classes in the Bangla alphabet. In addition
to 50 basic character classes, there are nearly 160 complex-shaped compound
character classes in the Bangla alphabet. Dealing with such a large variety of
handwritten characters with a suitably designed feature set is a challenging
problem. Uncertainty and imprecision are inherent in handwritten script.
Moreover, such a large variety of complex-shaped characters, some of which
closely resemble one another, makes the problem of OCR of handwritten Bangla
characters more difficult. Considering the complexity of the problem, the
present approach attempts to identify compound character classes from the most
frequently to the less frequently occurring ones, i.e., in order of importance.
The aim is to develop a framework for incrementally increasing the number of
learned classes of compound characters, from more frequently occurring ones to
less frequently occurring ones, along with the basic characters. On experimentation...
We propose an end-to-end recurrent encoder-decoder based sequence learning
approach for printed text Optical Character Recognition (OCR). In contrast to
existing state-of-the-art OCR solutions, which use a connectionist
temporal classification (CTC) output layer, our approach makes minimal
assumptions about the structure and length of the sequence. We use a two-step
encoder-decoder approach -- (a) A recurrent encoder reads a variable-length
printed text word image and encodes it to a fixed-dimensional embedding. (b)
This fixed-dimensional embedding is subsequently read by a decoder
structure, which converts it into a variable-length text output. Our
architecture gives competitive performance relative to a connectionist temporal
classification (CTC) output layer while being executed in more natural
settings. The learnt deep word image embedding from the encoder can be used for
printed-text-based retrieval systems. The expressive fixed-dimensional
embedding for any variable-length input expedites the task of retrieval and
makes it more efficient, which is not possible with other recurrent neural
network architectures. We empirically investigate the expressiveness and the
learnability of long short-term memory (LSTM) networks in the sequence-to-sequence
learning regime by training our network for prediction tasks in
segmentation-free printed text OCR. The utility of the proposed architecture for printed
text is demonstrated by quantitative and qualitative evaluation of two tasks --
word prediction and retrieval.; Comment: 9 pages (including reference)...
Intensive research has been done on optical character recognition (OCR), and a
large number of articles have been published on this topic during the last few
decades. Many commercial OCR systems are now available in the market, but most
of these systems work for Roman, Chinese, Japanese and Arabic characters. There
has not been a sufficient number of works on Indian language character
recognition, especially for Kannada script, one of the 12 major scripts in
India. This paper presents a review of existing work on printed Kannada script
and its results. The characteristics of Kannada script and of the Kannada
Character Recognition (KCR) system are discussed in detail. Finally, fusion at
the classifier level is proposed to
increase the recognition accuracy.; Comment: 12 pages, 8 figures
HOCR stands for Handwritten Optical Character Recognition, the process of
recognizing handwritten characters from a digital image of a document.
Automatic handwritten character recognition has attracted many researchers all
over the world. Shape identification and feature extraction are a very
important part of any character recognition system, and the success of a method
is highly dependent on the selection of features. Feature extraction is the
most important step in defining the shape of the character as precisely and as
uniquely as possible. It is also a complex task, and success has been achieved
by using invariance properties, irrespective of position and orientation.
Zernike moments describe shape and provide rotation invariance, which the paper
attributes to their orthogonality property. MODI is an ancient script of India
with a cursive and complex representation of characters. The work described in
this paper presents the efficiency of Zernike moments over Hu's seven moments
with zoning for automatic recognition of handwritten MODI characters. An
offline approach is used in this paper because MODI script was very popular and
widely used for writing until the 19th century, before Devanagari was
officially adopted.; Comment: This paper has been withdrawn by the author because it was
rejected by the journal with the reason "paper was not suitable for the journal"
This report explores the latest advances in the field of digital document
recognition. With the focus on printed document imagery, we discuss the major
developments in optical character recognition (OCR) and document image
enhancement/restoration in application to Latin and non-Latin scripts. In
addition, we review and discuss the available technologies for hand-written
document recognition. In this report, we also provide some company-accumulated
benchmark results on available OCR engines.; Comment: Technical report surveying OCR/ICR and document understanding methods
as of 2004. It contains 38 pages, numerous figures, 93 references, and
provides a table of contents
This paper presents a complete Optical Character Recognition (OCR) system for
camera captured image/graphics embedded textual documents for handheld devices.
At first, text regions are extracted and skew corrected. Then, these regions
are binarized and segmented into lines and characters. Characters are passed
into the recognition module. Experimenting with a set of 100 business card
images, captured by cell phone camera, we have achieved a maximum recognition
accuracy of 92.74%. Compared to Tesseract, a powerful open-source desktop-based
OCR engine, the present recognition accuracy is a worthwhile contribution.
Moreover, the developed technique is computationally efficient and consumes low
memory so as to be applicable on handheld devices.
The problem of optical character recognition (OCR) has been widely discussed
in the literature. Given a handwritten text, the program aims to recognize
it. Even though there are several approaches to this issue, it is still
an open problem. In this paper we propose an approach that uses
the k-nearest neighbors algorithm and achieves an accuracy of more than 90%.
Training and run times are also very short.
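The abstract above names the k-nearest neighbors algorithm without giving details. As an illustration only, here is a minimal sketch of k-NN classification over feature vectors. It assumes Euclidean distance and majority voting, and the function name `knn_predict` and the toy data are hypothetical rather than taken from the paper.

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    train_x = np.asarray(train_x, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every training vector (assumed metric).
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D feature vectors standing in for character features.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(train_x, train_y, [0.2, 0.1]))
```

There is no training step beyond storing the samples, which is consistent with the short training time the abstract reports.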
This paper assumes the hypothesis that human learning is perception based,
and consequently, the learning process and perceptions should not be
represented and investigated independently or modeled in different simulation
spaces. In order to keep the analogy between the artificial and human learning,
the former is assumed here as being based on the artificial perception. Hence,
instead of choosing to apply or develop a Computational Theory of (human)
Perceptions, we choose to mirror the human perceptions in a numeric
(computational) space as artificial perceptions and to analyze the
interdependence between artificial learning and artificial perception in the
same numeric space, using one of the simplest tools of Artificial Intelligence
and Soft Computing, namely the perceptrons. As practical applications, we
choose to work around two examples: Optical Character Recognition and Iris
Recognition. In both cases a simple Turing test shows that artificial
perceptions of the difference between two characters and between two irides are
fuzzy, whereas the corresponding human perceptions are, in fact, crisp.; Comment: 5th Int. Conf. on Soft Computing and Applications (Szeged, HU), 22-24
This thesis investigates a method for using contextual information in
text recognition. This is based on the premise that, while reading, humans
recognize words with missing or garbled characters by examining the
surrounding characters and then selecting the appropriate character. The
correct character is chosen based on an inherent knowledge of the language
and spelling techniques. We can then model this statistically.
The approach taken by this thesis is to combine feature extraction
techniques, Neural Networks and Hidden Markov Modeling. This method of
character recognition involves a three step process: pixel image
preprocessing, neural network classification and context interpretation.
Pixel image preprocessing applies a feature extraction algorithm to the
original bit-mapped images, producing a feature vector for each image
that is input to a neural network.
The neural network performs the initial classification of the characters
by producing ten weights, one for each character. The magnitude of the
weight is translated into the confidence the network has in each of the
choices. The greater the magnitude and separation, the more confident the
neural network is of a given choice.
The output of the neural network is the input for a context interpreter.
The context interpreter uses Hidden Markov Modeling (HMM) techniques to
determine the most probable classification for all characters based on the
characters that precede that character and character pair statistics. The
HMMs are built using an a priori knowledge of the language: a statistical
description of the probabilities of digrams.
Experimentation and verification of this method combines the
development and use of a preprocessor program...
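The context interpretation step described above can be illustrated with Viterbi decoding over classifier confidences and digram probabilities. This is a sketch under stated assumptions, not the thesis's implementation: it treats the network's per-character confidences as emission scores, assumes all probabilities are strictly positive, and uses the hypothetical names `viterbi_decode`, `confidences`, `digram` and `prior`.

```python
import numpy as np

def viterbi_decode(confidences, digram, prior):
    """Return the most probable class sequence.

    confidences[t, j]: classifier confidence for class j at position t.
    digram[i, j]: probability that class j follows class i.
    prior[j]: probability that a sequence starts with class j.
    All values are assumed strictly positive so the logs stay finite.
    """
    conf = np.log(np.asarray(confidences, dtype=float))
    trans = np.log(np.asarray(digram, dtype=float))
    score = np.log(np.asarray(prior, dtype=float)) + conf[0]
    back = []
    for t in range(1, len(conf)):
        # cand[i, j]: best score of a path ending in i, followed by i -> j.
        cand = score[:, None] + trans
        back.append(cand.argmax(axis=0))
        score = cand.max(axis=0) + conf[t]
    # Trace the best path backwards from the best final class.
    path = [int(score.argmax())]
    for ptr in reversed(back):
        path.append(int(ptr[path[-1]]))
    return path[::-1]

# The classifier alone prefers class 0 twice; the digram statistics
# make "0 followed by 1" the more probable reading.
conf = [[0.9, 0.1], [0.6, 0.4]]
digram = [[0.1, 0.9], [0.5, 0.5]]
print(viterbi_decode(conf, digram, prior=[0.5, 0.5]))
```

With a uniform digram matrix the decoder reduces to taking the classifier's top choice at each position, which makes the contribution of the context model easy to isolate.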
I applied clustering analysis to the problem of creating tagged training data for optical
character recognition (OCR). The creation of labeled character data by hand is a slow and
cumbersome process. My belief is that clustering methods can be applied to character data before
tagging it, allowing the operator to label entire groups of characters at once and greatly speeding
the time in which tagged character data can be generated. This thesis will provide proof of
concept as a basis for more in-depth research and eventually the creation of a sophisticated
application utilizing these techniques for the generation of labeled training data for OCR.
The Archimedes Palimpsest is a manuscript containing the partial text of seven treatises
by Archimedes that were copied onto parchment and bound in the tenth century AD.
This work is aimed at providing tools that allow scholars of ancient Greek mathematics
to retrieve as much information as possible from images of the remaining degraded
text. A correlation pattern recognition (CPR) system has been developed to recognize distorted
versions of Greek characters in problematic regions of the palimpsest imagery,
which have been obscured by damage from mold and fire, overtext, and natural aging.
Feature vectors for each class of characters are constructed using a series of spatial
correlation algorithms and corresponding performance metrics. Principal components
analysis (PCA) is employed prior to classification to remove features corresponding to
filtering schemes that performed poorly for the spatial characteristics of the selected
region-of-interest. A probability is then assigned to each class, forming a character
probability distribution based on relative distances from the class feature vectors to
the ROI feature vector in principal component (PC) space. However, the current CPR
system does not produce a single classification decision...
This thesis investigates a character recognition method inspired
by the premise that humans recognize shapes using their ability to
assimilate a set of primitive features. These features collectively
create a higher level shape of a certain category. The primitive
features employed in our method include horizontal, vertical, diagonal
lines, and corners of various orientations positioned at various places
within a character. Combinations of these features form categories of
characters to be recognized. The basic approach consists of
preprocessing a character bitmap, extracting primitive features to form
a feature vector. The feature vector is then input to a classification
neural net. Based on weights derived during training, the system selects
the character most closely identified by the feature vector. The
advantages of this approach are the speed of training and recognition
(as opposed to methods which continually iterate to the final solution),
and robustness of the "blurring" effect realized by transforming a
character bitmap to an array of features, rather than attempting
template matching at the bitmap or pixel level.
To support this study, a graphics workstation based environment
has been developed, equipped with 3000 16x16 pixel characters...
Pattern recognition/classification is increasingly drawing the attention of scientific research
because of its important role in automation and human-machine communication. Even
though many models have been introduced to deal with classification, because of the
inherent imprecision and ambiguity, these models did not tackle the problem in an
efficient way. Traditional models deal only with statistical uncertainty (randomness) but
not with the non-statistical uncertainty (vagueness). Fuzzy set theory allows us to better
understand imprecision in both of its categories: vagueness and randomness. The
incorporation of fuzzy set theory in existing algorithms helped in many cases to improve
the performance and increase the efficiency of those algorithms.
This thesis will explore fuzzy logic as it pertains to pattern recognition. In order to
demonstrate fuzzy logic, the problem of recognizing the Arabic alphabet is discussed. In
this problem moments and central moments were used as discriminating features.
A fuzzy classifier was designed in a way that incorporated some statistical knowledge of
the problem at hand. Performance of this classifier was compared to a Bayesian classifier
and a neural network classifier. Performance, evaluation...
Handley, John; Namboodiri, Anoop; Zanibbi, Richard
Source: IEEE Computer Society : Eighth International Conference on Document Analysis and Recognition; Publisher: IEEE Computer Society : Eighth International Conference on Document Analysis and Recognition
Statistical moment invariants were used to generate a feature space for classifying images of text characters. The feature vector of a given letter is invariant to changes in scale, position, rotation, and contrast in the image. Test character images were generated by simulated optical blurring. Images were classified by calculating the distance between the feature vector of a given test character and that of each reference character. The test character was identified as the reference character for which the distance between feature vectors is a minimum. Significantly blurred characters were classified correctly using this method.
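The minimum-distance classification described above can be sketched as follows. This is an illustration under assumptions, not the paper's method: it uses only the first two Hu moment invariants, which cover position, scale and rotation but not contrast, it omits the simulated blurring, and the names `hu_features` and `classify` are hypothetical.

```python
import numpy as np

def hu_features(image):
    """First two Hu invariants from normalized central moments."""
    img = np.asarray(image, dtype=float)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xc, yc = (xs * img).sum() / m00, (ys * img).sum() / m00

    def eta(p, q):
        # Central moment, normalized for scale invariance.
        mu = (((xs - xc) ** p) * ((ys - yc) ** q) * img).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return np.array([phi1, phi2])

def classify(test_image, references):
    """Return the reference label whose feature vector is nearest."""
    f = hu_features(test_image)
    return min(references,
               key=lambda label: np.linalg.norm(f - hu_features(references[label])))

# Reference shapes: a horizontal bar and a square block.
bar = np.zeros((9, 9)); bar[4, 1:8] = 1
block = np.zeros((9, 9)); block[3:6, 3:6] = 1
# Test shape: the bar rotated by 90 degrees and shifted.
vbar = np.zeros((9, 9)); vbar[1:8, 2] = 1
print(classify(vbar, {"bar": bar, "block": block}))
```

A fuller feature space would add the remaining Hu invariants and a contrast normalization; both are left out here to keep the sketch short.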
Knowledge extraction by just listening to sounds is a distinctive property. A speech signal is a more effective means of communication than text because blind and visually impaired persons can also respond to sounds. This paper aims to develop a cost-effective and user-friendly optical character recognition (OCR) based speech synthesis system. The OCR based speech synthesis system has been developed using Laboratory Virtual Instrument Engineering Workbench (LabVIEW) 7.1.