International Journal of Engineering Research in Computer Science and Engineering
(IJERCSE)
Volume 4, Issue 4, April 2017
OCR mostly relies on neural networks. Dileep Kumar
Patel, Tanmoy Som, Sushil Kumar Yadav and Manoj
Kumar Singh (2012) [4] give a solution to the
problem of handwritten character recognition. They
tackle it with a multiresolution technique using the
discrete wavelet transform (DWT) and the Euclidean
distance metric (EDM). The technique was tested
and found to be more accurate and faster. Characters are
classified into 26 pattern classes based on appropriate
properties. Chi et al. (2012) [5] propose an
effective algorithm to deal with bleed-through effects
in images of financial documents. Double-sided
images scanned simultaneously are used as
inputs, and the bleed-through effect is detected and then
removed after registration of the two side images.
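As an illustration of the DWT and Euclidean distance idea reported in [4], the following is a minimal sketch, not the cited authors' implementation. The text above does not state which wavelet, decomposition depth or feature set was used, so the Haar wavelet, a single decomposition level, and the toy 4x4 glyphs (`TEMPLATES`, `NOISY_L`) are all assumptions made for illustration. Each glyph is reduced to its low-resolution approximation band, and an unknown glyph is assigned to the class whose template is nearest in Euclidean distance.

```python
import math


def haar_approx_1d(signal):
    """One level of the 1-D Haar transform; returns the approximation band.

    Assumes len(signal) is even.
    """
    return [(signal[i] + signal[i + 1]) / math.sqrt(2)
            for i in range(0, len(signal), 2)]


def haar_approx_2d(image):
    """One level of the 2-D Haar DWT; keeps only the approximation band."""
    rows = [haar_approx_1d(row) for row in image]             # filter rows
    cols = [haar_approx_1d(list(col)) for col in zip(*rows)]  # filter columns
    return [list(r) for r in zip(*cols)]                      # transpose back


def features(image, levels=1):
    """Flatten the approximation band after `levels` decompositions."""
    for _ in range(levels):
        image = haar_approx_2d(image)
    return [v for row in image for v in row]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(image, templates, levels=1):
    """Return the label of the template nearest to `image` in feature space."""
    f = features(image, levels)
    return min(templates,
               key=lambda label: euclidean(f, features(templates[label], levels)))


# Hypothetical 4x4 binary glyphs (1 = ink); a real system would hold one or
# more templates for each of the 26 character classes.
TEMPLATES = {
    "I": [[0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0]],
    "L": [[1, 0, 0, 0],
          [1, 0, 0, 0],
          [1, 0, 0, 0],
          [1, 1, 1, 1]],
}

# An "L" with one missing pixel in the bottom-right corner.
NOISY_L = [[1, 0, 0, 0],
           [1, 0, 0, 0],
           [1, 0, 0, 0],
           [1, 1, 1, 0]]

print(classify(NOISY_L, TEMPLATES))  # prints "L"
```

Working on the approximation band reduces the number of values compared per glyph (here from 16 to 4), which is one plausible reason a multiresolution approach can be faster than comparing raw pixels.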
Satyajitsaha, Dnyaneshwar, Hagawane, Pravin
C. Kulkarni and Swapni R. Dhamane (2013) [6] aim
to recognize and extract text from
images captured by a camera-based mobile device;
once the text is recognized, information about the
text can be obtained via a dictionary or the web. Majida
Ali Abed et al. (2013) [7] present a new approach to
simplifying handwritten character recognition based on
simulating the behavior of schools of fish and flocks
of birds, called the Particle Swarm Optimization
Approach (PSOA). PSOA is convergent and more
accurate in solutions that minimize the
recognition error rate. Vijay Laxmi Sahu et al. (2013) [8]
explain the characteristics of the classification
methods that have been successfully applied to
character recognition and the remaining problems that can
potentially be solved by learning methods. Argha Roy,
Diptam Dutta and Kaustav Choudhury (2013) [9] explain
IRIS plant classification using a neural network.
The adaptation of network weights using
Particle Swarm Optimization (PSO) is proposed as a
mechanism to improve the performance of an Artificial
Neural Network (ANN) in classifying the IRIS
dataset. Classification is a machine learning
technique used to predict group membership for data
instances. Amir Bahador Bayat (2013) [10] proposes an
efficient system that includes two main modules, the
feature extraction module and the classifier module. In
the first module, seven sets of discriminative features
are extracted and used in the recognition system. In the
second module, the adaptive neuro-fuzzy inference
system is investigated. N. K. Gundu, S. M. Jadhav,
T. S. Kulkarni and A. S. Kumbhar (2014) [11] explain the
best ideas for text extraction with the help of
character description and stroke configuration, and for web
context search and web mining with the help of the
semantic web and synaptic web at low entropy. Faisal
Mohammad, Jyoti Anarase, Milan Shingote and Pratik
Ghanwat (2014) [12] present an algorithm for
implementing Optical Character Recognition
(OCR) to translate images of typewritten or
handwritten characters into an electronically editable
format while preserving font properties. OCR can
do this by applying a pattern matching algorithm. The
recognized text characters are stored in an editable
format. Shalin A. Chopra, Amit A. Ghadge, Onkar A.
Padwal, Karan S. Punjabi and Prof. Gandhali S.
Gurjar (2014) [13] present a simple, efficient and
minimum-cost approach to constructing an OCR system for reading
any document that has a fixed font size and style or
a handwritten style. The system is able to
yield excellent results. It is mostly used with existing
OCR methods, especially for English text. Sravan,
Shivanku Mahna and Nirbhay Kashyap (2015) [14]
explain the problems faced by developers
in using OCR as a technology on a large scale and give
a solution to those problems. Their system provides
many features, including no need for typing, editing of raw data,
quick translation, and memory utilization. Surabhi
Dusane, Monica Ahuja, Rucha Ghodke and Prathamesh
Kothawade (2016) [15] aim to
develop a user-friendly system that extracts text
from images, converts the extracted text into a
user-friendly language, and then converts it into audio
that describes the text more efficiently.
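Particle Swarm Optimization, used in [7] and [9] above, can be sketched in a few lines. The code below is a generic textbook-style PSO under stated assumptions, not the algorithm of either cited paper: the function name `pso_minimize`, the parameter values (`w`, `c1`, `c2`, swarm size, iteration count) and the toy objective `squared_error` with its `TARGET` vector are all hypothetical. In [9] the quantity being minimized would be the classification error of a neural network as a function of its weights; here a simple squared distance to a known weight vector stands in for it.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box using a basic global-best PSO.

    w is the inertia weight; c1 and c2 weight the pull toward each
    particle's own best position and the swarm's best position.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # best position per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position of the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and keep it inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Hypothetical stand-in for a network's training error: squared distance
# between a candidate weight vector and a known target vector.
TARGET = [0.5, -1.0, 2.0]


def squared_error(weights):
    return sum((wt - t) ** 2 for wt, t in zip(weights, TARGET))


best_weights, best_error = pso_minimize(squared_error, dim=3, bounds=(-5.0, 5.0))
```

Because PSO only needs objective values and no gradients, the same loop could in principle be pointed at any error measure, which is what makes it usable for tuning network weights or recognition parameters.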
III. PROPOSED SYSTEM
Optical Character Recognition (OCR) is a
technology that enables you to convert different types
of documents, such as scanned paper documents, PDF
files or images captured by a digital camera, into
editable and searchable data. It is the mechanical or
electronic conversion of images of typewritten or
printed text into machine-encoded text. Images
captured by a digital camera differ from scanned
documents or image-only PDFs. They often have
defects such as distortion at the edges and dim
lighting, making it difficult for most OCR applications to
correctly recognize the text. The latest version of
ABBYY FineReader supports adaptive recognition
technology specifically designed for processing camera
images. It offers a range of features to improve the
quality of such images, providing you with the ability
to fully use the capabilities of your digital devices.
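To make the dim-lighting problem concrete, the sketch below shows one common kind of preprocessing, contrast stretching followed by binarization, on a tiny grayscale image. It is only an illustration under stated assumptions: it is not the adaptive recognition technology of ABBYY FineReader, whose internals are not described here, and the function names `stretch_contrast` and `binarize`, the threshold of 128, and the sample pixel values in `DIM_IMAGE` are hypothetical.

```python
def stretch_contrast(gray):
    """Linearly rescale pixel intensities to the full 0-255 range."""
    flat = [p for row in gray for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # uniform image: nothing to stretch
        return [row[:] for row in gray]
    scale = 255.0 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in gray]


def binarize(gray, threshold=128):
    """Map each pixel to 1 (dark, treated as ink) or 0 (light background)."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]


# Hypothetical dimly lit capture: paper is about 90, ink about 40, so every
# pixel lies below a mid-range threshold of 128.
DIM_IMAGE = [[90, 40, 90],
             [88, 42, 90]]

without_stretch = binarize(DIM_IMAGE)                # all pixels read as ink
with_stretch = binarize(stretch_contrast(DIM_IMAGE)) # stroke is separated
```

On this sample, thresholding the raw image marks every pixel as ink, while thresholding after stretching recovers the single dark stroke in the middle column. Real camera images usually need more than this, for example correction of the edge distortion mentioned above.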
A common problem faced by travelers is that of
understanding an unfamiliar language. Failing to
understand unknown languages when travelling can
lead to minor problems. Systems that address this are usually
composed of two subsystems that perform text
extraction and text translation respectively. The
extraction and translation parts are relatively well
developed, and there exists a large variety of software
packages or web services that perform these tasks. The
challenge is with extracting the exact text from the