ISSN (Print) : 2320 – 3765
ISSN (Online): 2278 – 8875
International Journal of Advanced Research in Electrical,
Electronics and Instrumentation Engineering
(An ISO 3297: 2007 Certified Organization)
Website: www.ijareeie.com
Vol. 6, Issue 4, April 2017
Copyright to IJAREEIE DOI:10.15662/IJAREEIE.2017.0604015 2291
Implementation of Image to Text Conversion
using Android App
Ishita Pal¹, Mohammadraza Rajani², Anusha Poojary³, Priyanka Prasad⁴
B. E. Student, Dept. of ECE, SIES Graduate School of Technology, Nerul, Navi Mumbai, Maharashtra, India¹
B. E. Student, Dept. of ECE, SIES Graduate School of Technology, Nerul, Navi Mumbai, Maharashtra, India²
B. E. Student, Dept. of ECE, SIES Graduate School of Technology, Nerul, Navi Mumbai, Maharashtra, India³
B. E. Student, Dept. of ECE, SIES Graduate School of Technology, Nerul, Navi Mumbai, Maharashtra, India⁴
ABSTRACT: This paper aims to recognize the text in an image and convert it into editable text using the Optical
Character Recognition (OCR) method through an Android app. It presents an efficient use of the Android platform to
extract text from an already existing image as well as from a newly captured image, and gives the user a repeated
cropping option for expeditious recognition of text. The app requires no remote computing, because Tesseract, the OCR
engine on which it is built, is installed in the app along with the complete image processing suite.
KEYWORDS: OCR, Android, Tesseract, cropping.
I. INTRODUCTION
As we read, our eyes and brain continuously carry out optical character recognition without our being aware of it.
Our eyes recognize the luminous pattern of the printed characters and our brain uses this pattern to work out what
the text says. Apart from humans, computers are now also capable of performing this task using a technique called
OCR. OCR converts text available in analog format into digital form.[2] Many organizations now depend on OCR systems
to eliminate human intervention for better performance and efficiency.[1]
The objective of this paper is to make this capability of the computer available through an Android app. The visual
capability is provided by an Android mobile phone running the Tesseract OCR engine. The app lets the user recognize
text from an image stored in the gallery, an image taken with the camera, or a document stored on the phone, and also
allows the names of locations to be stored from the map application available on the phone. The app can be used for
automatic number plate recognition, for extracting business card information into the contact list, and for automatic
extraction of key information from insurance documents. The converted text can also be fed to a text-to-speech
application and used as an assistive technology for visually impaired users.
II. LITERATURE REVIEW
Shalin A. Chopra [1]: This paper describes an OCR system for offline handwritten character recognition. It presents
the preprocessing techniques applied to document images as an initial step in character recognition systems, and
identifies feature extraction as the most important step of optical character recognition. The approach can be used
with existing OCR methods, especially for English text.
Dishank Rajesh Palan [2]: This paper presents an Android application for accurate recognition and translation of text
in varying environmental conditions, using an Android mobile phone with a camera.
Jin Jin [3]: In this paper, OCR technology is applied to build a flash card Android application for memorizing new
words. The Tesseract OCR engine is integrated into the application because it is open source and free to use, released
under the Apache License.
Line Eikvil [4]: This paper presents a review of OCR techniques. It also describes the OCR process, which converts text
present in a digital image into editable text, and how characters are recognized through optical mechanisms.
N. Venkata Rao [5]: This paper presents a large number of optical character recognition methods. It analyses the
advantages and drawbacks of various OCR methods and proposes a modified back propagation method. The proposed method
computes the error rate efficiently, which results in increased accuracy.
Chirag Patel [6]: This paper provides information about the Optical Character Recognition (OCR) method, the history
and architecture of the open source OCR tool Tesseract, and experimental results of OCR performed by Tesseract on
different kinds of images.
Sonia Bhaskar [7]: This report presents an algorithm for accurate recognition of text on a business card, given an
Android mobile phone camera image of the card in varying environmental conditions such as variable lighting,
reflection, rotation, and scaling, among others.
José C. Principe [8]: This book unifies the concepts of neural networks and adaptive filters into a common framework.
It begins by explaining the fundamentals of adaptive linear regression and builds on these concepts to explore pattern
classification, function approximation, feature extraction, and time-series modeling and prediction.
Rohit Verma [9]: This paper presents an overview of feature extraction methods for recognition of segmented
(isolated) characters. Selection of the feature extraction method is probably the single most important factor in
achieving high recognition performance in character recognition systems. Different feature extraction methods are
designed for different representations of the characters.
Ms. M. Shalini [10]: This paper presents a brief survey of earlier research work related to all Indian languages. It
also discusses a brief history of OCR and various approaches to character recognition, along with some applications of
character recognition.
Richa Goswami [11]: This paper presents a detailed review of the field of Optical Character Recognition. It examines
the various techniques that have been proposed to realize the core of character recognition in an optical character
recognition system.
Pranob K Charles [12]: This paper discusses various approaches used for the design of OCR systems. It presents slower
techniques that provide better results as well as faster techniques that provide less accurate results. It finds that
OCR techniques based on neural networks provide more accurate results than other techniques.
III. STRUCTURAL MODEL
Fig 1.- Block diagram
Fig 1 shows the basic steps involved in recognising the text from an image using an Android app. Image to text
recognition begins with an Android app, developed in Android Studio, for loading and cropping an image. The GUI
provides the user with two options, loading an image and cropping an image, as shown in Fig 2.
Fig 2.- Home screen of App
The user uses the Android GUI to capture or load the image from which text is to be extracted. The image can be loaded
through a gallery intent or a camera intent: on clicking the load image option, the user is offered all the apps
installed on the phone that hold stored images, along with the camera. The app gives the user the freedom to choose the
portion of the image to be converted by changing the cropping area, which is done by dragging the edges of the cropping
box as shown in Fig 4. Once the crop button is pressed, the image is sent to the Tesseract OCR engine module.
Fig 4.- Selecting cropping area
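The cropping step can be sketched as follows. This is a minimal illustration in plain Python, not the app's own code: the image is assumed to be a list of pixel rows, and the names `crop_region`, `left`, `top`, `right` and `bottom` are hypothetical. The box is clamped to the image bounds, so a cropping box dragged past an edge still yields a valid region to pass to the OCR engine.

```python
def crop_region(image, left, top, right, bottom):
    """Return the sub-image inside the crop box (right and bottom are exclusive).

    `image` is assumed to be a non-empty list of equal-length pixel rows.
    """
    height, width = len(image), len(image[0])
    # Clamp the box to the image so an over-dragged edge cannot go out of bounds.
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    if left >= right or top >= bottom:
        raise ValueError("crop box does not overlap the image")
    return [row[left:right] for row in image[top:bottom]]


# A 4 x 5 grayscale image whose pixel values encode their position.
image = [[10 * r + c for c in range(5)] for r in range(4)]
first_crop = crop_region(image, 1, 1, 4, 3)        # a paragraph-sized region
second_crop = crop_region(first_crop, 0, 0, 3, 1)  # narrowed to a single line
```

Because the function returns a new image of the same kind, it can be applied again to its own output, which mirrors the repeated cropping the app offers.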
Tesseract is an offline, open source OCR engine. It treats the input image as a binary image made up of polygonal
regions. The image is converted to a bitmap by locally adaptive Otsu threshold binarization, in which a single locally
adapted threshold is defined for each tile. This can produce a noisy binarization, so it is followed by smoothing to
reduce the noise. The words are segmented into characters and Sobel edge detection is performed to extract the edges.
The extracted data is matched against the templates stored in the system and the matched characters are displayed on
the Android GUI screen. Fig 5 shows the text extracted from the selected region.
Fig 5.- Displaying the converted text
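The edge extraction step mentioned above can be illustrated with a short sketch. The code applies the standard 3 x 3 Sobel kernels to a small grayscale image stored as a list of rows. It is plain Python written for illustration only; the function name `sobel_magnitude` is hypothetical, and the code is not taken from Tesseract or from the app.

```python
import math

# Standard 3 x 3 Sobel kernels for the horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of each interior pixel (border pixels stay 0)."""
    height, width = len(image), len(image[0])
    edges = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    pixel = image[y + dy][x + dx]
                    gx += SOBEL_X[dy + 1][dx + 1] * pixel
                    gy += SOBEL_Y[dy + 1][dx + 1] * pixel
            edges[y][x] = math.hypot(gx, gy)
    return edges


# Dark left half, bright right half: a single vertical edge.
image = [[0, 0, 255, 255] for _ in range(4)]
edges = sobel_magnitude(image)
```

Interior pixels next to the dark-to-bright transition receive a large magnitude, while a uniform image produces zeros everywhere, which is what makes the operator useful for outlining character strokes.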
Fig 6 shows that the user can narrow the extraction further by shrinking the cropping window, for example to recognise
a single line rather than a whole paragraph.
Fig 6.- Multiple cropping Option
Fig 7 shows the result of recognising the text from the image after the user has changed the cropping window size. The
displayed text can be copied and edited as the user requires.
Fig 7.- Result of multiple cropping
IV. CONCEPTUAL MODEL
A typical OCR system consists of several components. A common setup is illustrated in Fig 8.
Fig 8. - OCR phases [4]
The first step in the process is to digitize the analog document using an optical scanner. Once the regions containing
text have been located, each symbol is extracted through a segmentation process. The extracted symbols may then be
preprocessed to eliminate noise, which facilitates the extraction of text features in the next step. The identity of
each symbol is found by comparing the extracted features with descriptions of the symbol classes obtained through a
previous learning phase. Finally, contextual information is used to reconstruct the words and numbers of the original
text. These steps and some of the methods involved are described in more detail below.
Optical scan: In this phase the digital form of the document is created. The scanning unit consists of a transport
mechanism and a sensing unit that converts the received light intensity into gray levels. The captured image is first
converted to binary format by thresholding.[6]
Location and Segmentation: This phase identifies the constituents of the image. Segmentation distinguishes the text in
the image from graphics and other non-text parts.[6]
Preprocessing: The image produced by scanning may contain noise introduced by the scanner or by the thresholding
technique. This noise may cause broken letters, which reduce text recognition accuracy. This phase therefore removes
the noise, an operation also known as smoothing of the digital image.[6]
Feature Extraction: The image is then matched against the templates preloaded in the system, and the template with the
highest correlation is selected and declared as the character.[1]
Post Processing: If any word or letter remains unrecognized after the extraction stage, it is assigned a meaning in
this stage. This can be done by importing extra templates into the system.[2]
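The preprocessing and template matching phases above can be sketched together. The following is a minimal sketch under stated assumptions, not the system's actual implementation: glyphs are 5 x 5 binary grids, the three templates are invented for illustration, smoothing is reduced to a simple rule that removes isolated foreground pixels, and the names `remove_isolated_pixels`, `correlation`, `classify` and `TEMPLATES` are all hypothetical.

```python
import math

# Hypothetical 5 x 5 binary templates (1 = ink) for three characters.
TEMPLATES = {
    "I": [[0, 0, 1, 0, 0] for _ in range(5)],
    "L": [[1, 0, 0, 0, 0] for _ in range(4)] + [[1, 1, 1, 1, 1]],
    "T": [[1, 1, 1, 1, 1]] + [[0, 0, 1, 0, 0] for _ in range(4)],
}


def remove_isolated_pixels(image):
    """Clear every ink pixel that has no ink among its eight neighbours."""
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue
            neighbours = sum(
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ) - image[y][x]
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned


def correlation(a, b):
    """Normalized correlation between two equally sized images."""
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    return num / den if den else 0.0


def classify(glyph, templates):
    """Return the label of the template with the highest correlation to the glyph."""
    return max(templates, key=lambda label: correlation(glyph, templates[label]))


# An "L" with one speck of noise at row 1, column 3.
noisy = [row[:] for row in TEMPLATES["L"]]
noisy[1][3] = 1
cleaned = remove_isolated_pixels(noisy)
label = classify(cleaned, TEMPLATES)
```

A median filter is a more common smoothing choice, but on strokes one pixel wide it would erase the character itself, so the gentler isolated-pixel rule is used here.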
2. Tesseract
Fig 9.- Tesseract phases [6]
The Tesseract phases shown in Fig 9 are:
Input image: The input is a grayscale or RGB image. It must be a flat image, that is, one captured with the camera
parallel to the text, because Tesseract has no capability to rectify errors caused by perspective distortion.[7]
Adaptive thresholding: It converts the grayscale image to a binary image by calculating the optimal threshold, the
one that minimizes the variance within the background and foreground pixel classes.[7]
Connected component analysis: It searches the image for foreground regions and treats each of them as a blob. A blob
is a region of the digital image that differs from its surroundings in colour or brightness.[3]
Line finding algorithm: Lines are found by analysing the image space adjacent to the potential characters. If the pixel
count is below a specified threshold level, it is detected as a line.[7]
Character recognition: It finds the baseline of the text to approximate the height of the characters, and then
approximates the character width. Characters that do not share the same width are processed in an alternate manner.
Word recognition: After all characters have been extracted, it recognizes words line by line and then passes them
through a contextual and syntactical analyser for proper recognition.[7]
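The thresholding and connected component phases listed above can be sketched in a few lines. This is an illustrative sketch only and does not reproduce Tesseract's internals: it computes one global Otsu threshold rather than a threshold per tile, it assumes dark text on a light background, and the names `otsu_threshold` and `find_blobs` are hypothetical.

```python
from collections import deque


def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes the between-class variance.

    Maximizing between-class variance is equivalent to minimizing the
    variance within the two classes.
    """
    histogram = [0] * 256
    for p in pixels:
        histogram[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    best_level, best_variance = 0, -1.0
    weight_dark = sum_dark = 0
    for level in range(256):
        weight_dark += histogram[level]  # pixels at or below this level
        if weight_dark == 0:
            continue
        weight_light = total - weight_dark
        if weight_light == 0:
            break
        sum_dark += level * histogram[level]
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


def find_blobs(binary):
    """Return a bounding box (left, top, right, bottom) for each 8-connected blob."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            left = right = x
            top = bottom = y
            while queue:  # breadth-first flood fill over the blob
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for ny in range(max(0, cy - 1), min(height, cy + 2)):
                    for nx in range(max(0, cx - 1), min(width, cx + 2)):
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


# Two dark marks on a light background.
gray = [
    [200, 200, 200, 200, 200, 200],
    [200, 20, 20, 200, 30, 200],
    [200, 20, 20, 200, 30, 200],
    [200, 200, 200, 200, 200, 200],
]
threshold = otsu_threshold([p for row in gray for p in row])
binary = [[1 if p <= threshold else 0 for p in row] for row in gray]
blobs = find_blobs(binary)
```

Each bounding box marks one candidate character region; the later line finding and recognition phases would operate on these regions.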
V. RESULTS
We have successfully extracted the text from an image by cropping the image, as shown in Fig 5. We have also
implemented multiple cropping (refer Fig 6) and extracted text from the result (refer Fig 7).
VI. CONCLUSION
This paper provides a detailed discussion of offline image to text recognition through an Android app. The image is
loaded into the Android app and the user is given the choice of selecting the part of the image to be converted. The
image is then processed by the OCR technique to produce the converted text on screen. The concepts involved can further
be used to advance future technologies such as handwriting recognition, recognition of many more languages, and even
translation.
REFERENCES
1. Shalin A. Chopra, Amit A. Ghadge, Onkar A. Padwal, "Optical Character Recognition", International Journal of Advanced Research in
Computer and Communication Engineering, Vol. 3, Issue 1, January 2014.
2. Dishank Rajesh Palan, Ghoshil Bharat Bhatt, Kinjal Jayesh Mehta, Kunal Jayesh Shavdia, Mansi Kambli, "OCR on Android-
Travelmate", International Journal of Advanced Research in Computer and Communication Engineering, Vol. 3, Issue 3, March 2014.
3. Jin Jin, "A Flash Card Android Application Development Applied with OCR technology", Helsinki Metropolia University of Applied
Sciences, Thesis, 26 May 2014.
4. Line Eikvil, "OCR - Optical Character Recognition", December 1993.
5. N. Venkata Rao, Dr. A.S.C.S.Sastry, A.S.N.Chakravarthy, Kalyan Chakravarthi, "optical character recognition technique algorithms", Journal
of Theoretical and Applied Information Technology, Vol. 83, No. 2, 20th January 2016.
6. Chirag Patel, Atul Patel, Dhamendra Patel, "Optical Character Recognition by Open Source OCR Tool Tesseract A Case Study", International
Journal of Computer Applications, Volume 55, No. 10, October 2012.
7. Sonia Bhaskar, Nicholas Lavassar, Scott Green, "Implementing Optical Character Recognition on the Android Operating Systems for Business
Cards", EE 368 Digital Image Processing.
8. José C. Principe, Neil R. Euliano, Curt W. Lefebvre, "Neural and Adaptive Systems: Fundamentals Through Simulations", ISBN 0-471-35167-9.
9. Rohit Verma and Dr. Jahid Ali, "A-Survey of Feature Extraction and Classification Techniques in OCR Systems", International Journal of
chapter2
10. Ms. M. Shalini, Dr. B. Indira, "Automatic Character Recognition of Indian Languages – A brief Survey", IJISET, Vol. 1, Issue 2, April 2014.
11. Richa Goswami and O.P. Sharma, "A Review on Character Recognition Techniques", IJCA, Vol. 83, No. 7, December 2013.
12. Pranob K Charles, V.Harish, M.Swathi, CH. Deepthi, "A Review on the Various Techniques used for Optical Character Recognition",
International Journal Engineering Research and Applications, Vol. 2, Jan-Feb 2012.