Document Image Processing

J. Imaging 2018, 4, 6

We can see from the results displayed in Table 6 that the computation cost of our developed holistic system is very practical. With lexical reduction, we managed to reduce the run time by a factor of 1000, and one page with an average of 250 words can be processed in an average time of 1.2 s, compared to 1 s/page for the Sakhr system, 2.3 s/page for NovoDynamics and 3.5 s/page for the ABBYY system.

7. Conclusions and Future Work

Holistic approaches provide effective solutions for the challenges of cursive script recognition, such as Arabic OCR. The main drawback of such approaches is their complexity and heavy computation requirements, especially for large vocabulary tasks. In this paper, we introduced a holistic Arabic OCR approach that is computationally efficient. A lexicon reduction technique based on clustering similarly shaped words is utilized to reduce the word recognition time. The presented system makes use of a hybrid of several holistic features that combine global word-level DCT-based features and local block-based features. Using this type of features, the system achieved omni-font performance with size and font independence. Also, the suggested system has a flexible architecture to integrate language modelling constraints by using a second rescoring pass for the top n-best word hypotheses.

The proposed system has been tested using different sets of 1152 words with three different fonts and four font sizes and has achieved 99.3% WRR. It has also been tested using sets of 2730 words of recent computerized book text and has attained more than 84.8% WRR. Results of the proposed holistic system have been compared with known commercial Arabic OCR systems provided by the largest international and local companies, and the results were promising. In future work, we will investigate other holistic features such as the Wavelet Transform, Zernike Transform, Hough Transform and loci. We will also investigate other lexicon reduction techniques that benefit from linguistic information.
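The lexicon reduction idea summarized in the conclusions can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional feature vectors stand in for the DCT-based and block-based holistic features, the plain k-means routine stands in for whatever clustering the system uses, and all names (`LEXICON`, `build_index`, `recognize`, the `word_*` entries) are hypothetical. The point is only that a query word is compared against the members of its nearest cluster rather than against the whole lexicon, and that the result is an n-best list that a later rescoring pass could consume.

```python
import math

# Hypothetical toy lexicon: word label -> holistic feature vector.
# Real features would be DCT / block-based descriptors of the word image.
LEXICON = {
    "word_a": [0.0, 0.1],
    "word_d": [5.0, 5.1],
    "word_b": [0.2, 0.0],
    "word_c": [0.1, 0.3],
    "word_e": [5.2, 4.9],
    "word_f": [4.8, 5.0],
}

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(vectors, k, iters=10):
    """Plain k-means with deterministic init (first k vectors)."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            idx = min(range(k), key=lambda c: dist(v, centroids[c]))
            groups[idx].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids

def build_index(lexicon, k):
    """Cluster the lexicon once, offline; return centroids and word lists."""
    words = list(lexicon)
    centroids = kmeans([lexicon[w] for w in words], k)
    clusters = [[] for _ in range(k)]
    for w in words:
        idx = min(range(k), key=lambda c: dist(lexicon[w], centroids[c]))
        clusters[idx].append(w)
    return centroids, clusters

def recognize(query, lexicon, centroids, clusters, n_best=3):
    """Match the query only against words in its nearest cluster."""
    nearest = min(range(len(centroids)),
                  key=lambda c: dist(query, centroids[c]))
    shortlist = clusters[nearest]
    ranked = sorted(shortlist, key=lambda w: dist(query, lexicon[w]))
    return ranked[:n_best], len(shortlist)

centroids, clusters = build_index(LEXICON, k=2)
best, compared = recognize([5.15, 4.95], LEXICON, centroids, clusters, n_best=2)
print(best, compared)  # ['word_e', 'word_d'] 3
```

In this toy case the query is compared with three words instead of six. With a large vocabulary and many clusters the shortlist becomes a small fraction of the lexicon, which is where a run-time saving of the kind reported would come from; the actual factor depends on the cluster sizes and on how many neighbouring clusters are searched.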
Acknowledgments: The teamwork of the “Arabic Printed OCR System” project was funded and supported by the NSTIP strategic technologies program in the Kingdom of Saudi Arabia, project no. (11-INF-1997-03). In addition, the authors acknowledge with thanks the Science and Technology Unit, King Abdulaziz University, for technical support.

Author Contributions: Farhan M. A. Nashwan and Mohsen A. A. Rashwan conceived and designed the experiments; Farhan M. A. Nashwan performed the experiments; Sherif M. Abdou and Mohsen A. A. Rashwan analyzed the data; Hassanin M. Al-Barhamtoshy contributed materials and analysis tools; Farhan M. A. Nashwan and Sherif M. Abdou wrote the paper; Abdullah M. Moussa substantively revised the paper.

Conflicts of Interest: The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science