J. Imaging 2018, 4, 32
\[ \mathrm{WRR} = \frac{\#\text{words\_correctly\_recognized}}{\#\text{words}} \tag{5} \]
\[ \mathrm{LRR} = \frac{\#\text{lines\_correctly\_recognized}}{\#\text{lines}} \tag{6} \]
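As an illustration of Equations (5) and (6), the following minimal Python sketch computes WRR and LRR from paired ground-truth and recognized text lines. The function names, the whitespace tokenization, and the position-by-position word comparison are assumptions made for this example; the text does not specify how words are aligned.

```python
def line_recognition_rate(ref_lines, hyp_lines):
    """LRR: share of ground-truth lines that are reproduced exactly."""
    if not ref_lines or len(ref_lines) != len(hyp_lines):
        raise ValueError("need non-empty, equally long line lists")
    correct = sum(ref == hyp for ref, hyp in zip(ref_lines, hyp_lines))
    return correct / len(ref_lines)


def word_recognition_rate(ref_lines, hyp_lines):
    """WRR: share of ground-truth words that are reproduced exactly.

    Assumption: words are split on whitespace and compared position by
    position within each line.
    """
    if len(ref_lines) != len(hyp_lines):
        raise ValueError("line lists must have the same length")
    total = 0
    correct = 0
    for ref, hyp in zip(ref_lines, hyp_lines):
        ref_words = ref.split()
        hyp_words = hyp.split()
        total += len(ref_words)
        correct += sum(r == h for r, h in zip(ref_words, hyp_words))
    if total == 0:
        raise ValueError("ground truth contains no words")
    return correct / total


# Hypothetical toy data: one wrong word in the first line.
refs = ["the quick fox", "hello world"]
hyps = ["the quick fix", "hello world"]
print(word_recognition_rate(refs, hyps))  # 4 of 5 words correct
print(line_recognition_rate(refs, hyps))  # 1 of 2 lines correct
```

A single wrong word leaves WRR at 0.8 but already halves LRR in this toy case, which is why the line-level metric is the strictest of the three.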
Figure 11 shows an example of how substitution and deletion errors affect the CRR and WRR
metrics.
Figure 11. Example of CRR and WRR computation based on output errors.
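The effect described for Figure 11 can be reproduced with a small Python sketch. It assumes that CRR is computed from the Levenshtein edit distance as (#characters − edit distance) / #characters, that spaces count as characters, and that words are compared position by position; these choices, the function names, and the sample strings are assumptions for illustration and are not taken from the figure.

```python
def edit_distance(a, b):
    """Levenshtein distance (substitutions, deletions, insertions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_recognition_rate(ref, hyp):
    """Assumed CRR definition: (#chars - edit distance) / #chars."""
    if not ref:
        raise ValueError("ground truth must not be empty")
    return (len(ref) - edit_distance(ref, hyp)) / len(ref)


def word_accuracy(ref, hyp):
    """Share of ground-truth words matched exactly, position by position."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("ground truth contains no words")
    hits = sum(r == h for r, h in zip(ref_words, hyp.split()))
    return hits / len(ref_words)


# Hypothetical example: one substitution (e -> i) and one deletion (final t).
ref = "video text"
hyp = "vidio tex"
print(edit_distance(ref, hyp))               # 2 character errors
print(character_recognition_rate(ref, hyp))  # 8 of 10 characters kept
print(word_accuracy(ref, hyp))               # no word survives intact
```

Two character errors cost only 20% of CRR here, yet both words are counted as wrong, so the word-level score drops to zero.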
It is worth noting that the proposed protocols help us understand how generic a system is,
i.e., whether it performs well for Protocols 7 and 9 (independently of the TV channel). For instance, in
the AcTiVComp contest, we observed that some participating systems perform well in HD resolution
only, while others are quite generic (i.e., good in both SD and HD resolutions). Other systems are
incompatible with a specific resolution. Various examples of using these evaluation protocols will be
presented in the next section.
5. Application of AcTiV Datasets
The proposed datasets have been used to build and evaluate two systems for Arabic video text
detection and recognition. The text detector is based on a hybrid approach composed of a CC-based
heuristic phase and a machine learning verification procedure. The recognizer system consists of
Multi-Dimensional RNNs (MDRNNs) [49] coupled with a Connectionist Temporal Classification (CTC)
layer [50].
5.1. LADI Detector
The LADI text detection system is based on our previous work [14,46], with new
enhancements that consider the color consistency of nearby text regions. Our text detector is a
hybrid approach consisting of two stages: a CC-based heuristic algorithm and a machine learning
classification. The main idea of this system is to combine two techniques: an adapted version of the
SWT algorithm and a convolutional auto-encoder (CAE). As shown in Figure 12, the first stage starts
with a preprocessing step to decrease noise and fine detail. It then computes the edge map and the X and Y
gradients from the processed frame using the Canny and Sobel operators, respectively. After that, the SWT
operator is performed as follows.
- The gradient direction d_p is calculated at each edge pixel p; it is roughly perpendicular to the
stroke orientation.
- A search ray r = p + n∗d_p (n > 0), starting from an edge pixel p along the gradient direction d_p,
is shot until we find another edge pixel q. If these two edge pixels have nearly opposite gradient
orientations, the ray is considered valid. All pixels inside this ray are labeled by the length |p−q|.
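The two steps above can be sketched in plain Python as follows. This is a minimal illustration, not the LADI implementation: the function name, the `max_steps` limit, the π/6 tolerance used for "nearly opposite", and keeping the minimum width where rays overlap are all assumptions. The edge map and the X/Y gradients are taken as precomputed inputs (nested lists).

```python
import math


def stroke_width_transform(edges, gx, gy, max_steps=50, tol=math.pi / 6):
    """Cast a ray from every edge pixel p along its gradient direction d_p.

    edges: 2D list of booleans (edge map); gx, gy: 2D lists of gradients.
    Returns a 2D list where pixels on valid rays hold the length |p - q|
    and all other pixels hold math.inf.
    """
    h, w = len(edges), len(edges[0])
    swt = [[math.inf] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not edges[y][x]:
                continue
            norm = math.hypot(gx[y][x], gy[y][x])
            if norm == 0:
                continue
            dx, dy = gx[y][x] / norm, gy[y][x] / norm  # unit direction d_p
            ray = [(x, y)]
            for n in range(1, max_steps + 1):
                cx = int(round(x + n * dx))
                cy = int(round(y + n * dy))
                if not (0 <= cx < w and 0 <= cy < h):
                    break  # ray left the image without meeting an edge
                if (cx, cy) == ray[-1]:
                    continue  # rounding kept us on the same pixel
                ray.append((cx, cy))
                if not edges[cy][cx]:
                    continue
                # Found edge pixel q: accept the ray only if the gradient
                # at q is nearly opposite to d_p (assumed tolerance: tol).
                qnorm = math.hypot(gx[cy][cx], gy[cy][cx])
                if qnorm > 0:
                    dot = (dx * gx[cy][cx] + dy * gy[cy][cx]) / qnorm
                    if dot <= -math.cos(tol):
                        width = math.hypot(cx - x, cy - y)  # |p - q|
                        for rx, ry in ray:
                            # Assumption: keep the smallest width seen.
                            swt[ry][rx] = min(swt[ry][rx], width)
                break
    return swt


# Toy 1 x 7 frame: a bright stroke whose borders are at x = 2 and x = 5.
edges = [[False, False, True, False, False, True, False]]
gx = [[0.0, 0.0, 1.0, 0.0, 0.0, -1.0, 0.0]]
gy = [[0.0] * 7]
result = stroke_width_transform(edges, gx, gy)
print(result[0])
```

In the toy frame, the rays shot from both borders meet the opposite border with a reversed gradient, so the pixels between them receive the stroke width, while the background stays unlabeled.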
The next step is to group adjacent pixels in the resulting SWT image into CCs. This is done by
applying a flood-fill algorithm based on consistency in stroke width and color. The CCs are then
filtered using a set of simple heuristic rules concerning the CC size, position, aspect ratio and color
uniformity. The remaining CCs are iteratively merged into words and textlines based on a proposed
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Editor
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Size
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science