Page 199 of Document Image Processing
J. Imaging 2018, 4, 32
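$$\mathrm{WRR} = \frac{\#\,\text{words\_correctly\_recognized}}{\#\,\text{words}} \qquad (5)$$

$$\mathrm{LRR} = \frac{\#\,\text{lines\_correctly\_recognized}}{\#\,\text{lines}} \qquad (6)$$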
Figure 11 shows an example explaining the impact on the CRR and WRR metrics resulting from
substitution and deletion errors.
Figure 11. Example of CRR and WRR computation based on output errors.
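As a rough illustration of Equations (5) and (6), the following Python sketch computes WRR and LRR for a list of ground-truth lines and the corresponding recognizer outputs. It is a minimal sketch under stated assumptions, not the evaluation tool used by the authors: the function name `word_and_line_recognition_rates` is hypothetical, a word counts as correctly recognized when it falls in a matching block of a `difflib.SequenceMatcher` alignment of the two word sequences, and a line counts as correct only when all of its words match exactly.

```python
from difflib import SequenceMatcher


def word_and_line_recognition_rates(ground_truth, recognized):
    """Return (WRR, LRR) for parallel lists of text lines.

    Assumption: words are whitespace-separated tokens, and a word is
    "correctly recognized" if it lies in a matching block of the
    alignment between the ground-truth and recognized word sequences.
    """
    if len(ground_truth) != len(recognized):
        raise ValueError("both lists must contain the same number of lines")

    total_words = 0
    correct_words = 0
    correct_lines = 0
    for gt_line, rec_line in zip(ground_truth, recognized):
        gt_words = gt_line.split()
        rec_words = rec_line.split()
        # Align the two word sequences; autojunk is disabled so that
        # frequent words are never discarded by the matcher.
        matcher = SequenceMatcher(None, gt_words, rec_words, autojunk=False)
        correct_words += sum(b.size for b in matcher.get_matching_blocks())
        total_words += len(gt_words)
        # A line is correct only if every word matches, in order.
        correct_lines += int(gt_words == rec_words)

    wrr = correct_words / total_words        # Equation (5)
    lrr = correct_lines / len(ground_truth)  # Equation (6)
    return wrr, lrr


if __name__ == "__main__":
    gt = ["the quick brown fox", "hello world"]
    rec = ["the quick brown fox", "hello word"]  # one substitution error
    print(word_and_line_recognition_rates(gt, rec))
```

In this toy input, a single substituted word lowers WRR only slightly but halves LRR, which is why the line-level metric is the strictest of the three.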
It is worth noting that the proposed protocols help us understand how generic a system is,
i.e., whether it performs well for Protocols 7 and 9 (independently of the TV channel). For instance, in
the AcTiVComp contest, we observed that some participating systems perform well in HD resolution
only, while others are quite generic (i.e., good in both SD and HD resolutions). Other systems are
incompatible with a specific resolution. Various examples of using these evaluation protocols will be
presented in the next section.
5. Application of AcTiV Datasets
The proposed datasets have been used to build and evaluate two systems for Arabic video text
detection and recognition. The text detector is based on a hybrid approach composed of a CC-based
heuristic phase and a machine learning verification procedure. The recognizer system consists of
Multi-Dimensional RNNs (MDRNNs) [49] coupled with a Connectionist Temporal Classification (CTC)
layer [50].
5.1. LADI Detector
The LADI text detection system is based on our previous work [14,46], with new
enhancements that consider the color consistency of nearby text regions. Our text detector represents a
hybrid approach consisting of two stages: a CC-based heuristic algorithm and a machine learning
classification. The main idea of this system is to combine two techniques: an adapted version of the
SWT algorithm and a convolutional auto-encoder (CAE). As shown in Figure 12, the first stage starts
with a preprocessing step to decrease noise and fine detail. It then computes the edge map and the X and Y
gradients from the processed frame using the Canny and Sobel operators, respectively. After that, the SWT
operator is performed as follows.
- The gradient direction d_p is calculated at each edge pixel p; it is roughly perpendicular to the
stroke orientation.
- A search ray r = p + n ∗ d_p (n > 0), starting from an edge pixel p along the gradient direction d_p,
is shot until we find another edge pixel q. If these two edge pixels have nearly opposite gradient
orientations, the ray is considered valid. All pixels inside this ray are labeled by the length |p − q|.
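The two ray-casting steps above can be sketched in Python with NumPy as follows. This is a simplified illustration and not the authors' adapted SWT: the function name `stroke_width_transform`, the maximum ray length `max_len`, and the angular tolerance `angle_tol` are assumptions introduced here. The edge map and the X and Y gradients are taken as inputs (in the described pipeline they would come from the Canny and Sobel operators), and when several valid rays cross a pixel the sketch keeps the smallest length, which is a common convention rather than something stated in the text.

```python
import numpy as np


def stroke_width_transform(edges, gx, gy, max_len=50, angle_tol=np.pi / 6):
    """Label pixels on valid rays with the stroke length |p - q|.

    edges  : boolean H x W edge map
    gx, gy : H x W X and Y gradients
    Pixels not covered by any valid ray keep the value np.inf.
    """
    h, w = edges.shape
    swt = np.full((h, w), np.inf)
    mag = np.hypot(gx, gy)

    for y, x in zip(*np.nonzero(edges)):
        y, x = int(y), int(x)
        if mag[y, x] == 0:
            continue
        # Unit gradient direction d_p at the edge pixel p.
        dx, dy = gx[y, x] / mag[y, x], gy[y, x] / mag[y, x]
        ray = [(y, x)]
        for n in range(1, max_len + 1):
            # Point on the ray r = p + n * d_p, snapped to the pixel grid.
            cx = int(round(x + n * dx))
            cy = int(round(y + n * dy))
            if not (0 <= cx < w and 0 <= cy < h):
                break  # ray left the image without meeting an edge
            if (cy, cx) == ray[-1]:
                continue  # rounding gave the same pixel again
            ray.append((cy, cx))
            if edges[cy, cx]:
                if mag[cy, cx] > 0:
                    qx = gx[cy, cx] / mag[cy, cx]
                    qy = gy[cy, cx] / mag[cy, cx]
                    # Nearly opposite gradients: dot product close to -1.
                    if dx * qx + dy * qy < -np.cos(angle_tol):
                        length = float(np.hypot(cx - x, cy - y))
                        for ry, rx in ray:
                            swt[ry, rx] = min(swt[ry, rx], length)
                break  # stop at the first edge pixel q, valid or not
    return swt


if __name__ == "__main__":
    # Synthetic vertical stroke: two edges whose gradients face each other.
    edges = np.zeros((5, 9), dtype=bool)
    edges[:, 2] = True
    edges[:, 6] = True
    gx = np.zeros((5, 9))
    gy = np.zeros((5, 9))
    gx[:, 2] = 1.0
    gx[:, 6] = -1.0
    print(stroke_width_transform(edges, gx, gy))
```

Whether the ray must follow the gradient or its opposite depends on text polarity (dark text on a light background or the reverse); the sketch follows the gradient only, as in the description above.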
The next step is to group adjacent pixels in the resulting SWT image into CCs. This is done by
applying a flood-fill algorithm based on consistency in stroke width and color. The CCs are then
filtered using a set of simple heuristic rules concerning the CC size, position, aspect ratio and color
uniformity. The remaining CCs are iteratively merged into words and text lines based on a proposed
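The grouping and filtering steps described in this paragraph can be illustrated with the following Python sketch. It is only a sketch under stated assumptions, not the LADI implementation: the function names `group_components` and `filter_components` and all thresholds (`sw_ratio`, `color_tol`, `min_area`, `max_aspect`) are hypothetical, the flood fill uses 8-connectivity, and only the size and aspect-ratio rules are shown; the position and color-uniformity rules and the merging into words and text lines are omitted.

```python
from collections import deque

import numpy as np


def group_components(swt, color, sw_ratio=3.0, color_tol=40.0):
    """Flood-fill SWT pixels into connected components (CCs).

    swt   : H x W stroke widths, np.inf where no stroke was found
    color : H x W x 3 image used for the color-consistency check
    Two neighboring pixels are joined when their stroke widths differ
    by at most a factor of sw_ratio and their colors are closer than
    color_tol (both thresholds are illustrative).
    """
    h, w = swt.shape
    labels = np.zeros((h, w), dtype=int)  # 0 means "no component"
    count = 0
    for sy, sx in zip(*np.nonzero(np.isfinite(swt))):
        if labels[sy, sx]:
            continue
        count += 1
        labels[sy, sx] = count
        queue = deque([(int(sy), int(sx))])
        while queue:
            y, x = queue.popleft()
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue
                    if labels[ny, nx] or not np.isfinite(swt[ny, nx]):
                        continue
                    a, b = swt[y, x], swt[ny, nx]
                    similar_width = max(a, b) <= sw_ratio * min(a, b)
                    diff = color[y, x].astype(float) - color[ny, nx].astype(float)
                    similar_color = np.linalg.norm(diff) <= color_tol
                    if similar_width and similar_color:
                        labels[ny, nx] = count
                        queue.append((ny, nx))
    return labels, count


def filter_components(labels, count, min_area=4, max_aspect=10.0):
    """Keep the labels of CCs that pass simple size and aspect-ratio rules."""
    kept = []
    for lab in range(1, count + 1):
        ys, xs = np.nonzero(labels == lab)
        height = ys.max() - ys.min() + 1
        width = xs.max() - xs.min() + 1
        aspect = max(width / height, height / width)
        if ys.size >= min_area and aspect <= max_aspect:
            kept.append(lab)
    return kept


if __name__ == "__main__":
    swt = np.full((5, 10), np.inf)
    swt[:, 1:3] = 2.0   # first stroke
    swt[:, 6:8] = 2.0   # second stroke
    swt[0, 9] = 2.0     # isolated noise pixel
    color = np.zeros((5, 10, 3), dtype=np.uint8)
    color[:, 6:8] = 200
    labels, count = group_components(swt, color)
    print(count, filter_components(labels, count))
```

In the toy input, the two strokes form separate components and the single noise pixel is rejected by the minimum-area rule.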
Document Image Processing
- Title: Document Image Processing
- Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
- Publisher: MDPI
- Place: Basel
- Date: 2018
- Language: German
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03897-106-1
- Dimensions: 17.0 x 24.4 cm
- Pages: 216
- Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category: Computer science