Page - 197 - in Document Image Processing

J. Imaging 2018, 4, 32

4. Evaluation Protocols and Metrics

As mentioned before, the proposed AcTiV datasets are mainly dedicated to training and evaluating existing systems for Arabic text detection and recognition in news video. To objectively compare and measure the performance of these systems, we proposed to partition each of the AcTiV-D and AcTiV-R datasets into train, test and closed-test subsets, taking advantage of the variability in data content. It should be noted that the latter subset contains private data (quite similar to the test set) that are used in the context of competitions only. In addition, we suggested a set of evaluation protocols such that different techniques could be directly compared. In other words, the proposed protocols allow us to closely analyze the system behavior towards a given resolution (HD/SD) and/or quality (DBS/Web).

4.1. Detection Protocols and Metrics

Table 4 depicts the detection protocols.

• Protocol 1 aims to measure the performance of single-frame based methods to detect texts in HD frames.
• Protocol 4 is similar to Protocol 1, differing only by the channel resolution. All SD (720 × 576) channels in our database can be targeted by this protocol, which is split into four sub-protocols: three channel-dependent (Protocols 4.1, 4.2 and 4.3) and one channel-free (Protocol 4.4).
• Protocol 4bis is dedicated to the newly added resolution (480 × 360) for the TunisiaNat1 TV channel. The main idea of this protocol is to train a given system with SD (720 × 576) data, i.e., Protocol 4.3, and test it with different data resolution and quality.
• Protocol 7 is the generic version of the previous protocols where text detection is evaluated regardless of data quality.

Table 4. Detection Evaluation Protocols.
Protocol  TV Channel    Training-Set 1  Training-Set 2  Test-Set 1  Test-Set 2  Closed-Set
                        #Frames         #Frames         #Frames     #Frames     #Frames
1         AlJazeeraHD   337             610             87          196         103
4         France24      331             600             80          170         104
4         RussiaToday   323             611             79          171         100
4         TunisiaNat1   492             788             116         205         106
4         All SD        1146            1999            275         546         310
4bis      TunisiaNat1+  -               -               -           149         150
7         All           1483            2609            362         891         563

Metrics: The performance of a text detector is evaluated based on precision, recall and F-measure metrics that are defined as:

$$\mathrm{Precision} = \frac{\sum_{i=1}^{|D|} \mathrm{match}_D(D_i)}{|D|} \tag{1}$$

$$\mathrm{Recall} = \frac{\sum_{i=1}^{|G|} \mathrm{match}_G(G_i)}{|G|} \tag{2}$$

$$F_{\mathrm{measure}} = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{3}$$

where $D$ is the list of detected rectangles, $G$ is the list of ground-truth rectangles, and $\mathrm{match}_D$ and $\mathrm{match}_G$ are the respective matching functions. These measures are calculated using our evaluation tool [48], which takes into account all types of matching cases between G bounding boxes and D ones, i.e., one-to-one, one-to-many and many-to-one matching. In the matching procedure, two quality constraints, namely $t_p$ and $t_r$, are utilized. $t_p \in [0,1]$ is the constraint on area precision and $t_r \in [0,1]$ is
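
As a rough illustration of Equations (1) to (3), the following Python sketch computes precision, recall and F-measure from per-rectangle match scores. It is not the evaluation tool [48]: it handles only one-to-one matches, whereas the tool also covers one-to-many and many-to-one cases. The function names, the rectangle format `(x_min, y_min, x_max, y_max)` and the threshold values in the example are hypothetical. Treating $t_r$ as a constraint on area recall is also an assumption, since the source sentence defining it is cut off.

```python
from typing import List, Sequence, Tuple

# Hypothetical rectangle format: (x_min, y_min, x_max, y_max)
Rect = Tuple[float, float, float, float]


def area(r: Rect) -> float:
    """Area of a rectangle; degenerate rectangles have area 0."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def intersection_area(a: Rect, b: Rect) -> float:
    """Area of the overlap between two axis-aligned rectangles."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def one_to_one_match_scores(
    detected: Sequence[Rect],
    ground_truth: Sequence[Rect],
    t_p: float,
    t_r: float,
) -> Tuple[List[float], List[float]]:
    """Greedy one-to-one matching (a simplification of the full procedure).

    A pair (d, g) is accepted when the overlap covers at least t_p of d
    (area precision) and at least t_r of g (area recall, an assumption).
    Returns the match scores for D and for G, each score being 0 or 1.
    """
    match_d = [0.0] * len(detected)
    match_g = [0.0] * len(ground_truth)
    used = set()  # indices of detections that are already matched
    for j, g in enumerate(ground_truth):
        for i, d in enumerate(detected):
            if i in used or area(d) == 0.0 or area(g) == 0.0:
                continue
            inter = intersection_area(d, g)
            if inter / area(d) >= t_p and inter / area(g) >= t_r:
                match_d[i] = match_g[j] = 1.0
                used.add(i)
                break
    return match_d, match_g


def detection_metrics(
    match_d: Sequence[float], match_g: Sequence[float]
) -> Tuple[float, float, float]:
    """Precision (Eq. 1), recall (Eq. 2) and F-measure (Eq. 3)."""
    precision = sum(match_d) / len(match_d) if match_d else 0.0
    recall = sum(match_g) / len(match_g) if match_g else 0.0
    total = precision + recall
    f_measure = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy example: two detections, three ground-truth boxes, one exact match.
detected = [(0, 0, 10, 10), (50, 50, 60, 60)]
ground_truth = [(0, 0, 10, 10), (100, 100, 110, 110), (200, 200, 210, 210)]
# Threshold values here are arbitrary, chosen only for the illustration.
match_d, match_g = one_to_one_match_scores(detected, ground_truth, t_p=0.5, t_r=0.5)
precision, recall, f_measure = detection_metrics(match_d, match_g)
```

In this toy case one of the two detections and one of the three ground-truth boxes are matched, so precision is 1/2 and recall is 1/3. Empty lists are mapped to a score of 0 to avoid division by zero, which is a design choice of this sketch and not something the source specifies.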

Title: Document Image Processing
Authors: Ergina Kavallieratou; Laurence Likforman-Sulem
Editor: MDPI
Location: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Size: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Computer Science