Page - 189 - in Document Image Processing


J. Imaging 2018, 4, 32

identification [6]. A Video OCR system is generally composed of four stages: detection, tracking, extraction and recognition. The first two steps consist in locating text regions in video frames and generating the bounding boxes of text lines as an output. Text extraction aims at extracting text pixels and removing background ones. The recognition task converts image regions into text strings. In this work, we focus especially on the detection and recognition steps.

Figure 1. Example of an Arabic video frame including scene and artificial texts (a). Decomposition of an Arabic word into characters (b).

Compared to scanned documents, text detection and recognition in video frames are more challenging. The major challenges are:

• Text pattern variability: unknown font size and font family, different colors and alignment (even in the same TV channel).
• Background complexity: text-like objects in video frames, such as fences, bricks and signs, can be confused with text characters.
• Video quality: acquisition conditions, compression artifacts and low resolution.

All these challenges may give rise to failures in video text detection. The present study focuses on the Arabic video OCR problem. This introduces many additional challenges related to Arabic script [7]. Compared to Latin, Arabic text has special characteristics such as the presence of diacritics, non-uniform inter/intra-word distance and cursiveness of the script, i.e., characters may have up to four shapes depending on their position in the word (for examples, see Figure 1b).

Several techniques have been proposed in the conventional field of Arabic OCR in scanned documents [7–10]. However, few attempts have been made to develop detection and recognition systems for overlaid text in Arabic news video [11–13]. These systems were tested on private datasets with different evaluation protocols and metrics, which makes direct comparison and objective benchmarking rather impractical. For instance, in [11], the proposed text detector was evaluated on a private set of 150 video images. In [13], Yousfi et al. evaluated their text detection system on two private test sets of 164 and 201 video frames. Therefore, the availability of an annotated and public dataset is of key importance for the Arabic video text analysis community.

In this paper, we present AcTiV 2.0 as an open Arabic-Text-in-Video dataset dedicated to benchmarking and comparison of systems for Arabic text detection, tracking and recognition. AcTiV 2.0 is an important extension of the one published in ICDAR 2015 [14]. It includes 189 video clips with an average length of 10 min per sequence for a global duration of about 31 h. These video sequences have been collected from four different Arabic news channels during the period between October 2013 and March 2016. In the present work, three video resolutions were chosen: HD (High Definition, 1920×1080), SD (Standard Definition, 720×576) and SD (480×360). The latter resolution concerns video clips that have been downloaded from the official YouTube channel of TunisiaNat1 TV.

The paper is organized as follows: In Section 2, we present related work on datasets for text detection/recognition problems. Then, we present in Section 3 the AcTiV 2.0 dataset in terms of features, statistics and annotations. We detail the evaluation protocols in Section 4 and present the experimental results in Section 5. In Section 6, we draw the conclusions and discuss future work.
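As a quick sanity check on the dataset figures quoted above, the short Python sketch below recomputes the approximate total duration from the stated clip count and average clip length, and the pixel count per frame for each of the three listed resolutions. The function and variable names are illustrative assumptions and not part of the AcTiV 2.0 distribution; since the 10 min figure is an average, the computed total is only an estimate.

```python
# Figures quoted in the text; all names here are hypothetical, for illustration.
NUM_CLIPS = 189
AVG_MINUTES_PER_CLIP = 10  # stated average, so the total is only approximate

# The three resolutions listed in the text, as (width, height) in pixels.
RESOLUTIONS = {
    "HD": (1920, 1080),
    "SD (720x576)": (720, 576),
    "SD (480x360)": (480, 360),
}


def total_hours(num_clips, avg_minutes):
    """Estimate the total duration in hours from a clip count and mean length."""
    return num_clips * avg_minutes / 60


def pixels_per_frame(width, height):
    """Number of pixels in a single frame of the given resolution."""
    return width * height


if __name__ == "__main__":
    hours = total_hours(NUM_CLIPS, AVG_MINUTES_PER_CLIP)
    print(f"Estimated duration: {hours:.1f} h")
    for name, (w, h) in RESOLUTIONS.items():
        print(f"{name}: {pixels_per_frame(w, h)} pixels per frame")
```

The estimate of 31.5 h is consistent with the "about 31 h" stated for the dataset. An HD frame carries exactly 12 times as many pixels as a 480×360 frame, which gives a rough sense of how far image quality varies across the clips.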
Document Image Processing
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science