Page - 193 - in Document Image Processing


J. Imaging 2018, 4, 32

The new dataset AcTiV 2.0 includes 189 video sequences, 4063 key frames, 10,415 text images and three video-stream resolutions; the newly added one is SD (480 × 360). A brief comparison in terms of content between the initial and new versions of the proposed dataset is presented in Table 2. The architecture of the new dataset is completely different from the old one. In addition to the videos and their annotation XML files, AcTiV 2.0 includes two dedicated datasets for the detection and recognition tasks (see Figure 4).

Table 2. Statistics of AcTiV 1.0 and AcTiV 2.0.

             #Resolutions   #Videos   #Frames   #Cropped Images
AcTiV 1.0    2              80        1843      -
AcTiV 2.0    3              189       4063      10,415

Figure 4. Architecture of AcTiV 2.0 and statistics of the detection (D) and recognition (R) datasets.
[Figure: resolutions shown are 1920 x 1080, 720 x 576 and 460 x 380; the per-channel statistics are listed below.]

Detection dataset (D):

TV Channel     #Frames
AlJazeeraHD    909
France24       874
RussiaToday    882
TunisiaNat1    1099
TunisiaNat1+   299

Recognition dataset (R):

TV Channel     #Lines   #Words   #Characters
AlJazeeraHD    2367     9958     57189
France24       2276     7084     40520
RussiaToday    2633     16543    96990
TunisiaNat1    2411     10998    64493
TunisiaNat1+   631      2635     15371

• AcTiV-D is a dataset of non-redundant frames used to build and evaluate methods for detecting text regions in HD/SD frames. A total of 4063 frames have been hand-selected with particular attention to achieving a high diversity in the depicted text regions. Figure 5 provides examples from AcTiV-D for typical problems in video text detection. To test the systems' ability to locate text under different conditions, the proposed dataset includes some frames that contain the same text region but with different backgrounds, and some others without any text component.

• AcTiV-R is a dataset of textline images that can be used to build and evaluate Arabic text recognition systems. Different fonts (more than 6), sizes, backgrounds, colors, contrasts and occlusions are represented in the dataset. Figure 6 illustrates typical examples from AcTiV-R. The collected text images cover a broad range of characteristics that distinguish video frames from scanned documents. AcTiV-R consists of 10,415 textline images, 44,583 words and 259,192 characters. To have an easily accessible representation of Arabic text, it is transformed into a set of Latin labels with a suffix that refers to the letter's position in the word: _B: Begin, _M: Middle, _E: End, and _I: Isolate. An example is shown in Figure 1. During the annotation process, we have considered 164 Arabic character forms:
  – 125 letters, i.e., taking into account this "positioning" variability;
  – 15 additional characters, i.e., combined with the diacritic sign "Chadda";
  – 10 digits; and
  – 14 punctuation marks, including the white space.
The different character labels can be observed in Table 3. The same table gives for each character its frequency in the dataset.

More details about the statistics of the detection and recognition datasets are in Figure 4.
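The position-suffix labelling described above can be illustrated with a minimal Python sketch. It assumes that a label is a Latin base name followed by an underscore and one of the four suffixes, and that labels without such a suffix (for instance digits or punctuation) are stored without one; the label names used here (`Ba_B`, `Alif_E`, `Space`) are hypothetical and are not taken from Table 3. The sketch also checks that the four groups of character forms listed in the text add up to 164.

```python
# Position suffixes as listed in the text.
POSITION_SUFFIXES = {"B": "Begin", "M": "Middle", "E": "End", "I": "Isolate"}


def split_label(label):
    """Split a label such as 'Ba_B' into (base name, position).

    Assumption: labels without a recognised suffix carry no position
    information and are returned unchanged with position None.
    """
    base, sep, suffix = label.rpartition("_")
    if sep and base and suffix in POSITION_SUFFIXES:
        return base, POSITION_SUFFIXES[suffix]
    return label, None


# Hypothetical label names, for illustration only (not from Table 3).
example_labels = ["Ba_B", "Alif_E", "Space"]
parsed = [split_label(label) for label in example_labels]

# Counts of character forms as stated in the text.
FORM_COUNTS = {
    "letters (with positional variants)": 125,
    "characters combined with Chadda": 15,
    "digits": 10,
    "punctuation marks (incl. white space)": 14,
}
total_forms = sum(FORM_COUNTS.values())
```

Using `rpartition` splits on the last underscore only, so a base name that itself contains an underscore would still be handled; whether such names occur in the actual label set is not stated in the text.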
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Informatik