J. Imaging 2018, 4, 32

The new dataset AcTiV 2.0 includes 189 video sequences, 4063 key frames, 10,415 text images and three video-stream resolutions, i.e., the new one is SD (480 × 360). A brief comparison in terms of content between the initial and new version of the proposed dataset is presented in Table 2. The architecture of the new dataset is completely different from the old one. In addition to the videos and their annotation XML files, AcTiV 2.0 includes two appropriate datasets for detection and recognition tasks (see Figure 4).

Table 2. Statistics of AcTiV 1.0 and AcTiV 2.0.

             #Resolutions   #Videos   #Frames   #Cropped Images
AcTiV 1.0    2              80        1843      -
AcTiV 2.0    3              189       4063      10,415

[Figure 4: diagram of the dataset architecture. Resolutions shown: 1920 x 1080, 720 x 576, 460 x 380. The per-channel statistics recoverable from the figure are listed below.]

Detection dataset (D):

TV Channel     #Frames
AlJazeeraHD    909
France24       874
RussiaToday    882
TunisiaNat1    1099
TunisiaNat1+   299

Recognition dataset (R):

TV Channel     #Lines   #Words   #Characters
AlJazeeraHD    2367     9958     57189
France24       2276     7084     40520
RussiaToday    2633     16543    96990
TunisiaNat1    2411     10998    64493
TunisiaNat1+   631      2635     15371

Figure 4. Architecture of AcTiV 2.0 and statistics of the detection (D) and recognition (R) datasets.

• AcTiV-D represents a dataset of non-redundant frames used to build and evaluate methods for detecting text regions in HD/SD frames. A total of 4063 frames have been hand-selected with a particular attention to achieve a high diversity in depicted text regions. Figure 5 provides examples from AcTiV-D for typical problems in video text detection. To test the systems' ability to locate texts under different situations, the proposed dataset includes some frames which contain the same text region but with different backgrounds and some others without any text component.
• AcTiV-R is a dataset of textline images that can be utilized to build and evaluate Arabic text recognition systems. Different fonts (more than 6), sizes, backgrounds, colors, contrasts and occlusions are represented in the dataset. Figure 6 illustrates typical examples from AcTiV-R.
The collected text images cover a broad range of characteristics that distinguish video frames from scanned documents. AcTiV-R consists of 10,415 textline images, 44,583 words and 259,192 characters. To have an easily accessible representation of Arabic text, it is transformed into a set of Latin labels with a suffix that refers to the letter's position in the word: _B: Begin; _M: Middle; _E: End; and _I: Isolate. An example is shown in Figure 1. During the annotation process, we have considered 164 Arabic character forms:
– 125 letters, i.e., taking into account this "positioning" variability;
– 15 additional characters, i.e., combined with the diacritic sign "Chadda";
– 10 digits; and
– 14 punctuation marks including the white space.
The different character labels can be observed in Table 3. The same table gives for each character its frequency in the dataset. More details about the statistics of the detection and recognition datasets are in Figure 4.
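The position-suffix convention described above (_B, _M, _E, _I) can be illustrated with a minimal sketch. It is not the annotation tool used for AcTiV: the Latin letter names (`Kaf`, `Ta`, `Alif`, `Ba`), the `NON_CONNECTING` set and the function `position_labels` are hypothetical and chosen only for illustration; the actual label set is the one given in Table 3. The sketch assumes the usual Arabic joining behaviour, in which a few letters never connect to the letter that follows them.

```python
# Hypothetical Latin letter names; the real AcTiV labels are listed in Table 3.
# Assumed subset of letters that never connect to the FOLLOWING letter.
NON_CONNECTING = {"Alif", "Dal", "Thal", "Ra", "Zay", "Waw"}


def position_labels(letters):
    """Append _B, _M, _E or _I to each letter name of a single word."""
    labels = []
    for i, name in enumerate(letters):
        # A letter joins its predecessor only if that predecessor connects forward.
        joins_prev = i > 0 and letters[i - 1] not in NON_CONNECTING
        # A letter joins its successor only if it is not last and connects forward.
        joins_next = i < len(letters) - 1 and name not in NON_CONNECTING
        if joins_prev and joins_next:
            suffix = "_M"  # Middle
        elif joins_next:
            suffix = "_B"  # Begin
        elif joins_prev:
            suffix = "_E"  # End
        else:
            suffix = "_I"  # Isolate
        labels.append(name + suffix)
    return labels


# A four-letter word whose third letter does not connect forward.
print(position_labels(["Kaf", "Ta", "Alif", "Ba"]))
# → ['Kaf_B', 'Ta_M', 'Alif_E', 'Ba_I']
```

Because the suffix depends on the neighbouring letters, the same letter can receive up to four different labels, which is the "positioning" variability counted in the 125 letter forms.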
Document Image Processing
Title: Document Image Processing
Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
Publisher: MDPI
Place: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Dimensions: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Computer Science