Document Image Processing
Page - 192 - in Document Image Processing

J. Imaging 2018, 4, 32

training/test samples and best achieved result. As depicted by this table, publicly available datasets for Arabic Video OCR systems are limited to one work for the recognition task and are even non-existent for detection and tracking problems. Yousfi et al. [44] put forward a dataset for superimposed text recognition, called Alif. The dataset was composed of 6532 static cropped text images extracted from diverse Arabic TV channels and with about 12% extracted from web sources. This dataset offered only one image resolution.

Table 1. Most important existing datasets for text processing in videos and scene images. “D”, “S” and “R” respectively denote “Detection”, “Segmentation” and “Recognition”.

| Dataset (Year) | Category | Source | Task | # of Images (Train/Test) | # of Text (Train/Test) | Script | Best Scores |
|---|---|---|---|---|---|---|---|
| ICDAR’03 [18] (2003) | Scene text | Camera | D/R | 509 (258/251) | 2276 (1110/1156) | English | 93.1% (R) |
| KAIST [31] (2010) | Scene text | Camera, mobile phone | D/S | 3000 | >5000 | English, Korean | 88% (S) |
| SVT [28] (2010) | Scene text | Google Street View | D/S/R | 350 (100/250) | 904 (257/647) | English | 80.8% (R), 90% (S) |
| NEOCR [33] (2011) | Scene text | Camera | D/R | 659 | 5238 | Eight languages | |
| ICDAR’11 [22] (2011) | Scene text | Camera | D/R | 485 | 1564 | English | 82% (D) |
| MSRA-TD500 [26] (2012) | Scene text | Camera | D | 500 (300/200) | _ | English, Chinese | 75% |
| ICDAR’13 [24] (2013) | Scene text; Artificial text; Video scene | Camera; Web; Camera | D/S/R; D/S/R; D/T/R | 229/233; 410/141; 28 videos | 848/1095; 3564/1439; _ | Spanish, French, English | |
| ALIF [44] (2015) | Artificial text | Video frames | R | 6532 (4152/2199) | | Arabic | 55.03% |
| COCO-Text [34] (2016) | Scene text | MS COCO dataset | D/R | 63,686 (43.6k/10k) | 173,000 | English | 67.16% (D) |
| Total-Text [36] (2017) | Curved scene text | Web | D/R | 1555 (1255/300) | 9330 (words) | English | |

3. Proposed Datasets

In this section, we describe the AcTiV 2.0 dataset in terms of characteristics, statistics and annotation guidelines.
3.1. Data Characteristics and Statistics

As mentioned in the introduction, AcTiV 1.0 (http://tc11.cvc.uab.es/datasets/AcTiV_1) was presented in the ICDAR’15 conference [14] as the first publicly accessible annotated dataset designed to assess the performance of different Arabic Video OCR systems. This database is currently used by several research groups around the world. It was partially used as a benchmark in the first edition of the “AcTiVComp” contest in conjunction with the ICPR’16 conference [45]. The two main challenges addressed by this dataset are text pattern variability and presence of complex backgrounds with various text-like objects.

AcTiV 1.0 consists of 80 video clips recorded from four different Arabic news channels: TunisiaNat1, France24, Russia Today and Aljazeera HD. AcTiV 1.0 is composed of video clips and their corresponding XML files (detailed in Section 3.2). We selected from these video clips 1843 frames dedicated to the detection task.

In [14,46], the first results using AcTiV 1.0 were presented. Based on the obtained results under different evaluation protocols and considering the feedback of AcTiV 1.0 users, it was necessary to extend the content in terms of video clips and resolutions, offering more training samples, especially for deep learning-based methods.
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science