J. Imaging 2018, 4, 37

5.1. Data Sets and Evaluation Protocols

In this subsection, we discuss the datasets and the experimental settings that we follow in the experiments. Our datasets, given in Table 2, comprise scanned English books from a digital library collection. We manually created ground truth at word level for the quantitative evaluation of the methods. The first collection (D1) of words is from a book which is reasonably clean. The second dataset (D2) is larger in size and is used to demonstrate the performance in the case of heterogeneous print styles. The third dataset (D3) is a noisy book and is used to demonstrate the utility of our method on degraded collections. We have also given the results over the popular George Washington dataset.

For the experiments, we extract profile features [11] for each of the word images. For this, we divide the image horizontally into two parts and compute the following features: (i) the vertical profile, i.e., the number of ink pixels in each column; (ii) the location of the lowermost ink pixel; (iii) the location of the uppermost ink pixel; and (iv) the number of ink-to-background transitions. The profile features are calculated on binarized word images obtained using the Otsu thresholding algorithm. The features are normalized to [0, 1] so as to avoid dominance of any specific feature.

To evaluate the quantitative performance, multiple query images were generated. The query images are selected such that they have multiple occurrences in the database; they are mostly functional words and do not include stop words. The performance is measured by mean Average Precision (mAP), which is the mean of the area under the precision-recall curve over all the queries.

Table 2. Details of the datasets considered in the experiments. The first collection (D1) of words is from a book which is reasonably clean. The second dataset (D2) is obtained from 2 books and is used to demonstrate the performance in the case of heterogeneous print styles. The third dataset (D3) is a noisy book.

Dataset   Source    Type    #Images   #Queries
D1        1 Book    Clean   14,510    100
D2        2 Books   Clean   32,180    100
D3        1 Book    Noisy   4100      100

5.2. Experimental Settings

For representing word images, we prefer a fixed-length sequence representation of the visual content, i.e., each word image is represented as a fixed-length sequence of vertical strips. A set of features $f_1, \ldots, f_L$ is extracted, where $f_i \in \mathbb{R}^M$ is the feature representation of the $i$th vertical strip and $L$ is the number of vertical strips. This can be considered as a single feature vector $F \in \mathbb{R}^d$ of size $d = LM$.

We implement the query specific alignment based solution as discussed in Section 4. For the query expansion based solution, we identify the five most similar samples to the query using approximate nearest neighbor search and compute their mean.

Each dataset contains certain words which are more frequent than others. Frequent word classes have more samples than rare classes. Retrieval performs better for frequent queries because the number of relevant samples available in the dataset is greater. It is worth emphasizing that for the method proposed in this paper (QS DTW), the degradation in performance for rare queries is much smaller than for the other methods.

5.3. Results for Frequent Queries

Table 3 compares the retrieval performance of the direct query classifier (DQC) with the nearest neighbor classifier using different options for distance measures. The performance is shown in terms of mean average precision (mAP) values on three datasets. For the nearest neighbor classifier, we experimented with five distance measures: naive DTW distance, Fast approximate DTW distance [20], query specific DTW (QS DTW) distance, FastDTW [30] and Euclidean distance. We see that DTW
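The four profile features of Section 5.1 can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `column_profiles` and `word_features`, the convention of assigning 0 to columns without ink, the scaling used to reach [0, 1], and the reading of "two parts" as an upper and a lower half are all assumptions made for illustration. The input is assumed to be already binarized (for example by Otsu thresholding), with True marking ink.

```python
import numpy as np

def column_profiles(ink):
    """Per-column profile features for a 2-D boolean image (True = ink).

    Returns an array of shape (width, 4) with values in [0, 1].
    Columns without ink get 0 for both pixel locations (an assumed convention).
    """
    h, w = ink.shape
    rows = np.arange(h)[:, None]
    has_ink = ink.any(axis=0)
    scale = max(h - 1, 1)
    # (i) vertical profile: fraction of ink pixels in each column
    vertical = ink.sum(axis=0) / h
    # (ii) row index of the lowermost ink pixel, scaled by image height
    lower = np.where(has_ink, np.where(ink, rows, -1).max(axis=0), 0) / scale
    # (iii) row index of the uppermost ink pixel, scaled by image height
    upper = np.where(has_ink, np.where(ink, rows, h).min(axis=0), 0) / scale
    # (iv) ink-to-background transitions, scanning each column top to bottom
    trans = (ink[:-1] & ~ink[1:]).sum(axis=0).astype(float)
    if trans.max() > 0:
        trans /= trans.max()  # assumed scaling: divide by the largest count
    return np.stack([vertical, lower, upper, trans], axis=1)

def word_features(ink):
    """Profiles of the upper and lower halves, concatenated: shape (width, 8)."""
    half = ink.shape[0] // 2
    return np.hstack([column_profiles(ink[:half]), column_profiles(ink[half:])])
```

Each row of the result describes one pixel column; grouping or resampling columns into the $L$ vertical strips of Section 5.2 is left out of this sketch.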
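The query expansion step of Section 5.2 (the mean of the five samples most similar to the query) could be sketched as follows. Two assumptions are made here: exact Euclidean search stands in for the approximate nearest neighbor search used in the paper, and the query itself is not included in the mean, since the text does not say whether it is. The name `expand_query` is hypothetical.

```python
import numpy as np

def expand_query(query, database, k=5):
    """Return the mean of the k database vectors closest to the query.

    query:    array of shape (d,)
    database: array of shape (n, d), one fixed-length feature vector per word image
    Uses exact Euclidean distances (a stand-in for approximate search).
    """
    dists = np.linalg.norm(database - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return database[nearest].mean(axis=0)
```

The expanded vector can then be used in place of the original query when ranking the database.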
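The mAP measure of Section 5.1 can be made concrete with a short sketch. It uses the common non-interpolated definition of average precision (precision averaged over the ranks at which relevant items occur), which is one standard way of estimating the area under the precision-recall curve; the paper does not state which estimator was used. The sketch also assumes each ranking covers the whole database, so that every relevant item appears in it.

```python
import numpy as np

def average_precision(relevant):
    """Average precision for one ranked list of 0/1 relevance flags (best match first)."""
    rel = np.asarray(relevant, dtype=bool)
    if not rel.any():
        return 0.0
    hits = np.cumsum(rel)                   # relevant items seen up to each rank
    ranks = np.arange(1, rel.size + 1)
    return float((hits[rel] / ranks[rel]).mean())

def mean_average_precision(rankings):
    """Mean of the per-query average precision values."""
    return float(np.mean([average_precision(r) for r in rankings]))
```

For example, a ranking with relevance flags [1, 0, 1, 0] has precision 1/1 at the first hit and 2/3 at the second, so its average precision is (1 + 2/3) / 2 = 5/6.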


Document Image Processing

Title: Document Image Processing
Authors: Ergina Kavallieratou; Laurence Likforman-Sulem
Publisher: MDPI
Place: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Dimensions: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Computer Science