Document Image Processing
Page - 79 -
J. Imaging 2018, 4, 37

5.1. Data Sets and Evaluation Protocols

In this subsection, we discuss the datasets and the experimental settings that we follow in the experiments. Our datasets, given in Table 2, comprise scanned English books from a digital library collection. We manually created ground truth at the word level for the quantitative evaluation of the methods. The first collection (D1) of words is from a book which is reasonably clean. The second dataset (D2) is larger in size and is used to demonstrate the performance in the case of heterogeneous print styles. The third dataset (D3) is a noisy book and is used to demonstrate the utility of our method on degraded collections. We have also given the results over the popular George Washington dataset.

For the experiments, we extract profile features [11] for each of the word images. In this, we divide the image horizontally into two parts and the following features are computed: (i) the vertical profile, i.e., the number of ink pixels in each column; (ii) the location of the lowermost ink pixel; (iii) the location of the uppermost ink pixel; and (iv) the number of ink-to-background transitions. The profile features are calculated on binarized word images obtained using the Otsu thresholding algorithm. The features are normalized to [0, 1], so as to avoid dominance of any specific feature.

To evaluate the quantitative performance, multiple query images were generated. The query images are selected such that they have multiple occurrences in the database and are mostly functional words and do not include the stop words. The performance is measured by mean Average Precision (mAP), which is the mean of the area under the precision-recall curve for all the queries.

Table 2. Details of the datasets considered in the experiments. The first collection (D1) of words is from a book which is reasonably clean. The second dataset (D2) is obtained from 2 books and is used to demonstrate the performance in the case of heterogeneous print styles. The third dataset (D3) is a noisy book.

Dataset  Source   Type   #Images  #Queries
D1       1 Book   Clean  14,510   100
D2       2 Books  Clean  32,180   100
D3       1 Book   Noisy  4,100    100

5.2. Experimental Settings

For representing word images, we prefer a fixed-length sequence representation of the visual content, i.e., each word image is represented as a fixed-length sequence of vertical strips. A set of features f_1, ..., f_L is extracted, where f_i ∈ R^M is the feature representation of the i-th vertical strip and L is the number of vertical strips. This can be considered as a single feature vector F ∈ R^d of size d = LM. We implement the query-specific alignment-based solution as discussed in Section 4. For the query-expansion-based solution, we identify the five most similar samples to the query using approximate nearest neighbor search and compute their mean.

Each dataset contains certain words which are more frequent than others. The number of samples in the frequent word classes is larger than in the rare classes. The retrieval results for frequent queries give better performance because the number of relevant samples available in the dataset is greater. It is worth emphasizing that for the method proposed in this paper (QS DTW), the degradation in the performance for rare queries is much less compared to other methods.

5.3. Results for Frequent Queries

Table 3 compares the retrieval performance of the direct query classifier (DQC) with the nearest neighbor classifier using different options for distance measures. The performance is shown in terms of mean average precision (mAP) values on three datasets. For the nearest neighbor classifier, we experimented with five distance measures: naive DTW distance, Fast approximate DTW distance [20], query specific DTW (QS DTW) distance, FastDTW [30] and Euclidean distance. We see that DTW
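As an illustration of the profile features described in Section 5.1, the following is a minimal sketch, not the authors' implementation. It assumes dark ink on a light background and an 8-bit grayscale image that is not constant. It omits the horizontal split into two parts. The normalization (dividing by the image height, or by the largest transition count) is an assumption, and the names `otsu_threshold` and `profile_features` are hypothetical.

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu threshold of a uint8 image: maximizes between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    denom = omega * (1.0 - omega)
    denom[denom == 0] = np.nan               # thresholds with an empty class
    sigma_b = (mu[-1] * omega - mu) ** 2 / denom
    return int(np.nanargmax(sigma_b))

def profile_features(gray):
    """Return an (L, 4) array with one row of features per pixel column."""
    h, _ = gray.shape
    ink = gray <= otsu_threshold(gray)       # assumption: ink is darker
    count = ink.sum(axis=0)                  # (i) ink pixels per column
    has_ink = count > 0
    # Row index of the uppermost / lowermost ink pixel (0 if the column is empty).
    upper = np.where(has_ink, np.argmax(ink, axis=0), 0)
    lower = np.where(has_ink, h - 1 - np.argmax(ink[::-1], axis=0), 0)
    # (iv) ink-to-background transitions, scanning each column top to bottom.
    trans = (ink[:-1] & ~ink[1:]).sum(axis=0)
    span = max(h - 1, 1)
    return np.stack(
        [count / h, lower / span, upper / span, trans / max(1, trans.max())],
        axis=1,
    )
```

Each row of the result corresponds to one pixel column and lies in [0, 1], in the order (i), (ii), (iii), (iv) used in the text.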
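The mAP measure of Section 5.1 can be sketched as below. This uses the common non-interpolated form of average precision as an approximation of the area under the precision-recall curve. The text does not say which variant the authors use, so treat this as an assumption. Each ranked list is assumed to cover the whole database, so that every relevant sample appears in it. The function names are hypothetical.

```python
import numpy as np

def average_precision(ranked_relevance):
    """AP of one query; ranked_relevance[k] is truthy if the k-th result is relevant."""
    rel = np.asarray(ranked_relevance, dtype=bool)
    total = rel.sum()
    if total == 0:
        return 0.0                            # no relevant sample retrieved
    hits = np.cumsum(rel)                     # relevant results seen so far
    precision_at_k = hits / np.arange(1, len(rel) + 1)
    # Average the precision values at the ranks of the relevant results.
    return float(precision_at_k[rel].sum() / total)

def mean_average_precision(per_query_relevance):
    """Mean of the per-query average precision values."""
    return float(np.mean([average_precision(r) for r in per_query_relevance]))
```

For instance, a ranking with relevant results at positions 1 and 3 has precision 1 and 2/3 at those ranks, so its average precision is 5/6.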
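The fixed-length representation and the query-expansion step of Section 5.2 might look as follows. This is a sketch under stated assumptions. Exact brute-force search with Euclidean distance stands in for the approximate nearest neighbor search mentioned in the text, and `flatten_strips` and `expand_query` are hypothetical names.

```python
import numpy as np

def flatten_strips(strips):
    """Turn L strip features of size M, shape (L, M), into one vector of size d = L * M."""
    return np.asarray(strips, dtype=float).reshape(-1)

def expand_query(query, database, k=5):
    """Return the mean of the k database vectors closest to the query.

    Assumption: exact Euclidean search replaces the approximate search of the paper.
    """
    database = np.asarray(database, dtype=float)
    dists = np.linalg.norm(database - query, axis=1)
    nearest = np.argsort(dists)[:k]           # indices of the k closest samples
    return database[nearest].mean(axis=0)
```

The expanded vector can then be used in place of the original query when ranking the database.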
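To make the contrast between the DTW and Euclidean distances listed in Section 5.3 concrete, here is a textbook quadratic-time DTW over sequences of strip features. It is a sketch only. The local cost (Euclidean distance between strips) and the absence of path-length normalization are assumptions, and this is neither the query specific DTW of Section 4 nor the approximations of [20] and [30].

```python
import numpy as np

def dtw_distance(a, b):
    """Naive DTW between sequences a (n, M) and b (m, M) of strip features."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # Pairwise local cost between every strip of a and every strip of b.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend the cheapest of the three admissible predecessor paths.
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return float(acc[n, m])

def euclidean_distance(a, b):
    """Euclidean distance between equal-length sequences, compared strip by strip."""
    return float(np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)))
```

Two sequences that differ only by a local stretch, such as `[[0], [0], [1], [2]]` and `[[0], [1], [2], [2]]`, have zero DTW distance but a nonzero Euclidean distance, which is why alignment-based measures tolerate variation in character width.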
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
English
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science