Document Image Processing
Page - 23 -

J. Imaging 2018, 4, 27

14. Pun [26]
15. Shanbhag [27]
16. Triangle [28]
17. Wu-Lu [29]
18. Yean-Chang-Chang [30]
19. Intermodes [31]
20. Minimum (variation of [31])
21. Ergina-Local [32]
22. Sauvola [33]
23. Niblack [34]

A ground-truth image for each “real-world” image is needed to allow a quantitative assessment of the quality of the final binary image. Only the DIBCO dataset [10] had ground-truth images available. This makes the assessment of real-world images extremely difficult [35]. Every care must be taken to guarantee the fairness of the process. The ground-truth images for the other datasets were generated by applying the 23 algorithms above and the bilateral algorithm to all the test images in the Nabuco [7] and LiveMemory [36] datasets. The best binary image was chosen by visual inspection in a blind process, that is, a process in which the people who selected the best image did not know which algorithm had generated it. To increase the degree of fairness and the number of filtering possibilities, the three component images produced by the Decision Making block were all analyzed. The binary images chosen using the methodology above went through salt-and-pepper filtering and were used as the ground-truth images for the assessment below. All the processing time figures presented in this paper were measured on an Intel i7-4510U @ 2.00 GHz x2 with 8 GB RAM, running Linux Mint 18.2 64-bit. All algorithms were coded in Java, possibly by their authors.

3.1. The Nabuco Dataset

The Nabuco bequest encompasses about 6500 letters and postcards written and typed by Joaquim Nabuco [7], totaling about 30,000 pages. Such documents are of great interest to whoever studies the history of the Americas, as Nabuco was one of the key figures in the freedom of black slaves and was the first Brazilian Ambassador to the U.S.A. The documents of Nabuco were digitized by the second author of this paper and the historians of the Joaquim Nabuco Foundation using a table scanner at 200 dpi resolution in true color (24 bits per pixel), between 1992 and 1994. Due to serious storage limitations at the time, the images were saved in the JPEG format with 1% loss.
The historians in the project concluded that a 150 dpi resolution would suffice to represent all the graphical elements in the documents, but the 200 dpi resolution was chosen to be compatible with the FAX devices widely used then. About 200 of the documents in the Nabuco bequest exhibited back-to-front interference. The 15 document images used in this dataset were chosen for being representative of the diversity of documents in such a universe.

Table 1 presents the quantitative results obtained for all the documents in this dataset. P(f/f) stands for the ratio between the number of foreground pixels in the original image mapped onto black pixels and the number of black pixels in the ground-truth image. Similarly, P(b/b) is the ratio between the number of background pixels in the original image mapped onto white pixels of the binary image and the number of white pixels in the ground-truth image. The figures for P(b/b) and P(f/f) are followed by “±” and the value of the standard deviation. The time corresponds to the mean processing time taken by the algorithm to process the images in this dataset. The results are ranked in decreasing order of P(b/b). The results presented in Table 1 show the bilateral filter in third place for this dataset in terms of image quality; however, its standard deviation is much lower than that of the first two. That implies that its quality is more stable across the various document images in this dataset. Figure 5 presents the document
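
The P(f/f) and P(b/b) measures defined above can be illustrated with a minimal Python sketch. It is not the authors' code (the paper states that the algorithms were coded in Java) and rests on several assumptions: images are lists of rows with 0 for black (foreground) and 1 for white (background), a foreground pixel counts as "mapped onto black" when it is black in both the binarized image and the ground truth, and the value after "±" is taken to be the population standard deviation, which the text does not specify. The names `pixel_agreement` and `summarize` are hypothetical.

```python
from statistics import mean, pstdev

BLACK, WHITE = 0, 1  # assumed encoding: 0 = foreground (ink), 1 = background (paper)


def pixel_agreement(binary, truth):
    """Return (P(f/f), P(b/b)) for one binarized image and its ground truth.

    Both arguments are equally sized lists of rows containing BLACK or WHITE.
    """
    ff = bb = truth_black = truth_white = 0
    for bin_row, gt_row in zip(binary, truth):
        for b, g in zip(bin_row, gt_row):
            if g == BLACK:
                truth_black += 1
                if b == BLACK:  # foreground pixel kept as black
                    ff += 1
            else:
                truth_white += 1
                if b == WHITE:  # background pixel kept as white
                    bb += 1
    # Guard against ground-truth images with no black or no white pixels.
    p_ff = ff / truth_black if truth_black else float("nan")
    p_bb = bb / truth_white if truth_white else float("nan")
    return p_ff, p_bb


def summarize(scores):
    """Mean and population standard deviation of P(f/f) and P(b/b).

    `scores` is a list of (P(f/f), P(b/b)) pairs, one per document image.
    """
    p_ff = [s[0] for s in scores]
    p_bb = [s[1] for s in scores]
    return {
        "P(f/f)": (mean(p_ff), pstdev(p_ff)),
        "P(b/b)": (mean(p_bb), pstdev(p_bb)),
    }


# Toy 2x4 images, invented for illustration only.
truth = [[0, 0, 1, 1],
         [0, 1, 1, 1]]
binary = [[0, 1, 1, 1],
          [0, 1, 1, 0]]
scores = [pixel_agreement(binary, truth)]
```

Under these assumptions, ranking a set of algorithms as described would amount to sorting their `summarize` results by the mean of P(b/b) in decreasing order, and a smaller standard deviation would indicate more uniform quality across the images.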
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science