Page 23 of Document Image Processing
J. Imaging 2018, 4, 27
14. Pun [26]
15. Shanbhag [27]
16. Triangle [28]
17. Wu-Lu [29]
18. Yean-Chang-Chang [30]
19. Intermodes [31]
20. Minimum (variation of [31])
21. Ergina-Local [32]
22. Sauvola [33]
23. Niblack [34]
A ground-truth image for each real-world image is needed to allow a quantitative assessment
of the quality of the final binary image. Only the DIBCO dataset [10] had ground-truth images
available. This makes the assessment of real-world images extremely difficult [35]. All care must
be taken to guarantee the fairness of the process. The ground-truth images for the other datasets were
generated by applying the 23 algorithms above and the bilateral algorithm to all the test images in the
Nabuco [7] and LiveMemory [36] datasets. Visual inspection was used to choose the best binary image
in a blind process, that is, a process in which the people who selected the best image did not know which
algorithm generated it. To increase the degree of fairness and the number of filtering possibilities,
the three component images produced by the Decision Making block were all analyzed. The binary
images chosen using the methodology above went through salt-and-pepper filtering and were used as
ground-truth images for the assessment below. All the processing time figures presented in this paper
were measured on an Intel i7-4510U @ 2.00 GHz x 2 with 8 GB RAM, running Linux Mint 18.2 64-bit. All algorithms were
coded in Java, possibly by their authors.
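The text does not say which salt-and-pepper filter was applied to the chosen binary images. As a rough illustration only, the sketch below assumes a 3x3 median (majority) filter, one common choice for removing isolated noise pixels from a binary image. The function name, the 0/1 pixel encoding, and the border handling are assumptions made for this example, not details taken from the paper.

```python
def median_filter_3x3(img):
    """Apply a 3x3 median (majority) filter to a binary image.

    img is a list of rows; each pixel is 0 (white) or 1 (black).
    Assumption: borders are handled by clamping coordinates to the edge.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            black = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    black += img[yy][xx]
            # The median of 9 binary values is black when at least 5 are black.
            out[y][x] = 1 if black >= 5 else 0
    return out


# A lone black pixel on a white page is treated as noise and removed.
page = [[0] * 5 for _ in range(5)]
page[2][2] = 1
cleaned = median_filter_3x3(page)
```

A filter of this kind also erodes very thin strokes, so whether it suits a given document depends on the stroke width at the scanning resolution.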
3.1. The Nabuco Dataset
The Nabuco bequest encompasses about 6500 letters and postcards written and typed by
Joaquim Nabuco [7], totaling about 30,000 pages. Such documents are of great interest to whoever
studies the history of the Americas, as Nabuco was one of the key figures in the freeing of black slaves
and was the first Brazilian Ambassador to the U.S.A. The documents of Nabuco were digitized
by the second author of this paper and the historians of the Joaquim Nabuco Foundation using
a flatbed scanner at 200 dpi resolution in true color (24 bits per pixel), between 1992 and 1994. Due to
serious storage limitations at the time, images were saved in the JPEG format with 1% loss. The historians
in the project concluded that 150 dpi resolution would suffice to represent all the graphical elements
in the documents, but the 200 dpi resolution was chosen to be compatible with the FAX
devices widely used then. About 200 of the documents in the Nabuco bequest exhibited back-to-front
interference. The 15 document images used in this dataset were chosen for being representative of the
diversity of documents in such a universe.
Table 1 presents the quantitative results obtained for all the documents in this dataset. P(f/f)
stands for the ratio between the number of foreground pixels in the original image mapped onto
black pixels and the number of black pixels in the ground-truth image. Similarly, P(b/b) is the ratio
between the number of background pixels in the original image mapped onto white pixels of the
binary image and the number of white pixels in the ground-truth image. The figures for P(b/b) and
P(f/f) are followed by “±” and the value of the standard deviation. The time corresponds to the mean
processing time taken by the algorithm to process the images in this dataset. The results were ranked
in decreasing order of P(b/b).
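The two ratios defined above can be sketched in a few lines. This is an illustrative reading, not the authors' code: it assumes that "foreground pixels mapped onto black pixels" means pixels that are black in both the algorithm's output and the ground truth, and likewise for white. The function names, the 0/1 pixel encoding, and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

BLACK, WHITE = 1, 0  # assumed pixel encoding for this sketch


def agreement_ratios(binary, truth):
    """Return (P(f/f), P(b/b)) for one output image and its ground truth."""
    ff = bb = truth_black = truth_white = 0
    for row_b, row_t in zip(binary, truth):
        for b, t in zip(row_b, row_t):
            if t == BLACK:
                truth_black += 1
                ff += (b == BLACK)
            else:
                truth_white += 1
                bb += (b == WHITE)
    p_ff = ff / truth_black if truth_black else float("nan")
    p_bb = bb / truth_white if truth_white else float("nan")
    return p_ff, p_bb


def summarize(pairs):
    """Mean and standard deviation of both ratios over (binary, truth) pairs."""
    ratios = [agreement_ratios(b, t) for b, t in pairs]
    ffs = [r[0] for r in ratios]
    bbs = [r[1] for r in ratios]
    return {"P(f/f)": (mean(ffs), pstdev(ffs)),
            "P(b/b)": (mean(bbs), pstdev(bbs))}


def rank_by_pbb(results):
    """Sort algorithm names by mean P(b/b), highest first."""
    return sorted(results, key=lambda name: results[name]["P(b/b)"][0],
                  reverse=True)
```

Calling `summarize` once per algorithm over the dataset and passing the collected dictionary to `rank_by_pbb` would give an ordering of the kind the text describes; the mean processing time would have to be measured separately.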
The results presented in Table 1 show the bilateral filter in third place for this dataset in terms of
image quality; however, its standard deviation is much lower than that of the first two. This implies that its
quality is more stable across the various document images in this dataset. Figure 5 presents the document
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Publisher
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Dimensions
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science