Document Image Processing
Page - 26 -

J. Imaging 2018, 4, 27

4. Conclusions

Historical documents are far more difficult to binarize, as several factors such as paper texture, aging, thickness, translucidity, permeability, the kind of ink, its fluidity, color, aging, etc. may all influence the performance of the algorithms. Besides all that, many historical documents were written or printed on both sides of translucent paper, giving rise to back-to-front interference.

This paper presents a new binarization scheme based on the bilateral filter. Experiments were performed on three datasets of "real world" historical documents, together with twenty-three other binarization algorithms. Image quality and processing time figures were provided, at least for the top 10 algorithms assessed. The results obtained showed that the proposed algorithm yields good quality monochromatic images that may compensate for its high computational cost. This paper provides evidence that no binarization algorithm is an "all-kind-of-document" winner, as the performance of the algorithms varied depending on the specific features of each document.

A much larger test set of about 250,000 synthetic images is currently under development; such a test set will allow much better training of the Decision Making and Image Classifier blocks of the bilateral algorithm presented. The authors are currently attempting to integrate the Decision Making and Image Classifier blocks in such a way as to anticipate the choice of the best component image. This would greatly improve the time performance of the proposed algorithm.

Figure 7. Two documents from the DIBCO dataset: (left-top) original image; (left-bottom) binary image obtained using the bilateral filter, best result (P(f/f) = 97.05, P(b/b) = 99.88); (right-top) original image; (right-bottom) the worst binarization result for the bilateral filter (P(f/f) = 25.93, P(b/b) = 99.99).

The authors of this paper are promoting a paramount research effort to assess the largest possible number of binarization algorithms for scanned documents using over 5.4 million synthetic images in the DIB-Document Image Binarization platform. An image matcher, a more general and complex version of the Decision Making block, is also being developed and trained with that large set of images so that, whenever it is fed a real-world image, it can match it with the most similar synthetic one. Once that match is made, the most suitable binarization algorithms are immediately known. If this paper is accepted, all the test images and algorithms will be included in the DIB platform. The preliminary version of the DIB-Document Image Binarization platform and website is publicly available at https://dib.cin.ufpe.br/.

Acknowledgments: The authors of this paper are grateful to the referees, whose comments much helped in improving the current version of this paper, to those researchers who made the code of their algorithms publicly available for testing and performance analysis, and to the DIBCO team for making their images publicly available. The authors also acknowledge the partial financial support of CNPq and CAPES (Brazilian Government).
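The P(f/f) and P(b/b) values quoted in the Figure 7 caption are not defined on this page. The sketch below assumes one plausible reading: P(f/f) is the percentage of ground-truth foreground pixels that the binarized image also marks as foreground, and P(b/b) is the same for background pixels. The function name `pixel_agreement`, the 1 = foreground / 0 = background encoding, and the toy images are all illustrative assumptions, not taken from the paper.

```python
def pixel_agreement(result, truth):
    """Return (P(f/f), P(b/b)) as percentages.

    Assumed convention (not from the paper): images are lists of rows,
    with 1 = foreground (ink) and 0 = background (paper).
    """
    ff = bb = fg_total = bg_total = 0
    for res_row, truth_row in zip(result, truth):
        for r, t in zip(res_row, truth_row):
            if t == 1:
                fg_total += 1
                ff += (r == 1)   # foreground kept as foreground
            else:
                bg_total += 1
                bb += (r == 0)   # background kept as background
    # Guard against images with no foreground or no background pixels.
    p_ff = 100.0 * ff / fg_total if fg_total else float("nan")
    p_bb = 100.0 * bb / bg_total if bg_total else float("nan")
    return p_ff, p_bb


# Toy 2x4 example: one ink pixel is lost and one paper pixel is turned into ink.
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
result = [[1, 0, 0, 0],
          [1, 1, 0, 1]]
p_ff, p_bb = pixel_agreement(result, truth)
print(p_ff, p_bb)
```

Under this reading, the worst case in Figure 7 (low P(f/f), very high P(b/b)) would correspond to an output that leaves the paper clean but drops most of the ink strokes.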
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science