J. Imaging 2018, 4, 27
4. Conclusions
Historical documents are far more difficult to binarize, as several factors such as paper texture,
aging, thickness, translucency, permeability, the kind of ink, its fluidity, color, aging, etc. may all
influence the performance of the algorithms. Besides all that, many historical documents were written
or printed on both sides of translucent paper, giving rise to back-to-front interference.
This paper presents a new binarization scheme based on the bilateral filter. Experiments were performed
on three datasets of “real-world” historical documents, comparing it with twenty-three other binarization
algorithms. Image quality and processing time figures were provided, at least for the top 10 algorithms assessed.
The results obtained showed that the proposed algorithm yields good-quality monochromatic images
that may compensate for its high computational cost. This paper provides evidence that no binarization
algorithm is an “all-kind-of-document” winner, as the performance of the algorithms varied depending
on the specific features of each document. A much larger test set of about 250,000 synthetic images
is currently under development; such a test set will allow much better training of the Decision
Making and Image Classifier blocks of the bilateral algorithm presented. The authors are currently
attempting to integrate the Decision Making and Image Classifier blocks in such a way as to anticipate
the choice of the best component image. This would greatly improve the time performance of the
proposed algorithm.
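As a rough illustration of the kind of processing summarized above, the following minimal sketch applies a bilateral filter to a small grayscale image and then a fixed global threshold. It is not the authors' pipeline: the Decision Making and Image Classifier blocks are omitted, and the function names, the parameter values (radius, sigma_space, sigma_range), and the threshold of 128 are all assumptions chosen only for this toy example.

```python
import math

def bilateral_filter(img, radius=1, sigma_space=1.0, sigma_range=30.0):
    """Edge-preserving smoothing of a 2-D grayscale image (list of rows, values 0-255)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = img[y][x]
            total = 0.0
            norm = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        # Spatial weight: closer neighbours count more.
                        ws = math.exp(-(dx * dx + dy * dy) / (2 * sigma_space ** 2))
                        # Range weight: neighbours of similar intensity count more,
                        # so strong edges (ink vs. paper) are not blurred away.
                        diff = img[ny][nx] - center
                        wr = math.exp(-(diff * diff) / (2 * sigma_range ** 2))
                        total += ws * wr * img[ny][nx]
                        norm += ws * wr
            out[y][x] = total / norm
    return out

def binarize(img, threshold=128):
    """Map each pixel to 0 (ink) or 255 (paper) using a fixed global threshold (assumed value)."""
    return [[0 if v < threshold else 255 for v in row] for row in img]

# Toy page: a dark stroke (40) on light paper (200), with one pixel of
# faint back-to-front interference (150).
page = [
    [200, 200, 40, 40, 200, 200],
    [200, 150, 40, 40, 200, 200],
    [200, 200, 40, 40, 200, 200],
    [200, 200, 40, 40, 200, 200],
]
smoothed = bilateral_filter(page)
binary = binarize(smoothed)
```

In this toy case the interference pixel is pulled toward the surrounding paper intensity while the stroke edges stay sharp. The nested loops also hint at why the filter is computationally expensive on full-size scans.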
Figure 7. Two documents from the DIBCO dataset: (left-top) original image; (left-bottom) binary image
obtained using the bilateral filter, best result (P(f/f) = 97.05, P(b/b) = 99.88); (right-top) original image;
(right-bottom) the worst binarization result for the bilateral filter (P(f/f) = 25.93, P(b/b) = 99.99).
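The quality figures in the caption can be read with the help of the following minimal sketch. It assumes that P(f/f) is the percentage of ground-truth foreground pixels that remain foreground in the binarized image, and that P(b/b) is the analogous percentage for background pixels; this page does not define the measures, so that reading, the function name, and the convention that 0 encodes ink are assumptions.

```python
def classification_rates(truth, result, foreground=0):
    """Return (P(f/f), P(b/b)) as percentages.

    truth and result are 2-D lists of equal shape; pixels equal to
    `foreground` are treated as ink, all others as background.
    """
    ff = f_total = bb = b_total = 0
    for t_row, r_row in zip(truth, result):
        for t, r in zip(t_row, r_row):
            if t == foreground:
                f_total += 1
                ff += (r == foreground)   # foreground kept as foreground
            else:
                b_total += 1
                bb += (r != foreground)   # background kept as background
    p_ff = 100.0 * ff / f_total if f_total else float("nan")
    p_bb = 100.0 * bb / b_total if b_total else float("nan")
    return p_ff, p_bb

# Toy example: half of the ink is lost, but no background pixel turns into ink.
truth = [[0, 0, 255, 255],
         [0, 0, 255, 255]]
result = [[0, 255, 255, 255],
          [0, 255, 255, 255]]
p_ff, p_bb = classification_rates(truth, result)
```

Under this reading, a low P(f/f) combined with a high P(b/b), as in the worst case of the figure, would indicate that strokes were erased while the background stayed clean.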
The authors of this paper are promoting a major research effort to assess the largest possible
number of binarization algorithms for scanned documents using over 5.4 million synthetic images
in the DIB-Document Image Binarization platform. An image matcher, a more general and complex
version of the Decision Making block, is also being developed and trained with that large set of images,
so that, whenever it is fed a real-world image, it is able to match it with the most similar synthetic
one. Once that match is made, the most suitable binarization algorithms are immediately known.
If this paper is accepted, all the test images and algorithms will be included in the DIB platform.
The preliminary version of the DIB-Document Image Binarization platform and website is publicly
available at https://dib.cin.ufpe.br/.
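The image-matcher idea described above can be pictured as a nearest-neighbour lookup. The sketch below is purely illustrative and is not the matcher under development: the two-element feature vectors, the catalogue entries, and the algorithm names are hypothetical placeholders, and Euclidean distance is an assumed similarity measure.

```python
import math

# Hypothetical catalogue: a feature vector for each synthetic image, paired
# with the binarization algorithm that scored best on it (placeholder names).
catalogue = [
    ((0.80, 0.10), "algorithm_A"),
    ((0.55, 0.35), "algorithm_B"),
    ((0.30, 0.60), "algorithm_C"),
]

def match(features, catalogue):
    """Return the catalogue entry whose feature vector is closest (Euclidean distance)."""
    return min(catalogue, key=lambda entry: math.dist(features, entry[0]))

# Features of a real-world image (hypothetical values), matched to the most
# similar synthetic image to look up a suggested algorithm.
nearest_features, suggested = match((0.58, 0.30), catalogue)
```

With millions of synthetic images, a linear scan like this would likely be replaced by an indexed or learned matcher; the sketch only shows the lookup principle.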
Acknowledgments: The authors of this paper are grateful to the referees, whose comments greatly helped in
improving the current version of this paper, to the researchers who made the code of their algorithms publicly
available for testing and performance analysis, and to the DIBCO team for making their images publicly available.
The authors also acknowledge the partial financial support of CNPq and CAPES (Brazilian Government).