J. Imaging 2018, 4, 27
The background-background probability is a function that needs to be optimized in the
decision-making block, mapping background pixels (paper) from the original image onto white
pixels of the binary image. It depends on all the parameters of the original image: texture, strength of
the back-to-front interference (simulated by the coefficient α), paper translucidity, etc., for each RGB
channel. Thus, one can represent this dependence as:
P(b/b) = f(α, R, G, B). (1)
The optimal threshold t_c* for each channel, where the index c can be R, G or B, is calculated in
the decision-making block by maximizing P(b/b):
t_c* = max P(b/b), (2)
subject to a given criterion P(f/f) ≥ M. The criterion used here was M = 97%, that is, at most 3% of
the foreground pixels may be incorrectly mapped. During the training phase, the best t_c* is
chosen from the three channels as the one that maximizes P(b/b) for each of the images in the training
set. The matrix of co-occurrence probability is calculated and the decision maker chooses the best
binary image. The decision-making block was trained with 32,000 synthetic images in such a way that,
given a real image to be binarized, it finds the optimal threshold parameters.
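The constrained search described above can be illustrated with a minimal sketch. It assumes a plain global threshold per channel, 8-bit pixel values, a ground-truth background mask (available for synthetic images), and that pixels brighter than the threshold are mapped to white. The function names `best_threshold` and `best_channel_threshold` are hypothetical and are not taken from the paper, whose decision-making block may work differently.

```python
import numpy as np

def best_threshold(channel, is_background, min_ff=0.97):
    """Sweep global thresholds on one channel.

    Returns (t, P(b/b)) for the threshold that maximizes P(b/b) subject to
    P(f/f) >= min_ff, or None if no threshold meets the criterion.
    Assumes both background and foreground pixels are present.
    """
    channel = np.asarray(channel)
    is_background = np.asarray(is_background, dtype=bool)
    n_back = is_background.sum()
    n_fore = (~is_background).sum()
    best = None
    for t in range(256):
        white = channel > t  # pixels mapped to white (paper) in the binary image
        p_bb = (white & is_background).sum() / n_back     # background kept as background
        p_ff = (~white & ~is_background).sum() / n_fore   # foreground kept as foreground
        if p_ff >= min_ff and (best is None or p_bb > best[1]):
            best = (t, p_bb)
    return best

def best_channel_threshold(rgb, is_background, min_ff=0.97):
    """Pick the channel (R, G or B) whose admissible threshold gives the highest P(b/b)."""
    results = {}
    for i, name in enumerate("RGB"):
        found = best_threshold(rgb[..., i], is_background, min_ff)
        if found is not None:
            results[name] = found
    if not results:
        return None
    name = max(results, key=lambda k: results[k][1])
    return name, results[name][0], results[name][1]
```

The pair P(b/b), P(f/f) computed inside the loop is the diagonal of the co-occurrence probability matrix; the off-diagonal entries are their complements.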
2.2. Generating Synthetic Images
The Decision-Making Block needs training to “learn” about the optimal threshold parameters and
the value of the kernel to be used in the bilateral filter. Such training must be done using controlled
images which are synthesized to mimic the different degrees of back-to-front interference, paper aging,
paper translucidity, etc. Figure 4 presents the block diagram for the generation of synthetic images.
Two binary images of documents of different nature (typed, handwritten with different pens, printed,
etc.) are taken: F (front) and V (verso, i.e., back). The front image is blurred with a weak Gaussian filter to
simulate the digitization noise [1], the hues that appear after document scanning.
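The blurring step for the front image can be sketched as follows. This is only an illustration: the text says the Gaussian filter is "weak" but gives no parameters here, so `sigma=1.0`, the edge padding, and the function names `gaussian_kernel` and `blur_front` are assumptions. The image is assumed to be a 2-D array with 0 for ink and 255 for paper.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at about 3 sigma and normalized to sum to 1."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def blur_front(front, sigma=1.0):
    """Blur a binary front image (0 = ink, 255 = paper) with a separable Gaussian.

    sigma is an assumed value, not one stated in the source.
    Returns a float array with the same shape as the input.
    """
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    # Replicate border pixels so the output keeps the input size.
    padded = np.pad(np.asarray(front, dtype=float), r, mode="edge")
    # Separable filtering: convolve every row, then every column.
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)
```

After blurring, stroke edges take intermediate gray values instead of pure 0 or 255, which is the effect the paragraph attributes to scanning.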
Figure 4. Block diagram of the scheme for the generation of synthetic images for the Decision-Making Block.
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Publisher
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Dimensions
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science