J. Imaging 2018, 4, 39
2.2. Histogram of Oriented Gradients (HOG)
The HOG descriptor [31], first proposed for pedestrian detection in static images, counts occurrences
of gradient orientation in localized portions of an image. The essential idea behind the HOG
descriptor is that local object appearance and shape within an image can be described by the
distribution of intensity gradients or edge directions. First, the magnitude and direction of every
pixel of each word image are calculated. Next, each pixel is assigned to a category according to its
direction; these categories are known as orientation bins. Then, the word image is divided into n
(here n = 10) connected regions, called cells, and for each cell a histogram of gradient directions
or edge orientations is computed over the pixels within the cell. The combination of these histograms
then represents the descriptor. Since the number of orientation bins is taken as 8 for the present
work, an 80-D (i.e., 10 × 8) feature vector is extracted using the HOG descriptor [30]. The magnitude
and direction of each pixel of a sample handwritten Telugu word image are shown in Figure 4.
Figure 4. Illustration of: (a) a handwritten Telugu word image, (b) its magnitude part and (c) its
direction part.
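The HOG steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `hog_descriptor` is hypothetical, the 10 cells are assumed to be vertical strips of roughly equal width (the text does not say how the cells are laid out), directions are assumed to be signed (0° to 360°), and votes are assumed to be weighted by gradient magnitude. Only the Python standard library is used, to keep the sketch self-contained.

```python
import math

def hog_descriptor(image, n_cells=10, n_bins=8):
    """Return an (n_cells * n_bins)-D HOG-style vector for a grayscale image.

    `image` is a list of rows of numbers. Assumption: the cells are
    n_cells vertical strips; the source does not specify the cell layout.
    """
    h, w = len(image), len(image[0])
    hist = [[0.0] * n_bins for _ in range(n_cells)]
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue  # flat pixels cast no vote
            # Direction in [0, 2*pi), assigned to one of n_bins orientation bins.
            ang = math.atan2(gy, gx) % (2 * math.pi)
            b = min(int(ang / (2 * math.pi) * n_bins), n_bins - 1)
            cell = min(x * n_cells // w, n_cells - 1)
            hist[cell][b] += mag  # magnitude-weighted vote (assumption)
    # Concatenate the per-cell histograms and L2-normalise the result.
    vec = [v for cell_hist in hist for v in cell_hist]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec

# Example: a 4 x 20 image containing a single vertical edge.
img = [[0] * 10 + [1] * 10 for _ in range(4)]
vec = hog_descriptor(img)
print(len(vec))  # 80
```

With 10 cells and 8 bins, the concatenated histograms give the 80-D vector mentioned in the text. The final normalisation is a common choice rather than something the source states.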
2.3. Modified Log-Gabor Filter Transform (MLG Transform)
Modified log-Gabor filter transform-based features, proposed in Reference [20], performed well in
the script classification task and are therefore also chosen as one of the feature descriptors of
our proposed methodology to identify the script of the word images. In order to preserve the
spatial information, a Windowed Fourier Transform (WFT) is considered in the present work. WFT
involves multiplying the image by the window function and then applying the Fourier transform to the
result. WFT is basically a convolution of the image with the low-pass filter. Since both spatial and
frequency information are preferred for texture analysis, the present work tries to achieve a good
trade-off between the two. Gabor transforms use a Gaussian function as the optimally concentrated
function in the spatial as well as in the frequency domain [32]. Owing to the convolution theorem,
the filter interpretation of the Gabor transform allows the efficient computation of the Gabor
coefficients by multiplying the Fourier-transformed image by the Fourier transform of the Gabor
filter. The inverse Fourier transform is then applied to the resultant vector to obtain the output
filtered images.
The images, after low-pass filtering, are passed as input to a function that computes the Gabor
energy feature from them. The input image is then passed to a function that yields a Gabor array,
which is the array equivalent of the image after Gabor filtering. The function displays the image
equivalent of the magnitude and the real part of the Gabor array pixels.
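The frequency-domain filtering described above can be illustrated with a small sketch: multiply the Fourier transform of the image by the Fourier transform of the filter, then apply the inverse transform. The code below is an illustration under assumptions, not the method of Reference [20]: it uses a plain complex Gabor kernel (not the modified log-Gabor filter), a naive DFT from the standard library so that it stays self-contained, and hypothetical names and parameter values (`gabor_kernel`, `filter_via_fft`, `wavelength`, `sigma`). It assumes a square image of the same size as the kernel, and the filtering is circular (wrap-around).

```python
import cmath
import math

def dft(seq, inverse=False):
    """Naive 1-D discrete Fourier transform (O(n^2)); fine for tiny inputs."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = [sum(seq[k] * cmath.exp(sign * 2j * math.pi * j * k / n)
               for k in range(n)) for j in range(n)]
    return [v / n for v in out] if inverse else out

def dft2(mat, inverse=False):
    """2-D DFT applied separably: rows first, then columns."""
    rows = [dft(r, inverse) for r in mat]
    cols = [dft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid."""
    kernel = []
    for y in range(size):
        row = []
        for x in range(size):
            # Wrapped offsets so the kernel is centred at index (0, 0).
            dx = x if x <= size // 2 else x - size
            dy = y if y <= size // 2 else y - size
            xr = dx * math.cos(theta) + dy * math.sin(theta)
            env = math.exp(-(dx * dx + dy * dy) / (2 * sigma ** 2))
            row.append(env * cmath.exp(2j * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def filter_via_fft(image, kernel):
    """Multiply the two spectra element-wise, then invert the transform."""
    fi, fk = dft2(image), dft2(kernel)
    prod = [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(fi, fk)]
    return dft2(prod, inverse=True)

def circular_convolve(image, kernel):
    """Direct circular convolution, used only to check the FFT route."""
    n = len(image)
    return [[sum(image[(y - j) % n][(x - i) % n] * kernel[j][i]
                 for j in range(n) for i in range(n))
             for x in range(n)] for y in range(n)]

# Example on a tiny synthetic 6 x 6 "image".
image = [[(3 * x + 5 * y) % 7 for x in range(6)] for y in range(6)]
kernel = gabor_kernel(6, wavelength=3.0, theta=math.pi / 6, sigma=1.5)
gabor_array = filter_via_fft(image, kernel)
magnitude = [[abs(v) for v in row] for row in gabor_array]
real_part = [[v.real for v in row] for row in gabor_array]
```

Here `gabor_array` plays the role of the Gabor array mentioned in the text, and `magnitude` and `real_part` are the two views of it that the text says are displayed. In practice an FFT routine from a numerical library would replace the naive `dft`.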
For the present work, both energy and entropy features [33] based on the Modified log-Gabor filter
transform have been extracted for 5 scales (1, 2, 3, 4 and 5) and 6 orientations (0°, 30°, 60°, 90°,
120° and 150°) to capture complementary information found in different script word images. Here,
each filter is convolved with the input image to obtain 60 different representations (response
matrices) for a given input image. Figure 5 shows output images formed after the application of the
Modified log-Gabor filter transform for a sample handwritten Bangla word image.
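As a rough sketch of how energy and entropy values might be collected over the filter bank, the code below computes both for each (scale, orientation) response and concatenates them. The definitions are assumptions, since the formulas of Reference [33] are not reproduced here: energy is taken as the sum of squared magnitudes, and entropy as the Shannon entropy of the normalised squared magnitudes. The names `energy`, `entropy` and `mlg_feature_vector` are hypothetical, and the placeholder responses stand in for real filter outputs. Five scales and six orientations give 30 filters; one energy and one entropy value per filter is one way to arrive at 60 values, though the source does not spell out this accounting.

```python
import math

SCALES = (1, 2, 3, 4, 5)
ORIENTATIONS_DEG = (0, 30, 60, 90, 120, 150)

def energy(response):
    """Sum of squared magnitudes of a (possibly complex) response matrix."""
    return sum(abs(v) ** 2 for row in response for v in row)

def entropy(response, eps=1e-12):
    """Shannon entropy (bits) of the normalised squared magnitudes."""
    power = [abs(v) ** 2 for row in response for v in row]
    total = sum(power)
    if total <= eps:
        return 0.0  # an all-zero response carries no information
    return -sum((p / total) * math.log2(p / total) for p in power if p > 0)

def mlg_feature_vector(responses):
    """Flatten {(scale, orientation): response} into [energy, entropy, ...]."""
    feats = []
    for key in sorted(responses):
        feats.extend([energy(responses[key]), entropy(responses[key])])
    return feats

# Placeholder responses (assumption): real ones would come from the filter bank.
responses = {(s, o): [[1, 1], [1, 1]] for s in SCALES for o in ORIENTATIONS_DEG}
features = mlg_feature_vector(responses)
print(len(features))  # 60
```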
Document Image Processing
- Title: Document Image Processing
- Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
- Editor: MDPI
- Location: Basel
- Date: 2018
- Language: German
- License: CC BY-NC-ND 4.0
- ISBN: 978-3-03897-106-1
- Size: 17.0 x 24.4 cm
- Pages: 216
- Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category: Computer Science