J. Imaging 2018, 4, 6
Figure 3. DCT-based feature extraction.
3.2. Discrete Cosine Transform 4-Blocks (DCT_4B)
In this feature set, we first find the centre of gravity (COG) of the image and use it as the starting point. To calculate the centre of gravity, the horizontal and vertical centres must be determined by the following equations:
$$C_x = \frac{M(1,0)}{M(0,0)} \tag{2}$$
$$C_y = \frac{M(0,1)}{M(0,0)} \tag{3}$$
where Cx is the horizontal centre and Cy the vertical centre of gravity, and M(p,q) are the geometrical moments of rank p+q:
$$M(p,q) = \sum_{x} \sum_{y} \left(\frac{x}{\text{width}}\right)^{p} \left(\frac{y}{\text{height}}\right)^{q} f(x,y). \tag{4}$$
Here x and y index the pixels of the word image. Dividing x and y by the width and the height of the image, respectively, normalizes the geometrical moments and makes them invariant to the size of the word [18]. This method uses the COG and DCT features at the same time: the first serves as an auxiliary feature to divide the image into four parts, and the second, the DCT, is applied to each part as a whole.
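As an illustration of Equations (2)–(4), the following minimal sketch computes the normalized centre of gravity of a small image. It assumes that f(x, y) is the pixel value (for example 1 for ink and 0 for background), that the image is a list of rows, and that coordinates start at 0; the function name `centre_of_gravity` is hypothetical and does not come from the paper.

```python
def centre_of_gravity(image):
    """Return (Cx, Cy) from the normalized geometrical moments.

    `image` is a list of rows; image[y][x] plays the role of f(x, y).
    Zero-based pixel coordinates are an assumption of this sketch.
    """
    height = len(image)
    width = len(image[0])

    def moment(p, q):
        # M(p, q) = sum_x sum_y (x / width)^p * (y / height)^q * f(x, y)
        return sum(
            (x / width) ** p * (y / height) ** q * image[y][x]
            for y in range(height)
            for x in range(width)
        )

    m00 = moment(0, 0)
    if m00 == 0:
        raise ValueError("image has no non-zero pixels")
    return moment(1, 0) / m00, moment(0, 1) / m00
```

Because x and y are divided by the width and height, the returned values are fractions of the image size, so they stay the same when the word image is scaled.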
This feature set is extracted and implemented as follows:
1. Calculate the COG of the word image and use it as the starting point, as explained in Equations (1)–(4).
2. Use the vertical and horizontal COG to divide the word image into four regions.
3. Apply the DCT to each part of the word image.
4. Perform the zigzag operation on the DCT coefficients of each image part to get the first N/4 values, which contain most of the word information for that part.
5. Repeat Steps 3 and 4 sequentially for all the word parts, and then combine the results to form the feature vector of the word image.
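The steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the DCT is taken to be the orthonormal 2-D DCT-II, the zigzag scan follows the JPEG-style diagonal order, the normalized COG is converted to a split row and column by rounding, and regions with fewer than N/4 coefficients are zero-padded. The names `dct2`, `zigzag` and `dct_4b_features` are hypothetical.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a rectangular block (list of rows)."""
    rows, cols = len(block), len(block[0])

    def alpha(k, n):
        return math.sqrt((1 if k == 0 else 2) / n)

    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for i in range(rows):
                for j in range(cols):
                    total += (block[i][j]
                              * math.cos(math.pi * (2 * i + 1) * u / (2 * rows))
                              * math.cos(math.pi * (2 * j + 1) * v / (2 * cols)))
            out[u][v] = alpha(u, rows) * alpha(v, cols) * total
    return out


def zigzag(matrix):
    """Read a matrix along its anti-diagonals, alternating direction."""
    rows, cols = len(matrix), len(matrix[0])
    out = []
    for s in range(rows + cols - 1):
        diag = [(i, s - i) for i in range(rows) if 0 <= s - i < cols]
        if s % 2 == 0:
            diag.reverse()
        out.extend(matrix[i][j] for i, j in diag)
    return out


def dct_4b_features(image, cx, cy, n_features):
    """Split at the COG, apply DCT + zigzag per region, keep N/4 values each."""
    height, width = len(image), len(image[0])
    if height < 2 or width < 2:
        raise ValueError("image must be at least 2 x 2 pixels")
    # Assumption: the normalized COG maps to a pixel split by rounding,
    # clamped so that none of the four regions is empty.
    col = min(max(int(round(cx * width)), 1), width - 1)
    row = min(max(int(round(cy * height)), 1), height - 1)
    regions = [
        [r[:col] for r in image[:row]],  # top-left
        [r[col:] for r in image[:row]],  # top-right
        [r[:col] for r in image[row:]],  # bottom-left
        [r[col:] for r in image[row:]],  # bottom-right
    ]
    per_region = n_features // 4
    features = []
    for region in regions:
        kept = zigzag(dct2(region))[:per_region]
        kept += [0.0] * (per_region - len(kept))  # zero-pad (assumption)
        features.extend(kept)
    return features
```

The direct quadruple loop in `dct2` is chosen for readability; for real word images a fast DCT routine from a numerical library would be the practical choice.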
3.3. Hybrid DCT and DCT_4B (DCT+DCT_4B)
This feature combines the two features DCT and DCT_4B.
4. Lexical Reduction and Clustering
To reduce the computation time for searching the whole lexicon in the recognition phase, words of similar shape are clustered together. The word search is performed in two steps. In the first step, the word cluster or the nearest n clusters are determined; then the best matching word inside that cluster is selected as the recognition output. For word clustering, we used the LBG algorithm [19] to cluster the words in each group according to the closeness of the word shapes from the point of view of the features used. For the clustering process, we used the same DCT and DCT_4B features that we use for the word recognition phase.
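The two-step search described above can be sketched as follows, assuming the clusters and their centroids have already been produced (for example by LBG). Euclidean distance is an assumption of this sketch, since this passage does not state the matching metric, and the name `two_step_search` and its data layout are hypothetical.

```python
import math


def two_step_search(query, centroids, clusters, n_nearest=1):
    """Return the best matching word for a query feature vector.

    centroids[i] is the centroid of clusters[i]; clusters[i] is a list
    of (word, feature_vector) pairs. Euclidean distance is assumed.
    """
    # Step 1: rank clusters by centroid distance and keep the nearest n.
    order = sorted(range(len(centroids)),
                   key=lambda i: math.dist(query, centroids[i]))
    candidates = [entry for i in order[:n_nearest] for entry in clusters[i]]
    if not candidates:
        return None
    # Step 2: pick the closest word among the selected clusters only.
    word, _ = min(candidates, key=lambda entry: math.dist(query, entry[1]))
    return word
```

Only the words in the selected clusters are compared with the query, which is where the saving over a full lexicon search comes from; raising `n_nearest` trades speed for a lower chance of missing the correct cluster.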
To measure the accuracy of the clustering step, and also of the lexical reduction, we used a clustering accuracy measure that counts how often the test word lies within the selected cluster or clusters, relative to the number of tested words. For a vocabulary of around 356,000 words in the Simplified Arabic font (14 pt.), we tested the clustering accuracy using a test set of 3465 words and a codebook size
Title: Document Image Processing
Authors: Ergina Kavallieratou, Laurence Likforman-Sulem
Editor: MDPI
Location: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Size: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Computer Science