J. Imaging 2018, 4, 39
Table 3. Classification results for Elliptical feature set with MLP Classifier.
Class    A    B    C    D    E    F    G    H    I    J    K    L
A      355   29   49   48    0   25    3    4   42   10    6   29
B        2  550    0    7    0    1   32    0    0    0    0    8
C       27    0  479    8    1   19    2   11   31    1    8   13
D       32    9    0  514    0   13   23    0    3    2    4    0
E       66    1    2    1  441   42    4    7   10   20    4    2
F       96    3    6   15    6  397   16    4   19   12   14   12
G       55   13    7   54    6   17  402    1    3   19   22    1
H       25    0    2    3    0   26    0  491   28   10    4   11
I        7    0   23    3   33    8    7    3  493   10    4    9
J        0    0    1    0   16    5    2    2    9  553    6    6
K        2    0   16    7    1    7   12    0    2    7  546    0
L        8   22    1    0    6    6   20   13    6    9    1  508
Now, the confidence values provided to the classes for every input by the classifiers on the
three sets of features form the input for the classifier combination procedures. The confusion matrix
resulting from the majority voting procedure is presented in Table 4. An overall accuracy of 95.6% is
achieved on this dataset containing 7200 samples divided equally among the 12 script classes.
Devanagari has the lowest accuracy and is confused with Telugu, whereas high accuracies
are obtained for Manipuri, Odia and Bangla.
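As an illustration of majority voting over classifier outputs, here is a minimal Python sketch. It assumes each classifier supplies one confidence value per class and votes for its top-scoring class. The function name `majority_vote` and the tie-breaking rule (lowest class index wins) are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def majority_vote(confidences):
    """Return the class index chosen by most classifiers.

    confidences: list with one entry per classifier, each a list of
    per-class confidence values (hypothetical input layout).
    """
    # Each classifier votes for the class it scores highest.
    votes = [max(range(len(c)), key=c.__getitem__) for c in confidences]
    counts = Counter(votes)
    best = max(counts.values())
    # Assumed tie-break: lowest class index among the most-voted classes.
    return min(cls for cls, n in counts.items() if n == best)


# Three classifiers, three classes: two of them prefer class 1.
example = [
    [0.1, 0.7, 0.2],
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
]
winner = majority_vote(example)
```

Because only the top choice of each classifier is used, the magnitudes of the confidence values have no influence on the decision.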
The Borda count algorithm gives an accuracy of 93.5%, which is an increase of 2.1% over the best
performing individual classifier. It provides the highest recognition rate for Devanagari among all the
combination schemes and good accuracies for other popular scripts like Bangla and Odia, and hence
can be the preferred choice for wide usage. The trainable version of the algorithm, with weights based
on the overall accuracy of the classifiers, improves the results further. The increase is 2.9%, with satisfactory
results for scripts like Telugu, Kannada and Urdu. The accuracy for the Gurumukhi script remains low
irrespective of the weights. The results are presented in Tables 5 and 6.
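The following Python sketch shows one common formulation of the Borda count and a weighted variant. It assumes each classifier ranks the classes by confidence and awards 0 points to its lowest-ranked class up to n-1 points to its highest-ranked one. The name `borda_count`, the point scheme and the example weights are assumptions for illustration; the paper's exact scoring and weight values are not reproduced here.

```python
def borda_count(confidences, weights=None):
    """Combine classifier rankings with a (optionally weighted) Borda count.

    confidences: list of per-classifier lists of class confidences.
    weights: optional per-classifier weights, e.g. overall accuracies
             (hypothetical values in this sketch).
    Returns (winning class index, list of Borda scores).
    """
    n_classes = len(confidences[0])
    if weights is None:
        weights = [1.0] * len(confidences)
    scores = [0.0] * n_classes
    for conf, w in zip(confidences, weights):
        # Sort classes from lowest to highest confidence; the position
        # in this order is the number of points the class receives.
        order = sorted(range(n_classes), key=lambda k: conf[k])
        for points, cls in enumerate(order):
            scores[cls] += w * points
    winner = max(range(n_classes), key=lambda k: scores[k])
    return winner, scores


example = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.5, 0.3],
]
plain_winner, plain_scores = borda_count(example)
# Giving the first classifier a much larger weight changes the outcome.
weighted_winner, weighted_scores = borda_count(example, weights=[5.0, 1.0, 1.0])
```

In the unweighted case every classifier contributes equally, while in the weighted case a more trusted classifier can override the others, which is the idea behind the trainable version.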
The simple rules at the measurement level to combine the decisions provide good results in the
present work. The sum rule attains an accuracy of 97.76% with near-perfect recognition
for Urdu, Gurumukhi and Roman. The product rule and max rule have accuracies of 95.73% and
94.60% respectively. The highest accuracy is found for the Odia script, whereas the product rule suffers in the case of
Gurumukhi and the max rule in the case of Devanagari. The results for the elementary rules of combination are
tabulated in Tables 7–9.
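The three elementary rules can be sketched in a few lines of Python. The sketch assumes each classifier outputs one confidence value per class and that the fused score of a class is the sum, product or maximum of its values across classifiers. The function name `combine` and the example values are illustrative assumptions.

```python
import math


def combine(confidences, rule="sum"):
    """Fuse per-class confidences from several classifiers.

    confidences: list of per-classifier lists of class confidences.
    rule: "sum", "product" or "max".
    Returns (winning class index, list of fused scores).
    """
    # Regroup so that each tuple holds all classifier scores for one class.
    per_class = list(zip(*confidences))
    if rule == "sum":
        fused = [sum(v) for v in per_class]
    elif rule == "product":
        fused = [math.prod(v) for v in per_class]
    elif rule == "max":
        fused = [max(v) for v in per_class]
    else:
        raise ValueError(f"unknown rule: {rule}")
    return fused.index(max(fused)), fused


example = [
    [0.6, 0.4, 0.0],
    [0.1, 0.5, 0.4],
    [0.2, 0.4, 0.4],
]
sum_winner, _ = combine(example, "sum")
product_winner, product_scores = combine(example, "product")
max_winner, _ = combine(example, "max")
```

In this example the sum and product rules agree on class 1, while the max rule follows the single most confident classifier and picks class 0. The product rule also shows its veto behaviour: one zero confidence drives the fused score of class 2 to zero.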
The sum rule outperforms all other rule-based combination approaches in this work and corroborates the
results presented by Kittler et al. [38] by being less prone to noise and unclean data.
The DS theory results combine the classifier outputs two at a time and then all three together. The class-wise
performance based BPA, which outperforms the global performance based BPA, has been implemented
for the multi-classifier combination using the DS theory [45]. The rule applied for this process is
quasi-associative, and hence the result of combining two sources cannot be combined with the third.
The rule has to be extended to include all three sources together. Results for the combination of the
classifier results on HOG and Elliptical features, MLG and Elliptical features, and HOG and MLG
features are presented in Tables 10, 11 and 12 respectively. The combination result including all
three sources of information is given in Table 13.
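To make the idea of evidence combination concrete, the sketch below applies the classical Dempster rule to sources whose mass is spread over single classes plus the full frame (total ignorance). This is an assumption chosen for illustration: it is not the quasi-associative rule of [45], and the helper `make_bpa` uses a single global reliability value rather than the class-wise performance based BPA described above. All sources are combined in one step, so the same function serves two or three sources.

```python
import math


def make_bpa(confidences, reliability):
    """Hypothetical BPA: scale normalized confidences by a reliability
    in [0, 1] and assign the remaining mass to the full frame."""
    total = sum(confidences)
    masses = [reliability * c / total for c in confidences]
    return masses, 1.0 - reliability


def dempster_combine(sources):
    """Combine BPAs with Dempster's rule.

    sources: list of (singleton_masses, frame_mass) pairs, each summing to 1.
    Returns (combined singleton masses, combined frame mass).
    """
    n_classes = len(sources[0][0])
    # The frame survives only if every source puts its mass on the frame.
    frame = math.prod(t for _, t in sources)
    # Class k survives if every source picks either k or the frame,
    # excluding the case where all sources pick the frame.
    raw = [
        math.prod(m[k] + t for m, t in sources) - frame
        for k in range(n_classes)
    ]
    agreement = sum(raw) + frame  # equals 1 minus the conflicting mass
    if agreement <= 0:
        raise ValueError("sources are in total conflict")
    return [r / agreement for r in raw], frame / agreement


s1 = ([0.6, 0.2], 0.2)
s2 = ([0.5, 0.3], 0.2)
combined, combined_frame = dempster_combine([s1, s2])
```

Here the conflicting mass is 0.6 × 0.3 + 0.2 × 0.5 = 0.28, so the surviving masses are renormalized by 0.72 and class 0 ends up with more support than either source gave it alone.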
There is no improvement shown by the combination of the results from the MLG and HOG feature
sets. But when the Elliptical feature set is involved in the combination process, there is much
improvement over the participating classifiers. Overall accuracies of 91.2% and 97.04% are achieved
by combining sources having 78.1% and 79.4% accuracies and 91.4% and 79.4% accuracies respectively.
So, improvements of 6% and 10% are found by applying the DS theory of evidence. Combining all
three, an accuracy of 95.64%, more than 4% over the better performing classifier, is seen. In both the
schemes all the script classes have accuracies over 90%, with almost 100% accuracy for certain
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Publisher
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Dimensions
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer science