Document Image Processing
Page 151

J. Imaging 2018, 4, 39

need to deal with is the non-uniformity of the shape and size of the characters written by different writers. Along with these, problems such as skew and slant are commonly seen in handwritten documents. Even the paper and ink quality makes things more difficult. Apart from the intrinsic complexities of handwriting, similarities among characters belonging to different scripts add to the challenges of script recognition from handwritten document images.

It is worth mentioning that script recognition is usually performed at page, text-line or word level. In this paper, it is done at word level for two reasons: (a) feature extraction at word level is less time-consuming than at page or text-line level, and (b) a single document page or a single text line sometimes contains multiple scripts. In that case, word-level script identification is appropriate.

Script recognition articles for handwritten documents are relatively limited in comparison to those for printed documents. Ubul et al. [2] comprehensively showed the state-of-the-art performance results for different identification, feature extraction and classification methodologies involved in the process. Recently, Singh et al. [1] provided a survey considering various feature extraction and classification techniques associated with the offline script identification of the Indic scripts. Spitz [3] proposed a method for distinguishing between Asian and European languages by analysing the connected components. Tan et al. [4] developed a method based on texture analysis for automatic script identification from document images using multiple channel (Gabor) filters and gray level co-occurrence matrices (GLCM) for seven languages: Chinese, English, Greek, Korean, Malayalam, Persian and Russian. Hochberg et al. [5,6] described an algorithm for script and language identification from handwritten document images using statistical features based on connected component analysis. Wood et al. [7] demonstrated a projection profile method to determine Roman, Russian, Arabic, Korean and Chinese characters. Chaudhuri et al. [8] discussed an OCR system to read two Indian languages, viz., Bangla and Devanagari (Hindi). Pal et al. [9] proposed an algorithm for word-wise script identification from documents containing English, Devanagari and Telugu text, based on conventional and water reservoir features. Chaudhury et al. [10] proposed a method for identification of Indian languages by combining Gabor filter based techniques and a direction distance histogram classifier for Hindi, English, Malayalam, Bengali, Telugu and Urdu. Some analysis of the variability involved in the multi-script signature recognition problem as compared to the single-script scenario is discussed in [11,12].

Various classification algorithms are applied to different pattern recognition problems, and the same applies to the script recognition problem. To date, different classifiers have been used for Indic script recognition, such as k-Nearest Neighbours (k-NN) [13,14], Linear Discriminant Analysis (LDA) [15], Neural Networks (NN) [15,16], Support Vector Machine (SVM) [16,17], tree-based classifiers [18,19], Simple Logistic [20] and MLP [21,22]. Although good results have already been achieved in this pattern recognition task, it is still hard to achieve acceptable accuracy with a single classifier. Studies show that the fusion of multiple classifiers can be a viable way to get better classification results, as the error accumulated by any single classifier is generally compensated using information from the other classifiers. The reason is that different classifiers may offer complementary information about the patterns under consideration. Based on this fact, researchers have long focused on devising algorithms for combining classifiers intelligently, so that the combination achieves better results than any of the individual classifiers used. The key idea is that instead of relying on a single decision maker, all the designs or their subsets are applied to the decision making, combining their individual beliefs in order to come up with a consensus decision. This fact motivates many researchers to apply classifier combination methods to different pattern recognition problems. The popular methodologies for classifier combination include: Majority Voting [23,24], Subset-combining and re-ranking approach [25], Statistical model [26], Bayesian Belief Integration [27], Combination based on DS theory of evidence [27,28] and Neural Network combinator [29].

To date, however, the classifier combination approach has not been tested much for the script recognition problem, either handwritten or printed, though it has enormous potential. To bridge this research gap, this paper applies different classifier combination techniques in the field of Indic script recognition.
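
To make the idea of a consensus decision concrete, the simplest rule in the list above, majority voting, can be sketched as follows. This is only an illustrative sketch and not the combination scheme used in the paper: the function names, the example labels and the tie-breaking rule (the earliest-listed classifier wins a tie) are assumptions introduced here, and the vote is a plurality vote, so the winning label does not need more than half of the votes.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers for one word image.

    `predictions` holds one label per classifier. Ties are broken in favour
    of the label that appears first in the list (an assumption of this sketch).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def combine(per_classifier_outputs):
    """Apply majority_vote word by word.

    `per_classifier_outputs` is a list with one inner list of labels per
    classifier; all inner lists cover the same word images in the same order.
    """
    return [majority_vote(list(votes)) for votes in zip(*per_classifier_outputs)]


# Hypothetical outputs of three classifiers on four word images.
knn = ["Bangla", "Devanagari", "Telugu", "Roman"]
svm = ["Bangla", "Roman", "Telugu", "Roman"]
mlp = ["Devanagari", "Devanagari", "Telugu", "Bangla"]

combined = combine([knn, svm, mlp])
print(combined)  # ['Bangla', 'Devanagari', 'Telugu', 'Roman']
```

In this made-up example each classifier is wrong on a different word image, so the vote recovers a label on which two of the three agree every time, which is the kind of error compensation described above.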
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Place
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer science