Document Image Processing

J. Imaging 2018, 4, 43

Figure 28. Error rate for Sundanese word recognition and transliteration test set.

6. Conclusions and Future Work

A comprehensive experimental test of the principal tasks in a DIA system, starting with binarization, text line segmentation, and isolated character/glyph recognition, and continuing on to word recognition and transliteration for a new collection of palm leaf manuscripts from Southeast Asia, is presented. The results from all experiments provide the latest findings and a quantitative benchmark of palm leaf manuscript analysis for researchers in the DIA community.

Binarizing the palm leaf manuscript images seems very challenging. With many broken and unrecognizable characters/glyphs and much noise detected in the images, binarization should be reconsidered as the first step in the DIA process for palm leaf manuscripts. On the other hand, although there are already training-based DIA methods that do not require this binarization process, they usually require adequate training data. The problem of inadequate training data also influences glyph recognition and word transliteration. The unbalanced number of image samples for each character class means the CNN methods did not perform optimally in glyph recognition. The recognition rates of the CNN methods do not differ significantly from those of the handcrafted feature combinations.

For future work, more synthetic training data for palm leaf manuscript images should be generated in order to support the training process. Especially for the word transliteration task, more synthetic training data with more frequent words should be generated in order to improve the training process. Many examples of glyph-to-syllable association should be synthetically generated to transliterate syllabic scripts from Southeast Asia.

The special characteristics and challenges posed by the palm leaf manuscript collections will require a thorough adaptation of the DIA system. DIA methods developed for other types of documents need specific adjustments. The adaptation of a DIA system for palm leaf manuscripts is neither unique nor universal across all types of problems from different collections. However, among the DIA system's non-unique solutions, one specific solution can still be designed to deliver the best DIA system performance while still taking into account the conditions of that collection.

Acknowledgments: The authors would like to thank Museum Gedong Kertya, Museum Bali, Undang Ahmad Darsa, the philologists from Sundanese Centre Studies of Universitas Padjadjaran, the Situs Kabuyutan Ciburuy Garut, all families in Bali, Indonesia, the EFEO team, the Buddhist Institute, and the National Library in Cambodia for providing us with samples of palm leaf manuscripts. We also thank the students from the Department of Informatics Education and the Department of Balinese Literature, University of Pendidikan Ganesha, the Institute of Technology of Cambodia, and the National Institute of Post, Telecommunication and ICT for helping us with the ground truthing process for this research project. This work is supported by the DIKTI BPPLN Indonesian Scholarship Program, the STIC Asia Program implemented by the French Ministry of Foreign Affairs and International Development (MAEDI), and ARES-CCD (program AI 2014-2019) under the funding of Belgian university cooperation, and DRPMI Universitas Padjadjaran, DIKTI International Collaboration and Publication grant 2017.

Author Contributions: The Balinese dataset was prepared by Made Windu Antara Kesiman. The Khmer dataset was prepared by Dona Valy and Sophea Chhun. The Sundanese dataset was prepared by Erick Paulus, Mira Suryani, and Setiawan Hadi. Jean-Christophe Burie, Michel Verleysen, and Jean-Marc Ogier contributed to designing a ground truth validation protocol. Made Windu Antara Kesiman and Dona Valy conceived, designed,
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Place
Basel
Date
2018
Language
English
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science