Document Image Processing
Page - 101 -


Journal of Imaging

Article

Benchmarking of Document Image Analysis Tasks for Palm Leaf Manuscripts from Southeast Asia

Made Windu Antara Kesiman 1,2,*, Dona Valy 3,4, Jean-Christophe Burie 1, Erick Paulus 5, Mira Suryani 5, Setiawan Hadi 5, Michel Verleysen 2, Sophea Chhun 4 and Jean-Marc Ogier 1

1 Laboratoire Informatique Image Interaction (L3i), Université de La Rochelle, 17042 La Rochelle, France; jean-christophe.burie@univ-lr.fr (J.-C.B.); jean-marc.ogier@univ-lr.fr (J.-M.O.)
2 Laboratory of Cultural Informatics (LCI), Universitas Pendidikan Ganesha, Singaraja, Bali 81116, Indonesia; michel.verleysen@uclouvain.be
3 Institute of Information and Communication Technologies, Electronic, and Applied Mathematics (ICTEAM), Université Catholique de Louvain, 1348 Louvain-la-Neuve, Belgium; dona.valy@student.uclouvain.be
4 Department of Information and Communication Engineering, Institute of Technology of Cambodia, Phnom Penh, Cambodia; sophea.chhun@itc.edu.kh
5 Department of Computer Science, Universitas Padjadjaran, Bandung 45363, Indonesia; erick_paulus@yahoo.com (E.P.); mira.suryani@unpad.ac.id (M.S.); setiawanhadi@unpad.ac.id (S.H.)
* Correspondence: made_windu_antara.kesiman@univ-lr.fr

Received: 15 December 2017; Accepted: 18 February 2018; Published: 22 February 2018

Abstract: This paper presents a comprehensive test of the principal tasks in document image analysis (DIA), starting with binarization, text line segmentation, and isolated character/glyph recognition, and continuing on to word recognition and transliteration, for a new and challenging collection of palm leaf manuscripts from Southeast Asia. The research presents, and is performed on, a complete dataset collection of Southeast Asian palm leaf manuscripts. The collection contains three different scripts: Khmer script from Cambodia, and Balinese script and Sundanese script from Indonesia. The binarization task is evaluated on many methods, including the latest ones from some binarization competitions. The seam carving method is evaluated for the text line segmentation task and compared to a recently proposed text line segmentation method for palm leaf manuscripts. For the isolated character/glyph recognition task, the evaluation is reported for the handcrafted feature extraction method, the neural network with unsupervised feature learning, and the Convolutional Neural Network (CNN) based method. Finally, the Recurrent Neural Network-Long Short-Term Memory (RNN-LSTM) based method is used to analyze the word recognition and transliteration task for the palm leaf manuscripts. The results from all experiments provide the latest findings and a quantitative benchmark for palm leaf manuscript analysis for researchers in the DIA community.

Keywords: document image analysis; binarization; character recognition; text line segmentation; word recognition; transliteration; palm leaf manuscript; dataset; benchmark; experimental test

1. Introduction

Since the world entered the digital age in the early 20th century, the need for document image analysis (DIA) systems has been increasing. This is due to the dramatic increase in efforts to digitize the various types of document collections available, especially the ancient documents of historical relics found in various parts of the world. Some very interesting projects on a wide variety of heritage document collections can be mentioned here: for example, the tranScriptorium project (http://transcriptorium.eu/) [1]; the READ (Recognition and Enrichment of Archival Documents) project (https://read.transkribus.eu/) [2], which works on documents from the Middle Ages to today, and also focuses on different languages ranging from Ancient Greek to modern English; the

J. Imaging 2018, 4, 43; www.mdpi.com/journal/jimaging
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Editor
MDPI
Location
Basel
Date
2018
Language
English
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Size
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, Indic/Arabic/Asian script, OCR, video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science