J. Imaging 2018, 4, 43
2.2. Khmer Palm Leaf Manuscripts: Collection from Cambodia
2.2.1. Corpus
In Cambodia, Khmer palm leaf manuscripts (Figure 2) are still seen in Buddhist establishments
and are traditionally used by monks as reading scriptures. Various libraries and institutions have been
collecting and digitizing these manuscripts and have even shared the digital images with the public.
For instance, the École Française d'Extrême-Orient (EFEO) has launched an online database
(http://khmermanuscripts.efeo.fr) [20] of microfilm images of hundreds of Khmer palm leaf manuscript
collections. Some digitized collections are also obtained from the Buddhist Institute, which is one of the
biggest institutes in Cambodia responsible for research on Cambodian literature and language related
to Buddhism, and also from the National Library (situated in the capital city, Phnom Penh), which
is home to a large collection of palm leaf manuscripts. Moreover, a standard digitization campaign
was conducted in order to collect palm leaf manuscript images found in Buddhist temples in different
locations throughout Cambodia: Phnom Penh, Kandal, and Siem Reap [21].
Figure 2. Khmer palm leaf manuscript.
2.2.2. Khmer Script and Language
According to the era during which the documents were created, slightly different versions of
Khmer characters are used in the writing of Khmer palm leaf manuscripts. The Khmer alphabet is
famous for its numerous symbols (~70), including consonants, different types of vowels, diacritics, and
special characters. Certain symbols even have multiple shapes and forms depending on what other
symbols are combined with them to create words. The languages written on palm leaf documents vary
from Khmer, the official language of Cambodia, to Pali and Sanskrit, by which the modern Khmer
language was considerably influenced. Only a minority of Cambodian people, such as philologists
and Buddhist monks, are able to read and understand the latter languages.
2.3. Sundanese Palm Leaf Manuscripts: Collection from West Java, Indonesia
2.3.1. Corpus
The collection of Sundanese palm leaf manuscripts (Figure 3) comes from Situs Kabuyutan
Ciburuy, Garut, West Java, Indonesia. The Kabuyutan Ciburuy is a complex cultural heritage from
Prabu Siliwangi and Prabu Kian Santang, the king and the son of the Padjadjaran kingdom. The cultural
complex consists of six buildings. One of them is Bale Padaleuman, which is used to store the
Sundanese palm leaf manuscripts. The oldest Sundanese palm leaf manuscript in Situs Kabuyutan
Ciburuy came from the 15th century. In Bale Padaleuman, there are 27 collections of Sundanese
manuscripts. Each collection contains 15 to 30 pages, with dimensions of 25–45 cm in length × 10–15 cm
in width [22].
2.3.2. Sundanese Script and Language
The Sundanese palm leaf manuscripts were written in the ancient Sundanese language and script.
The characters consist of numbers, vowels (such as a, i, u, e, and o), basic characters (such as ha, na,
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Editor
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Size
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science