Page - 115 - in Document Image Processing

Text of page - 115 -

J. Imaging 2018, 4, 43

of connected components that represented a correct character in Balinese script from the word-level binarized images that were manually annotated [11,17,20] using Aletheia (http://www.primaresearch.org/tools/Aletheia) [62,63] (Figure 14). The Sundanese character dataset was annotated manually [22] (Figure 15). For the Khmer character dataset, a tool has been developed to annotate characters/glyphs on the document page. The polygon boundary of each character is traced manually by marking its vertices one by one. A label is given to each annotated character after its boundary has been constructed [21] (Figure 16).

Table 3. Palm leaf manuscript datasets for isolated character/glyph recognition task.

Manuscripts   Classes       Train            Test            Dataset
Balinese      133 classes   11,710 images    7673 images     AMADI_LontarSet [17,25,28]
Khmer         111 classes   113,206 images   90,669 images   SleukRithSet [21]
Sundanese     60 classes    4555 images      2816 images     SundaDataset [22]

Figure 14. Balinese character dataset.
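The Khmer annotation procedure described above (trace a polygon vertex by vertex, then attach a label once the boundary is complete) can be illustrated with a small data structure. This is only a sketch under assumptions: the names GlyphAnnotation and annotate, the pixel-coordinate convention, and the bounding-box and area helpers are hypothetical and are not taken from the tool described in [21].

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int]  # assumed convention: (x, y) pixel coordinates on the page image


@dataclass
class GlyphAnnotation:
    """Hypothetical record for one annotated character/glyph."""
    vertices: List[Point]  # polygon boundary, in the order the vertices were marked
    label: str = ""        # filled in only after the boundary has been constructed

    def bounding_box(self) -> Tuple[int, int, int, int]:
        """Axis-aligned box (x_min, y_min, x_max, y_max) enclosing the polygon."""
        xs = [x for x, _ in self.vertices]
        ys = [y for _, y in self.vertices]
        return min(xs), min(ys), max(xs), max(ys)

    def area(self) -> float:
        """Polygon area in square pixels, computed with the shoelace formula."""
        n = len(self.vertices)
        twice_area = 0
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]  # wrap around to close the polygon
            twice_area += x1 * y2 - x2 * y1
        return abs(twice_area) / 2.0


def annotate(clicks: List[Point], label: str) -> GlyphAnnotation:
    """Build the boundary from the marked vertices, then assign the label."""
    if len(clicks) < 3:
        raise ValueError("a polygon boundary needs at least three vertices")
    glyph = GlyphAnnotation(vertices=list(clicks))
    glyph.label = label  # the label is given after the boundary is constructed
    return glyph


# Example: a rectangular boundary marked with four clicks and an illustrative label.
glyph = annotate([(10, 20), (40, 20), (40, 60), (10, 60)], "ka")
```

The bounding box is one plausible way to crop an isolated character image from the page for a recognition dataset; whether the datasets in Table 3 were cropped this way is not stated on this page.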

Document Image Processing

Title: Document Image Processing
Authors: Ergina Kavallieratou; Laurence Likforman-Sulem
Publisher: MDPI
Place: Basel
Date: 2018
Language: German
License: CC BY-NC-ND 4.0
ISBN: 978-3-03897-106-1
Dimensions: 17.0 x 24.4 cm
Pages: 216
Keywords: document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category: Computer Science