Page - 191 - in Document Image Processing
J. Imaging 2018, 4, 32
segmentation task, the best F-score, 90%, was obtained by Mishra et al. [30]. The algorithm is mainly
based on two steps: a GMM refinement using stroke and color features and a graph cut procedure.
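The F-score used to compare segmentation methods is the harmonic mean of precision and recall. The following is a minimal sketch of how it could be computed at the pixel level from two binary masks. The pixel-level protocol and the function name `f_score` are assumptions made for illustration; the exact evaluation procedure of the cited work is not stated here.

```python
def f_score(pred, truth):
    """Pixel-level precision, recall and F-score for two binary masks.

    Both masks are lists of rows containing 0 (background) or 1 (text).
    This is an illustrative protocol, not the one used in the cited work.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # text pixel correctly segmented
            elif p and not t:
                fp += 1      # background pixel marked as text
            elif t and not p:
                fn += 1      # text pixel that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy masks: 2 true positives, 1 false positive, 1 false negative.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
precision, recall, f = f_score(predicted, reference)
```

Because the harmonic mean penalizes an imbalance between precision and recall, a method cannot reach a high F-score by over-segmenting or under-segmenting text.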
The KAIST dataset [31] consists of 3000 images taken in indoor and outdoor scenes (see Figure 2d
for examples). This is a multilingual dataset, which includes English and Korean texts. KAIST can be
used for both detection and segmentation tasks, as it provides binary masks for each character in the
image. The text segmentation algorithm of Zhu and Zhang [32] outperforms existing methods on this
dataset with an F-score of 88%. The method is based on superpixel clustering. First, an adaptive SLIC
text superpixel generation procedure is performed. Next, a DBSCAN-based superpixel clustering is
used to fuse stroke superpixels. Finally, a stroke superpixel verification process is applied.
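To illustrate the clustering step only, here is a minimal pure-Python DBSCAN sketch that groups superpixels by feature similarity. The feature layout `(x, y, intensity)`, the values of `eps` and `min_pts`, and the toy data are all hypothetical; the features and parameters used by Zhu and Zhang are not given here, and the SLIC generation and verification steps are omitted.

```python
from math import dist  # Euclidean distance, Python 3.8+


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1                   # point i is a core point: new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reachable from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:   # j is also a core point: keep growing
                queue.extend(nbrs)
    return labels


# Hypothetical superpixel features: (x, y, mean intensity).
superpixels = [
    (0.0, 0.0, 0.10), (0.1, 0.0, 0.12), (0.0, 0.1, 0.11),   # one stroke
    (5.0, 5.0, 0.90), (5.1, 5.0, 0.88), (5.0, 5.1, 0.92),   # another stroke
    (10.0, 0.0, 0.50),                                      # isolated
]
labels = dbscan(superpixels, eps=0.5, min_pts=2)
```

A density-based method suits this setting because the number of strokes in an image is not known in advance, and isolated superpixels are left as noise rather than forced into a cluster.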
The NEOCR dataset [33] contains 659 natural scene images with multi-oriented texts of high
variability (see Figure 2c for examples). This database is intended for scene text recognition and
provides multilingual evaluation environments, as it includes texts in eight European languages.
In 2016, Veit et al. [34] proposed a dataset for English scene text detection and recognition called
COCO-Text. The dataset is based on the Microsoft COCO dataset, which contains images of complex
everyday scenes. The best result on this dataset (67.16%) was obtained by the winner of the COCO-Text
ICDAR2017 competition [35]. Note that the participating methods in this competition were ranked
based on their average precision (AP) with an Intersection over Union (IoU) of 0.5.
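As a rough illustration of this ranking criterion, the sketch below computes the IoU of two axis-aligned boxes and a simplified, uninterpolated AP for a single image, counting a detection as correct when its IoU with an unmatched ground-truth box is at least 0.5. The box format `(x1, y1, x2, y2)`, the greedy matching, and the function names are assumptions; the official competition protocol may differ in its details.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """Simplified AP for one image.

    detections: list of (confidence, box); ground_truth: list of boxes.
    Detections are visited by decreasing confidence and greedily matched
    to the best still-unmatched ground-truth box.
    """
    matched = set()
    hits = 0
    precision_sum = 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    for rank, (_, box) in enumerate(ranked, start=1):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits += 1
            precision_sum += hits / rank   # precision at this true positive
    return precision_sum / len(ground_truth) if ground_truth else 0.0


# Toy example: two ground-truth words, three detections.
truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
found = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (21, 21, 30, 30))]
ap = average_precision(found, truth)
```

With a threshold of 0.5, a detection that covers a word only loosely still counts as correct, so this criterion rewards finding text more than localizing it tightly.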
Recently, Chng and Chan [36] introduced a new dataset, namely Total-text, for curved scene text
detection and recognition problems. It contains 1555 scene images and 9330 annotated words with
three different text orientations.
Figure 3. Some examples of text detection systems [18–20] showing the evolution of this area of
research over ten years.
As for the Arabic language, major contributions have already been made in the conventional field
of printed and handwritten OCR systems [7,10]. Much of the progress in such systems has been driven
by the availability of public datasets. Examples include the IFN/ENIT [37] and KHATT [38]
datasets for offline handwriting recognition and writer identification; the APTI database [39] for
printed word recognition; and the ADAB dataset [40] for online handwriting recognition.
However, Arabic text detection and recognition in multimedia documents has been addressed in
very few studies [41–43].
Table 1 presents commonly used datasets for text processing in images and videos,
and summarizes their features in terms of text categories, sources, tasks, script, information of
Document Image Processing
- Title
- Document Image Processing
- Authors
- Ergina Kavallieratou
- Laurence Likforman-Sulem
- Editor
- MDPI
- Location
- Basel
- Date
- 2018
- Language
- German
- License
- CC BY-NC-ND 4.0
- ISBN
- 978-3-03897-106-1
- Size
- 17.0 x 24.4 cm
- Pages
- 216
- Keywords
- document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
- Category
- Computer Science