Document Image Processing
Page 51


J. Imaging 2018, 4, 57

threshold th_ideal to be at around a value of 100. We have taken various threshold values from 5 to 115 and found experimentally that the classification accuracy is maximal at a threshold of about 100. Note that we fixed this threshold value after conducting exhaustive experimentation on the images belonging to our dataset. A change in the document images might change the threshold value slightly, but we expect that this value gives researchers a clear hint for setting the threshold for the document images they consider.

3. Method

The input color image is first converted to a grayscale image, and then the connected components (CCs) are extracted for feature computation and classification. The entire process is depicted in Figure 6.

For CC extraction, the grayscale image is first binarized and the bounding boxes (BBs) of all eight-connected components in the binarized image are calculated. Then, using these estimated bounding boxes, the CCs are extracted from the corresponding grayscale image. As we are considering real-world handwritten documents, we need to be very careful about the noise present in these documents, which might affect the binarization and BB estimation process. Thus, for effective binarization, a background estimation and separation procedure is followed prior to the actual binarization, using Otsu's method as given in [27]. During BB estimation from the binarized image, only the CCs having height and width greater than three pixels are considered, to avoid noise.

After extraction of the CCs from the grayscale image, six different LBP-based features are computed. During feature computation, the radius R has been kept constant at 1 (i.e., the number of neighboring pixels M = 8). In order to compute a feature vector for each CC, we have generated a normalized histogram of those LBP values. The number of bins used depends on the particular LBP variant considered. Here, we should also point out that the LBP operators have been applied to each and every pixel of a CC, without any discrimination.

Figure 6. Flowchart of the entire text/non-text separation process.

4. Experimental Setup

The experimental setup for any pattern classification problem requires an annotated dataset, classifiers and a set of evaluation metrics. In this section, the data preparation procedure is described first,
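The CC extraction and feature computation steps of the Method section can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. It assumes the binarized image is already available (the background estimation and Otsu binarization of [27] are not reproduced), it implements only the basic LBP operator with a 256-bin histogram rather than the six variants used in the paper, and it clamps coordinates at patch borders so that every pixel receives a code. The function names `bounding_boxes`, `lbp_histogram` and `cc_features` are hypothetical.

```python
from collections import deque

# Offsets of the 8 neighbours at radius R = 1, clockwise from the top-left.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def bounding_boxes(binary, min_size=3):
    """Return (top, left, bottom, right) boxes of 8-connected components.

    `binary` is a list of rows with truthy foreground pixels. Only
    components whose height and width both exceed `min_size` pixels are
    kept, which discards small noise blobs.
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over the 8-neighbourhood.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            height, width = bottom - top + 1, right - left + 1
            if height > min_size and width > min_size:
                boxes.append((top, left, bottom, right))
    return boxes


def lbp_histogram(gray):
    """Normalized 256-bin histogram of basic LBP codes (R = 1, M = 8)."""
    rows, cols = len(gray), len(gray[0])
    hist = [0] * 256
    for y in range(rows):
        for x in range(cols):
            code = 0
            for bit, (dy, dx) in enumerate(NEIGHBOURS):
                # Assumption: clamp at the border so every pixel gets a code.
                ny = min(max(y + dy, 0), rows - 1)
                nx = min(max(x + dx, 0), cols - 1)
                if gray[ny][nx] >= gray[y][x]:
                    code |= 1 << bit
            hist[code] += 1
    total = rows * cols
    return [count / total for count in hist]


def cc_features(gray, binary):
    """One LBP histogram per retained CC, cut from the grayscale image."""
    features = []
    for top, left, bottom, right in bounding_boxes(binary):
        patch = [row[left:right + 1] for row in gray[top:bottom + 1]]
        features.append(lbp_histogram(patch))
    return features
```

Calling `cc_features(gray, binary)` on a grayscale image and its binarized counterpart (both as lists of rows of equal size) yields one feature vector per retained component. In practice an array library would replace the nested loops; plain lists are used here only to keep the sketch self-contained.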
Title
Document Image Processing
Authors
Ergina Kavallieratou
Laurence Likforman-Sulem
Publisher
MDPI
Location
Basel
Date
2018
Language
German
License
CC BY-NC-ND 4.0
ISBN
978-3-03897-106-1
Dimensions
17.0 x 24.4 cm
Pages
216
Keywords
document image processing, preprocessing, binarization, text-line segmentation, handwriting recognition, indic/arabic/asian script, OCR, Video OCR, word spotting, retrieval, document datasets, performance evaluation, document annotation tools
Category
Computer Science