Proceedings - OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics"
Page - 28 -
Text of the Page - 28 -

As representations and learning schemes have grown capable of accommodating the sheer variability in the data, this progress is also imposing new requirements on the employed datasets. Current learned models are often optimized for the specific datasets they have been trained on, and their capture modalities are restricted by their implicit design. Real-world scenarios are highly diverse; a single dataset therefore represents only a small fraction of all possible visual appearances. Although datasets have become more elaborate and diverse lately [17], class coverage, balancing, and variability are still relevant issues to be tackled. Motivated by the diversity in the characteristics of prevailing datasets, in terms of the number and granularity of annotated classes and scene-specific view attributes, we propose to capture the spatial relationship between various semantically labeled regions across several datasets. We demonstrate that the modeled spatial prior can enhance recognition accuracies, leading to state-of-the-art results, as illustrated in Figure 1.

2. Related Work

Spatial context is an important type of information in the human cognitive process [12] when recognizing objects, especially in the presence of a cluttered background. Certain objects predominantly co-occur in the real world. Thus, analyzing vast amounts of visual data can result in meaningful contextual statistics which can be used to robustify visual object recognition [5].

Pixel-wise semantic labeling is a relatively novel domain, since large-scale object recognition with shared informative representations is a prerequisite for this task. Starting with manually selected low-level features, discriminatively trained Random Forests or Boosting have been used to perform classification patch-wise [16] or to additionally incorporate local structural information within the analysis patch [7]. Based on recent advances in deep learning, several frameworks [13, 18] have demonstrated significant improvements in the accuracy of per-pixel class estimates.

Recently, multi-scale deep architectures have been proposed in order to represent local and global context, either by employing multiple input images at different resolutions [2] or by combining feature maps from different layers of the convolutional architecture [6]. Both techniques aim to combine fine-detail representations with relational information established at a coarse resolution level in order to generate accurate segment boundaries between labeled regions. The immense representational power of deep convolutional architectures captures rich details of the object classes to be represented and yields segmentation frameworks which surpass learned hand-crafted representations. Capturing spatial context within convolutional architectures, however, is linked with complexities in terms of training (augmented parameter space) and increased computational expense due to the computation of multiple scale-specific features.

Our proposed approach employs a previously learned spatial prior model as an additional step to switch class labels at locations where per-pixel estimates are ambiguous. We term our model the Explicit Priors model. Per-pixel ambiguity is quantified from the class posterior probabilities at the given pixel by examining the distance between the first- and second-rank probabilities. Our method, while limited in representing spatial context at a wide range of spatial scales and orientations, yields a remarkable improvement at a negligible increase of computational complexity.

3. Methodology and Experimental Setup

The proposed approach for combining learned information from multiple datasets and thereby enhancing existing classifiers is based on the concept of Explicit Priors. By aggregating statistical data on the level of individual pixels and capturing spatial context, we generate additional cues for training
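The ambiguity test described above (comparing the first- and second-rank posterior probabilities at each pixel and switching labels where the gap is small) can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the margin threshold and the prior-based relabeling rule (letting a separately computed spatial-prior score arbitrate between the two top-ranked classes) are assumptions made for the sketch.

```python
import numpy as np

def switch_ambiguous_labels(posteriors, prior_scores, margin=0.1):
    """For each pixel, if the distance between the first- and second-rank
    class posteriors falls below `margin`, the pixel is treated as
    ambiguous and the candidate (among those two classes) favored by the
    spatial-prior scores is chosen instead.

    posteriors:   (H, W, C) per-pixel class posterior probabilities
    prior_scores: (H, W, C) per-pixel spatial-prior scores (assumed given)
    margin:       ambiguity threshold (illustrative value)
    """
    # Rank classes by posterior probability at every pixel.
    order = np.argsort(posteriors, axis=-1)
    top1, top2 = order[..., -1], order[..., -2]

    # Distance between first- and second-rank probabilities.
    p1 = np.take_along_axis(posteriors, top1[..., None], axis=-1)[..., 0]
    p2 = np.take_along_axis(posteriors, top2[..., None], axis=-1)[..., 0]
    ambiguous = (p1 - p2) < margin

    # Where ambiguous, let the spatial prior arbitrate between the two.
    prior1 = np.take_along_axis(prior_scores, top1[..., None], axis=-1)[..., 0]
    prior2 = np.take_along_axis(prior_scores, top2[..., None], axis=-1)[..., 0]
    return np.where(ambiguous & (prior2 > prior1), top2, top1)
```

Because the correction only touches pixels whose top-two posteriors nearly tie, the extra cost over the base classifier is a few element-wise array operations, which matches the paper's claim of a negligible increase in computational complexity.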
Proceedings OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics"
Title
Proceedings
Subtitle
OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics"
Authors
Peter M. Roth
Kurt Niel
Publisher
Verlag der Technischen Universität Graz
Location
Wels
Date
2017
Language
English
License
CC BY 4.0
ISBN
978-3-85125-527-0
Size
21.0 x 29.7 cm
Pages
248
Keywords
Tagungsband (conference proceedings)
Categories
International
Tagungsbände (conference proceedings)

Table of contents

  1. Learning / Recognition 24
  2. Signal & Image Processing / Filters 43
  3. Geometry / Sensor Fusion 45
  4. Tracking / Detection 85
  5. Vision for Robotics I 95
  6. Vision for Robotics II 127
  7. Poster OAGM & ARW 167
  8. Task Planning 191
  9. Robotic Arm 207