and classification, while remaining independent of the underlying machine learning algorithm. The
method and its integration throughout the entire processing pipeline are described in this chapter and
demonstrated on the Daimler Urban Segmentation 2014 dataset [14].
The dataset consists of image sequences captured by a camera mounted on a moving car. The images
are provided without color information at a resolution of 1024x440 px, with every 10th frame of the
sequences being annotated with pixel-wise segmentations. For a reasonable comparison, only the
test sequences, as specified by the evaluation protocol, are considered. The dataset is supplemented
with precomputed disparity maps and additional information, like time-stamps, vehicle speed and
yaw rate. The ground truth distinguishes between two foreground (Vehicle and Pedestrian) and three
background classes (Ground, Sky and Building). Within the test data 36.3% of all pixels are defined
as Void. The frequency of occurrence of the labeled pixels is 54.1% for Ground, 14.8% for Vehicle,
4.6% for Pedestrian, 2.4% for Sky and 24.0% for Building, resulting in a background ratio of 80.6%.
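Such frequency statistics follow directly from the pixel-wise annotations. The following sketch, assuming NumPy and hypothetical integer label IDs (the dataset's actual encoding is not reproduced here), shows how the per-class frequencies and the background ratio can be computed:

```python
import numpy as np

# Hypothetical label IDs; the dataset's actual encoding may differ.
GROUND, VEHICLE, PEDESTRIAN, SKY, BUILDING, VOID = 0, 1, 2, 3, 4, 255
BACKGROUND = {GROUND, SKY, BUILDING}

def label_statistics(annotation_maps):
    """Per-class pixel frequencies over a dataset, ignoring Void pixels."""
    counts = {}
    for labels in annotation_maps:               # each: 2D array of label IDs
        ids, n = np.unique(labels, return_counts=True)
        for i, c in zip(ids.tolist(), n.tolist()):
            if i != VOID:
                counts[i] = counts.get(i, 0) + c
    total = sum(counts.values())
    freq = {i: c / total for i, c in counts.items()}
    background_ratio = sum(f for i, f in freq.items() if i in BACKGROUND)
    return freq, background_ratio
```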
3.1. Training
Dataset Analysis As a preliminary step for the training and classification process, an appropriate
choice of input data with regard to the intended application scenario is a decisive aspect. For this
purpose, a statistical analysis of multiple datasets was conducted according to the concept of Explicit
Priors. The resulting data ranges from basic statistics, such as label frequency and the ratio of back-
ground to foreground classes, to more sophisticated aspects concerning occurrence distribution and
spatial context. For each application scenario, this dataset analysis can be used to select a subset of
additional cues for identifying appropriate datasets. For the demonstrated task, for instance, the most
useful information was provided by the concept of Location Bins. By dividing the image dimensions
into a coarse grid and capturing the spatial distribution of each class across the resulting cells over
the entire dataset, probabilities for the occurrence of certain labels with regard to their location can
be derived. The resulting representation provides clearly arranged patterns closely related to certain
characteristics of the dataset, such as the method of image acquisition. In the case of Vehicles, for
instance, the analysis clearly showed that images taken with a hand-held camera are mostly centered on
these objects, while for the datasets using a camera mounted on a car they are most often found in
the lower half of the image. Comparing these statistics for candidate training datasets to the intended
application scenario facilitates the evaluation of their compatibility.
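A minimal sketch of such a Location Bins statistic, assuming integer-coded label maps and a hypothetical 4x4 grid (the grid resolution is a placeholder, not the one used in the paper):

```python
import numpy as np

def location_bins(annotation_maps, num_classes, grid=(4, 4), void_id=255):
    """Estimate P(class | grid cell) over a dataset of 2D label maps."""
    rows, cols = grid
    hist = np.zeros((rows, cols, num_classes))
    for labels in annotation_maps:
        h, w = labels.shape
        for r in range(rows):
            for c in range(cols):
                # Crop the cell and count the non-void labels inside it.
                cell = labels[r * h // rows:(r + 1) * h // rows,
                              c * w // cols:(c + 1) * w // cols]
                ids, n = np.unique(cell[cell != void_id], return_counts=True)
                hist[r, c, ids] += n
    # Normalize each cell to a probability distribution over the labels.
    return hist / np.maximum(hist.sum(axis=2, keepdims=True), 1)
```

Inspecting the resulting per-cell distributions reproduces the patterns described above, e.g. the Vehicle probability mass concentrating in the lower image half for car-mounted cameras.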
Other available statistical measures, such as the analysis of co-occurrence, which provides a
probability measure for each combination of labels appearing in the same image, proved to add less
distinct cues for the given task. Since the application scenario only includes five labels arranged
within a consecutive image sequence, the resulting correlation matrix did not show significant peaks.
However, an adapted version in the form of the Local Label Neighborhood (LLN), which limits the
co-occurrence measure to label transitions, was successfully applied, as described in detail in Section 3.2.
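For illustration, such a transition-limited co-occurrence statistic could be computed as in the following sketch, which assumes 4-connected pixel neighborhoods; the exact neighborhood definition used for the LLN is the one given in Section 3.2:

```python
import numpy as np

def local_label_neighborhood(annotation_maps, num_classes, void_id=255):
    """Accumulate label transitions between adjacent pixels (a sketch of
    a co-occurrence measure limited to label borders)."""
    lln = np.zeros((num_classes, num_classes), dtype=np.int64)
    for labels in annotation_maps:
        for a, b in ((labels[:, :-1], labels[:, 1:]),   # horizontal pairs
                     (labels[:-1, :], labels[1:, :])):  # vertical pairs
            mask = (a != b) & (a != void_id) & (b != void_id)
            np.add.at(lln, (a[mask], b[mask]), 1)
    lln = lln + lln.T            # make the transition counts symmetric
    return lln / max(lln.sum(), 1)
```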
Based on the aggregated information of class frequency and Location Bins, the CamVid dataset [1]
could be identified as an appropriate choice for training background classes, since it offers a back-
ground ratio of 80.9%, as well as a fitting spatial arrangement of class probabilities. The foreground
classes, on the other hand, are trained on the PascalContext dataset [11], in particular the version
including 33 categories, which contains 46% foreground pixels.
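The comparison itself can be operationalized in different ways; since no concrete measure is fixed here, the following is only one plausible, hypothetical scoring of dataset compatibility, combining the background-ratio difference with the per-cell overlap (histogram intersection) of the Location Bin distributions:

```python
import numpy as np

def compatibility(bins_cand, bins_target, ratio_cand, ratio_target):
    """Hypothetical compatibility score between a candidate training
    dataset and the target scenario; higher is more compatible.

    bins_*  : (rows, cols, num_classes) arrays from location_bins()
    ratio_* : scalar background ratios
    """
    # Per-cell histogram intersection of the spatial label distributions,
    # averaged over all grid cells (lies in [0, 1]).
    overlap = np.minimum(bins_cand, bins_target).sum(axis=2).mean()
    return overlap - abs(ratio_cand - ratio_target)
```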
Classifier Setup Based on the selected datasets, two classifiers are applied to cover the background
and foreground classes separately. The former classifier uses the pre-trained model pascal-fcn8s-tvg-