Page 30 in Proceedings - OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics"

dag provided by Zheng et al. [18], which is evaluated on the foreground classes of the PascalContext33 dataset. The background classifier was trained using TextonBoost [8] on randomly sampled images of the CamVid dataset. For this purpose, feature descriptors based on filter banks, location and gradient orientations were applied for training a total of 950 Textons, which represents a compromise between computational complexity and accuracy. Since the test dataset consists of gray-scale images, the learning input is restricted to the intensity channel.

Label Aggregation and Mapping

The main obstacle in aggregating multiple datasets during the training stage results from variations in the denomination of object classes. Furthermore, since in many cases not all labels of the training datasets are required for classifying the test images, and multiple labels of one dataset can relate to a single label of another, a generalized mapping strategy is a prerequisite for combining label information. For this purpose, an automatic method for label clustering was developed based on version 3.0 of the WordNet database [10]. This knowledge representation was trained exclusively on lexical data and is capable of providing a similarity measure among semantic descriptions. Based on this, labels of the training dataset can be assigned to the final denominations by applying a threshold and giving preference to classes with higher similarity.

Figure 2. Label Mapping of CamVid (Columns) to Daimler (Rows) dataset based on WordNet similarity (selected labels are marked in yellow color).

In the case of the CamVid dataset this process resulted in a selection of eleven labels, as visualized in Figure 2, while the remaining ones are not required for the application task and therefore suppressed. The selected labels were assigned to the background classes Building, Sky and Ground of the final dataset based on the corresponding similarity.
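
The threshold-based assignment described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `map_labels`, the threshold value of 0.5 and the similarity scores in `TOY_SIMILARITY` are hypothetical and chosen only for illustration. In practice the lookup table would be replaced by a WordNet-based similarity measure.

```python
from typing import Callable, Dict, Iterable, List, Optional

def map_labels(
    source_labels: Iterable[str],
    target_labels: List[str],
    similarity: Callable[[str, str], float],
    threshold: float,
) -> Dict[str, Optional[str]]:
    """Map each source label to its most similar target label.

    A source label whose best similarity stays below `threshold`
    is mapped to None, i.e. it is suppressed.
    """
    mapping: Dict[str, Optional[str]] = {}
    for src in source_labels:
        # Preference is given to the target class with the highest similarity.
        best = max(target_labels, key=lambda tgt: similarity(src, tgt))
        mapping[src] = best if similarity(src, best) >= threshold else None
    return mapping

# Hypothetical similarity scores (NOT taken from the paper or from WordNet).
TOY_SIMILARITY = {
    ("Building", "Building"): 1.0,
    ("Wall", "Building"): 0.6,
    ("Wall", "Ground"): 0.3,
    ("Road", "Ground"): 0.8,
    ("Sky", "Sky"): 1.0,
    ("Animal", "Ground"): 0.2,
}

def toy_similarity(source: str, target: str) -> float:
    """Stand-in for a lexical similarity measure; unknown pairs score 0."""
    return TOY_SIMILARITY.get((source, target), 0.0)

mapping = map_labels(
    ["Building", "Wall", "Road", "Sky", "Animal"],
    ["Building", "Sky", "Ground"],  # background classes of the final dataset
    toy_similarity,
    threshold=0.5,  # assumed value
)
```

With these made-up scores, `Wall` is merged into `Building`, `Road` into `Ground`, and `Animal` is suppressed because no target class reaches the threshold.
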
Analogously, the two foreground objects Pedestrian and Vehicle are assigned the PascalContext labels of Pedestrian, Bicyclist, Child and Moving Object, as well as Car, Motorbike, SUVPickup and Truck, respectively.

3.2. Classification

The foreground and background classifiers are applied to each input image of the test set resulting in two complementary segmentations, which are further refined by applying the label mapping method described in Section 3.1. This step results in both images being segmented into the labels required by the test dataset. In order to further improve the segmentation quality of background classes, the two highest ranked labels of each pixel are retained, as well as the probability distance between them. This information is required for enhancing the results with Local Label Neighborhood priors and further refinement by inference based on a Conditional Random Field (CRF).

Local Label Neighborhood

The concept of Local Label Neighborhood is based on statistically learning conditional probabilities of transitions between specific labels in vertical and horizontal direction. Each annotated pixel within the selected training images is evaluated to capture this prior based on spatial context. For the given task, this results in a measure of probability for each background class to be found on a specific side of either of the two foreground classes. The probabilities
