Page - 118 - in Joint Austrian Computer Vision and Robotics Workshop 2020

Notation. The notation $\nabla_\sigma y$ expresses the convolution of a stack of feature maps $y$ with the gradient of a two-dimensional Gaussian density with mean vector $(0,0)^T$ and covariance matrix $\sigma I$, where $I$ is the identity matrix. In practice it is a convolution $\nabla_\sigma y = y * G_{\nabla\sigma}$ with a kernel $G_{\nabla\sigma}$ that is normalized and has shape $2 \times C \times C$ with $C \sim 4\sigma$. The operation doubles the channels of the tensor $y$. To ensure fast computation, the convolution is implemented as a convolutional layer with frozen weights.

Median Frequency Balancing. A simple and popular [3, 1] weighting scheme is Median Frequency Balancing (MFB). Each class (foreground and background in our binary setting) is assigned a weight to compensate for imbalance in the frequency of occurrence. The weights can either be computed individually for each sample or once for the entire dataset. In the individual case a foreground/background weight pair $(w_f, w_b)$ for a target mask $t$ is given by

$$ w_f = \frac{N}{2 \sum_{(i,j) \in D} t_{ij}} \quad \text{and} \quad w_b = \frac{N}{2 \sum_{(i,j) \in D} (1 - t_{ij})}. \tag{4} $$

An example of such a weight map is shown in the second column of Figure 1. If a single weight pair for the entire dataset is preferred, then it is computed as the mean of all sample weights.

Boundary Proximity. The separating boundary between foreground and background is the only area where the segmentation mask is non-constant. Consequently, it is also the area where masks generated by neural networks exhibit the largest mistakes. In this approach, pixels in close vicinity to the boundary are assigned larger weights. Such a method is already suggested by the authors of the original U-Net architecture [11], although with a less general approach. We calculate a weight map based on a pixel's distance to the separation boundary using convolution-based edge detection. A large gradient in a segmentation mask indicates the presence of an edge. Based on this we define the weight map

$$ w_{ij} = 1 + c \, \|(\nabla_\sigma t)_{ij}\|_2^2. $$

A map of this type is shown in the third column of Figure 1. The parameter $c$ is a scaling constant and is set to 5.
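As an illustration, the Median Frequency Balancing weights of Equation (4) and the Boundary Proximity weight map can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, $N$ is taken to be the number of pixels in $D$, each derivative kernel is normalized so that its absolute values sum to one, the kernel size is raised to the next odd number, image borders are edge-padded, and $\sigma = 1$ is a placeholder value. A plain loop stands in for the frozen convolutional layer.

```python
import numpy as np


def gaussian_gradient_kernel(sigma):
    """Return a (2, C, C) stack of Gaussian partial-derivative kernels, C ~ 4*sigma.

    Assumptions: sigma is the variance (covariance sigma * I, as in the text),
    C is forced to be odd, and each kernel is scaled to unit L1 norm.
    """
    size = max(3, int(round(4 * sigma)) | 1)  # odd size so the kernel has a centre
    r = size // 2
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    g = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma))
    kernel = np.stack([-ys / sigma * g, -xs / sigma * g])  # d/dy and d/dx
    return kernel / np.abs(kernel).sum(axis=(1, 2), keepdims=True)


def filter2d(image, kernel):
    """Same-size 2-D cross-correlation with edge padding.

    Like a convolutional layer this does not flip the kernel; the sign
    difference to a true convolution disappears once the result is squared.
    """
    h, w = image.shape
    r = kernel.shape[0] // 2
    padded = np.pad(image, r, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def boundary_proximity_weights(mask, sigma=1.0, c=5.0):
    """w_ij = 1 + c * ||(grad_sigma t)_ij||_2^2 for a binary (H, W) mask."""
    kernel = gaussian_gradient_kernel(sigma)
    grad = np.stack([filter2d(mask, k) for k in kernel])  # shape (2, H, W)
    return 1.0 + c * (grad ** 2).sum(axis=0)


def mfb_weights(mask):
    """Per-sample weight pair (w_f, w_b) of Equation (4).

    Assumes N is the pixel count and that both classes occur in the mask.
    """
    n = mask.size
    w_f = n / (2.0 * mask.sum())
    w_b = n / (2.0 * (1.0 - mask).sum())
    return w_f, w_b


def mfb_weight_map(mask):
    """Spread the weight pair over the mask: w_f on foreground, w_b on background."""
    w_f, w_b = mfb_weights(mask)
    return w_f * mask + w_b * (1.0 - mask)
```

For a mask whose foreground covers exactly half of the image, `mfb_weights` returns equal weights for both classes, while `boundary_proximity_weights` stays at 1 in constant regions and grows only within a few pixels of the boundary.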
Gradient Ratio. The typical location of segmentation errors can be characterized more concretely. Photographs are often taken in poor lighting conditions or with cheap camera equipment, resulting in over- or underexposed areas. Common occurrences are bright reflections on a vehicle's roof or dark shadows around its wheelbase (see Figure 2). Both scenarios can obscure the precise transition point between foreground and background. At the data level we are confronted with image patches that are either nearly entirely white or nearly entirely black, while the same patch in the ground truth segmentation mask contains a binary transition. Motivated by this observation, we claim that the ratio between change in the mask and change in the corresponding image is a measure of prediction difficulty and use it to define a new weight map. Again we employ discrete gradients:

$$ w_{ij} = 1 + c \, \frac{\|(\nabla_\sigma t)_{ij}\|_2^2}{\|(\nabla_\sigma x)_{ij}\|_2^2 + \epsilon}. $$

As before, the parameter $c$ is a constant, which is set to 0.1, and $\epsilon$ is a small regularizing constant with NM. The result of the convolution $\nabla_\sigma x$ is a stack of six feature maps, one for each combination of the three image channels and the two partial derivatives in the gradient. A weight map of this type is shown in the fourth column of Figure 1.

Comparative Results. In Table 1 we show results for the pixelwise losses Mean Squared Error and Binary Cross-Entropy, first in their default state and then with the addition of one or more weighting extensions. To us a pixelwise loss is a function that sums over the losses of individual pixels. Consequently, when the gradient with respect to a single pixel is computed during backpropagation, all terms except the one belonging to that pixel vanish. We can argue that in such a loss function no pixel is ignored or treated as less important. In direct comparison, Binary Cross-Entropy outperforms Mean Squared Error in every test. From an information-theoretic point of view it is the natural loss for binary classification problems.
When using Mean Squared Error, none of the proposed weighting schemes improved over uniform weights, whereas the opposite holds true for Binary Cross-Entropy, where the best results are achieved using a combination of Median Frequency Balancing and Gradient Ratio.
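To make the Gradient Ratio weight map and its use in a pixelwise loss more tangible, here is a minimal NumPy sketch. It is an illustration under assumptions rather than the authors' code: the function names are hypothetical, `sigma=1.0` and `eps=1e-3` are placeholder values not taken from the text, the kernel normalization (unit L1 norm per derivative kernel) and edge padding are choices made here, and multiplying each pixel's Binary Cross-Entropy term by its weight before summing is assumed as the way a weight map enters the loss.

```python
import numpy as np


def gaussian_gradient(y, sigma=1.0):
    """Map a (K, H, W) stack to (2K, H, W) Gaussian-derivative responses.

    Assumptions: sigma is the variance, the kernel spans about 4*sigma pixels,
    each derivative kernel has unit L1 norm, and borders are edge-padded.
    """
    r = max(1, int(round(4 * sigma)) // 2)
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    g = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma))
    kernels = np.stack([-ys * g, -xs * g])  # d/dy and d/dx, up to scale
    kernels /= np.abs(kernels).sum(axis=(1, 2), keepdims=True)
    k, h, w = y.shape
    padded = np.pad(y, ((0, 0), (r, r), (r, r)), mode="edge")
    out = np.zeros((2, k, h, w))
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            window = padded[:, dy:dy + h, dx:dx + w]
            out += kernels[:, dy, dx][:, None, None, None] * window[None]
    return out.reshape(2 * k, h, w)  # the channel count doubles


def gradient_ratio_weights(image, mask, sigma=1.0, c=0.1, eps=1e-3):
    """w_ij = 1 + c * ||grad t||^2 / (||grad x||^2 + eps).

    image: (3, H, W) array, mask: (H, W) binary array. eps is an assumed value.
    """
    grad_t = gaussian_gradient(mask[None], sigma)  # (2, H, W)
    grad_x = gaussian_gradient(image, sigma)       # (6, H, W) for three channels
    num = (grad_t ** 2).sum(axis=0)
    den = (grad_x ** 2).sum(axis=0) + eps
    return 1.0 + c * num / den


def weighted_bce(pred, target, weights, tiny=1e-7):
    """Sum of per-pixel Binary Cross-Entropy terms, each scaled by its weight."""
    p = np.clip(pred, tiny, 1.0 - tiny)  # avoid log(0)
    per_pixel = -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))
    return float((weights * per_pixel).sum())
```

With these definitions, a mask edge that coincides with a clear image edge receives a weight close to 1, whereas the same mask edge inside a uniformly bright or dark image patch receives a much larger weight, which matches the motivation given for the Gradient Ratio scheme.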

Title: Joint Austrian Computer Vision and Robotics Workshop 2020
Editor: Graz University of Technology
Location: Graz
Date: 2020
Language: English
License: CC BY 4.0
ISBN: 978-3-85125-752-6
Size: 21.0 x 29.7 cm
Pages: 188
Categories: Informatik; Technik