Notation  The notation $\nabla_\sigma y$ expresses the convolution of a stack of feature maps $y$ with the gradient of a two-dimensional Gaussian density with mean vector $(0,0)^T$ and covariance matrix $\sigma I$, where $I$ is the identity matrix. In practice it is a convolution $\nabla_\sigma y = y * G_{\nabla\sigma}$ with a kernel $G_{\nabla\sigma}$ that is normalized and has shape $2 \times C \times C$ with $C \sim 4\sigma$. The operation doubles the channels of the tensor $y$. To ensure fast computation the convolution is implemented as a convolutional layer with frozen weights.
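A minimal sketch of such a frozen Gaussian-gradient convolution is given below, assuming a PyTorch implementation (the paper shows no code); the class name `GaussianGradient`, the kernel construction, and the odd-size rounding of $C \sim 4\sigma$ are assumptions made for illustration.

```python
# Sketch (not the authors' code): a frozen convolution that applies the x- and
# y-derivatives of a 2D Gaussian to every channel, doubling the channel count.
import torch
import torch.nn as nn


def gaussian_gradient_kernel(sigma: float) -> torch.Tensor:
    """Return a (2, C, C) kernel holding the two partial derivatives of a Gaussian."""
    size = max(3, int(4 * sigma) | 1)            # C ~ 4*sigma, forced to be odd (assumption)
    r = torch.arange(size, dtype=torch.float32) - size // 2
    yy, xx = torch.meshgrid(r, r, indexing="ij")
    g = torch.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    g = g / g.sum()                              # normalize the Gaussian
    dgx = -xx / sigma**2 * g                     # d/dx of the Gaussian
    dgy = -yy / sigma**2 * g                     # d/dy of the Gaussian
    return torch.stack([dgx, dgy])               # shape (2, C, C)


class GaussianGradient(nn.Module):
    """Frozen depthwise convolution computing nabla_sigma y for each input channel."""

    def __init__(self, channels: int, sigma: float = 1.0):
        super().__init__()
        k = gaussian_gradient_kernel(sigma)                    # (2, C, C)
        weight = k.repeat(channels, 1, 1).unsqueeze(1)         # (2*channels, 1, C, C)
        self.conv = nn.Conv2d(channels, 2 * channels, k.shape[-1],
                              padding=k.shape[-1] // 2, groups=channels, bias=False)
        self.conv.weight = nn.Parameter(weight, requires_grad=False)  # frozen weights

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        return self.conv(y)                                    # channels are doubled
```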
Median Frequency Balancing  A simple and popular [3, 1] weighting scheme is Median Frequency Balancing (MFB). Each class (foreground and background in our binary setting) is assigned a weight to compensate for imbalance in the frequency of occurrence. The weights can either be computed individually for each sample or once for the entire dataset. In the individual case a foreground/background weight pair $(w_f, w_b)$ for a target mask $t$ is given by
$$w_f = \frac{N}{2\sum_{(i,j)\in D} t_{ij}} \qquad \text{and} \qquad w_b = \frac{N}{2\sum_{(i,j)\in D} (1 - t_{ij})} . \qquad (4)$$
An example of such a weight map is shown in the second column of Figure 1. If a single weight pair for the entire dataset is preferred then it is computed as the mean of all sample weights.
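The per-sample weights of Eq. (4) could be computed as in the following sketch, assuming $t$ is a binary mask and $N$ its total pixel count; the small `eps` guard is an addition not present in the paper's formula.

```python
# Sketch of per-sample MFB weights from Eq. (4); frequent classes get smaller weights.
import torch


def mfb_weights(t: torch.Tensor, eps: float = 1e-8):
    """Return (w_f, w_b) for a binary target mask t of shape (H, W)."""
    n = t.numel()                       # N: number of pixels in the mask
    n_fg = t.sum()                      # number of foreground pixels
    n_bg = n - n_fg                     # number of background pixels
    w_f = n / (2 * n_fg + eps)          # foreground weight (eps avoids division by zero)
    w_b = n / (2 * n_bg + eps)          # background weight
    return w_f, w_b


def mfb_weight_map(t: torch.Tensor) -> torch.Tensor:
    """Per-pixel weight map: w_f where the target is foreground, w_b elsewhere."""
    w_f, w_b = mfb_weights(t)
    return torch.where(t > 0.5, w_f, w_b)
```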
Boundary Proximity  The separating boundary between foreground and background is the only area where the segmentation mask is non-constant. Consequently it is also the area where masks generated by neural networks exhibit the largest mistakes. In this approach pixels in close vicinity to the boundary are assigned larger weights. Such a method is already suggested by the authors of the original U-Net architecture [11], although with a less general approach. We calculate a weight map based on a pixel's distance to the separation boundary using convolution-based edge detection. A large gradient in a segmentation mask indicates the presence of an edge. Based on this we define the weight map
$$w_{ij} = 1 + c\,\|(\nabla_\sigma t)_{ij}\|_2^2 .$$
A map of this type is shown in the third column of Figure 1. The parameter $c$ is a scaling constant and is set to 5.
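A possible implementation of this weight map, reusing the hypothetical `GaussianGradient` module sketched earlier and assuming a target mask of shape (1, 1, H, W), is the following:

```python
# Sketch of the boundary-proximity weight map w = 1 + c * ||nabla_sigma t||_2^2.
import torch


def boundary_weight_map(t: torch.Tensor, grad: "GaussianGradient", c: float = 5.0) -> torch.Tensor:
    g = grad(t)                                  # (1, 2, H, W): x- and y-derivatives of the mask
    sq_norm = (g ** 2).sum(dim=1, keepdim=True)  # squared 2-norm of the gradient per pixel
    return 1.0 + c * sq_norm                     # (1, 1, H, W) weight map
```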
Gradient Ratio  The typical location of segmentation errors can be characterized more concretely.
Photographs are often taken in poor lighting conditions or with cheap camera equipment, resulting in over- or underexposed areas. Common occurrences are bright reflections on a vehicle's roof or dark shadows around its wheelbase (see Figure 2). Both scenarios can obscure the precise transition point between foreground and background. At the data level we are confronted with image patches that are either nearly entirely white or nearly entirely black, while the same patch in the ground truth segmentation mask contains a binary transition. Motivated by this observation we claim that the ratio between change in the mask and change in the corresponding image is a measure for prediction difficulty and use it to define a new weight map. Again we employ discrete gradients:
$$w_{ij} = 1 + c\,\frac{\|(\nabla_\sigma t)_{ij}\|_2^2}{\|(\nabla_\sigma x)_{ij}\|_2^2 + \varepsilon} .$$
As before, the parameter $c$ is a constant, here set to 0.1, and $\varepsilon$ is a small regularizing constant with $\varepsilon \ll NM$. The result of the convolution $\nabla_\sigma x$ is a stack of six feature maps, one for each combination of the three image channels and the two partial derivatives in the gradient. A weight map of this type is shown in the fourth column of Figure 1.
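The gradient-ratio map could be computed as in the sketch below, again assuming the `GaussianGradient` module from above, an image of shape (1, 3, H, W), a mask of shape (1, 1, H, W), and an assumed value for the regularizer $\varepsilon$.

```python
# Sketch of the gradient-ratio weight map: large where the mask changes but the image does not.
import torch


def gradient_ratio_weight_map(x: torch.Tensor, t: torch.Tensor,
                              grad_t: "GaussianGradient", grad_x: "GaussianGradient",
                              c: float = 0.1, eps: float = 1e-6) -> torch.Tensor:
    gt = grad_t(t)                                   # (1, 2, H, W): gradient of the mask
    gx = grad_x(x)                                   # (1, 6, H, W): gradient of the RGB image
    num = (gt ** 2).sum(dim=1, keepdim=True)         # squared norm of the mask gradient
    den = (gx ** 2).sum(dim=1, keepdim=True) + eps   # squared norm of the image gradient, regularized
    return 1.0 + c * num / den                       # (1, 1, H, W) weight map
```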
Comparative Results  In Table 1 we show results for the pixelwise losses Mean Squared Error and Binary Cross-Entropy, first in their default state and then with the addition of one or more weighting extensions. To us a pixelwise loss is a function that sums over the losses of individual pixels. Consequently, when the gradient is computed during backpropagation, all terms except the ones belonging to the individual pixels vanish. We can argue that in such a loss function no pixel is ignored or given less importance.
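In this sense, a weighted pixelwise loss simply multiplies each per-pixel loss term by the corresponding weight before reduction. The sketch below illustrates this for Binary Cross-Entropy; how the individual weight maps are combined (here by multiplication) is an assumption, not taken from the paper.

```python
# Sketch of a weighted pixelwise Binary Cross-Entropy: per-pixel losses scaled by a weight map.
import torch
import torch.nn.functional as F


def weighted_bce(pred: torch.Tensor, target: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """pred: raw logits, target: binary mask, weights: per-pixel weight map (all same shape)."""
    per_pixel = F.binary_cross_entropy_with_logits(pred, target, reduction="none")
    return (weights * per_pixel).mean()


# Assumed combination of the schemes above, e.g. MFB together with Gradient Ratio:
# w = mfb_weight_map(t[0, 0]) * gradient_ratio_weight_map(x, t, grad_t, grad_x)[0, 0]
```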
In direct comparison Binary Cross-Entropy outperforms Mean Squared Error in every test. From an information-theoretic point of view it is the natural loss for binary classification problems. When using Mean Squared Error none of the proposed weighting schemes improved over uniform weights, whereas the opposite holds true for Binary Cross-Entropy, where the best results are achieved using a combination of Median Frequency Balancing and Gradient Ratio.