Page - 75 - in Joint Austrian Computer Vision and Robotics Workshop 2020
Method               PQ    SQ    RQ    PQ^Th  SQ^Th  RQ^Th  PQ^St  SQ^St  RQ^St
Semantic + Instance  40.6  70.9  51.3  40.3   75.4   53.0   40.9   67.6   50.0
Panoptic FPN         41.9  73.7  53.4  43.0   75.2   56.6   41.2   72.5   51.1
HPS                  42.9  74.5  54.3  43.4   75.7   56.7   42.6   73.6   52.5
HPS + ISI            44.0  74.8  55.5  44.4   76.4   57.5   43.7   73.6   54.1
Table 1: Quantitative results on the Cityscapes dataset. The results show that a shared feature backbone reduces overfitting compared to two disjoint networks (Semantic + Instance vs. Panoptic FPN). Also, generating the final panoptic output internally and training the system end-to-end increases the performance (Panoptic FPN vs. HPS). Finally, using inter-task relations in the form of an initial segmentation image (ISI) provides an effective segmentation prior and increases the overall panoptic quality as well as all other metrics (HPS vs. HPS + ISI).
stance segmentation branches by using an initial segmentation image (ISI), as introduced in Sec. 3.2.
4.2. Results
The results of the four methods described above on the Cityscapes dataset are summarized in Table 1. In addition to the panoptic quality (PQ), we show the segmentation quality (SQ) and the recognition quality (RQ) for all classes, things (Th) classes only, and stuff (St) classes only. Since PQ is a measurement of semantic (SQ) and instance (RQ) segmentation quality, an improvement in either part will increase the accuracy of the overall system.
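The relation between the three metrics can be made concrete with a minimal sketch. It assumes the commonly used per-class definition of panoptic quality, which this page does not spell out: PQ is the sum of IoUs over matched (true-positive) segments divided by TP + 0.5 FP + 0.5 FN, which factors into SQ times RQ. The function name and the example values are hypothetical.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Return (PQ, SQ, RQ) for a single class.

    matched_ious: IoU of every true-positive match between a predicted
    and a ground-truth segment (assumed definition: a match needs IoU > 0.5).
    """
    tp = len(matched_ious)
    denom = tp + 0.5 * num_false_positives + 0.5 * num_false_negatives
    if denom == 0:
        # Class absent from both prediction and ground truth.
        return 0.0, 0.0, 0.0
    sq = sum(matched_ious) / tp if tp > 0 else 0.0  # mean IoU of the matches
    rq = tp / denom                                 # F1-style recognition score
    return sq * rq, sq, rq


# Hypothetical example: three matches, one false positive, one false negative.
pq, sq, rq = panoptic_quality([0.9, 0.8, 0.7], 1, 1)
print(f"PQ={pq:.2f} SQ={sq:.2f} RQ={rq:.2f}")
```

Under this definition the product holds per class. Benchmark tables report each metric averaged over classes, so the averaged columns need not multiply exactly (for example, 70.9 x 51.3 / 100 is not 40.6 in the first row of Table 1).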
Interestingly, Semantic + Instance performs worse than Panoptic FPN. We hypothesize that this is because the number of training images in Cityscapes is low. Thus, the shared feature backbone of Panoptic FPN acts as a regularizer which reduces overfitting compared to training two individual networks without shared features on this dataset.
Next, HPS improves upon Panoptic FPN across all metrics and classes because we optimize for the final panoptic segmentation output. Our system minimizes a panoptic loss in addition to the semantic and instance segmentation losses, which provides better guidance for the network. In this way, we do not rely on the heuristic merging of subtask predictions but directly generate the desired output internally, which results in improved accuracy in practice.
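As a rough illustration of training with a panoptic loss in addition to the two subtask losses, the objective can be pictured as a sum of three terms. This is only a sketch: the function name, the weights, and the example loss values are hypothetical, and the page does not state how the terms are weighted or how each loss is computed.

```python
def total_training_loss(semantic_loss, instance_loss, panoptic_loss,
                        w_semantic=1.0, w_instance=1.0, w_panoptic=1.0):
    """Combine the three loss terms into one scalar objective.

    The weights are illustrative assumptions, not values from the paper.
    """
    return (w_semantic * semantic_loss
            + w_instance * instance_loss
            + w_panoptic * panoptic_loss)


# Hypothetical per-batch loss values.
loss = total_training_loss(semantic_loss=0.5, instance_loss=0.3, panoptic_loss=0.2)
```

Because the panoptic term is part of the same objective, its gradient reaches the shared layers together with the gradients of the two subtask terms, instead of the final output being produced by a separate merging step that receives no training signal.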
Finally, HPS + ISI significantly outperforms all other methods because it additionally leverages inter-task relations. Compared to Panoptic FPN, HPS + ISI improves PQ by 5% relative, from 41.9 to 44.0. Providing instance segmentation predictions as additional feature input for the semantic segmentation branch gives a segmentation prior. By exploiting this prior, the semantic segmentation branch can focus more on the prediction of stuff classes and boundaries between individual classes, which results in improved accuracy across all metrics. Additionally, our architectural advances add only a negligible computational overhead during both training and inference compared to Panoptic FPN.
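The relative improvement quoted above can be checked directly from the PQ column of Table 1. The snippet below is a small sketch; the dictionary and function names are chosen here for illustration, and the numbers are copied from the table.

```python
# Overall PQ values copied from Table 1.
PQ_BY_METHOD = {
    "Semantic + Instance": 40.6,
    "Panoptic FPN": 41.9,
    "HPS": 42.9,
    "HPS + ISI": 44.0,
}


def relative_gain_percent(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


gain = relative_gain_percent(PQ_BY_METHOD["Panoptic FPN"], PQ_BY_METHOD["HPS + ISI"])
print(f"{gain:.1f}%")  # prints 5.0%
```

The absolute gain is 2.1 PQ points, which corresponds to roughly 5% of the Panoptic FPN baseline.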
This quantitative improvement is also reflected qualitatively, as shown in Figure 4. We observe that HPS + ISI handles occlusions more accurately (1st row) and resolves overlapping issues on its own while being less sensitive to speckle noise in semantically coherent regions (2nd row). Thanks to our end-to-end training and inter-task relations, we predict more accurate semantic label transitions (3rd row) and reduce confusion between classes with similar semantic meaning, like bus and car (4th row).
5. Conclusion
Panoptic segmentation is a challenging but important and practically highly relevant problem. Because approaching panoptic segmentation by independently addressing semantic and instance segmentation has several limitations, we propose a single end-to-end trainable network architecture that directly optimizes for the final objective. Moreover, we present a way to share mutual information between the tasks by providing instance segmentation predictions as additional feature input for our semantic segmentation branch. This inter-task link allows us to exploit a segmentation prior and improves the overall panoptic quality. In this way, our work is a first step towards fully entangled panoptic segmentation.
Acknowledgment. This work was partially supported by the Christian Doppler Laboratory for Semantic 3D Computer Vision, funded in part by Qualcomm Inc.
- Title
- Joint Austrian Computer Vision and Robotics Workshop 2020
- Editor
- Graz University of Technology
- Location
- Graz
- Date
- 2020
- Language
- English
- License
- CC BY 4.0
- ISBN
- 978-3-85125-752-6
- Size
- 21.0 x 29.7 cm
- Pages
- 188
- Categories
- Computer Science
- Technology