Page - 75 - in Joint Austrian Computer Vision and Robotics Workshop 2020

Method               PQ    SQ    RQ    PQTh  SQTh  RQTh  PQSt  SQSt  RQSt
Semantic + Instance  40.6  70.9  51.3  40.3  75.4  53.0  40.9  67.6  50.0
Panoptic FPN         41.9  73.7  53.4  43.0  75.2  56.6  41.2  72.5  51.1
HPS                  42.9  74.5  54.3  43.4  75.7  56.7  42.6  73.6  52.5
HPS + ISI            44.0  74.8  55.5  44.4  76.4  57.5  43.7  73.6  54.1

Table 1: Quantitative results on the Cityscapes dataset. The results show that a shared feature backbone reduces overfitting compared to two disjoint networks (Semantic + Instance vs Panoptic FPN). Also, generating the final panoptic output internally and training the system end-to-end increases the performance (Panoptic FPN vs HPS). Finally, using inter-task relations in the form of an initial segmentation image (ISI) provides an effective segmentation prior and increases the overall panoptic quality as well as all other metrics (HPS vs HPS + ISI).

... instance segmentation branches by using an initial segmentation image (ISI), as introduced in Sec. 3.2.

4.2. Results

The results of the four methods described above on the Cityscapes dataset are summarized in Table 1. In addition to the panoptic quality (PQ), we report the segmentation quality (SQ) and the recognition quality (RQ) for all classes, things (Th) classes only, and stuff (St) classes only. Since PQ combines segmentation quality (SQ) and recognition quality (RQ), an improvement in either part will increase the accuracy of the overall system.

Interestingly, Semantic + Instance performs worse than Panoptic FPN. We hypothesize that this is because the number of training images in Cityscapes is low. Thus, the shared feature backbone of Panoptic FPN acts as a regularizer, which reduces overfitting on this dataset compared to training two individual networks without shared features.

Next, HPS improves upon Panoptic FPN across all metrics and classes, because we optimize for the final panoptic segmentation output. Our system minimizes a panoptic loss in addition to the semantic and instance segmentation losses, which provides better guidance for the network.
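As a reference for the metrics in Table 1, the following is a minimal sketch of how PQ, SQ, and RQ are commonly computed for a single class, assuming the standard panoptic quality definition in which predicted and ground-truth segments count as a match only when their IoU exceeds 0.5. The function name `panoptic_quality`, the `PanopticScores` container, and the handling of an empty class are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PanopticScores:
    """Hypothetical container for the three per-class scores."""
    pq: float
    sq: float
    rq: float


def panoptic_quality(tp_ious: Sequence[float], num_fp: int, num_fn: int) -> PanopticScores:
    """Compute PQ, SQ and RQ for one class.

    tp_ious: IoU of every matched (true positive) segment pair.
    num_fp:  number of predicted segments without a match.
    num_fn:  number of ground-truth segments without a match.
    """
    if any(iou <= 0.5 for iou in tp_ious):
        raise ValueError("matched segments must have IoU > 0.5")

    num_tp = len(tp_ious)
    denom = num_tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        # Assumption of this sketch: a class with no segments at all scores zero.
        # Evaluation scripts typically skip such classes instead.
        return PanopticScores(0.0, 0.0, 0.0)

    # SQ: mean IoU over matched segments (how well matches are segmented).
    sq = sum(tp_ious) / num_tp if num_tp else 0.0
    # RQ: F1-style score over segments (how many segments are found).
    rq = num_tp / denom
    # PQ is the product of both terms.
    return PanopticScores(pq=sq * rq, sq=sq, rq=rq)


# Three matches, one spurious prediction, one missed ground-truth segment.
example = panoptic_quality([0.9, 0.8, 0.7], num_fp=1, num_fn=1)
print(f"PQ={example.pq:.2f} SQ={example.sq:.2f} RQ={example.rq:.2f}")
```

Note that benchmark scores are usually averaged over classes after this per-class computation, so a reported PQ need not equal the product of the reported SQ and RQ. This would be consistent with Table 1, where for Semantic + Instance 70.9 × 51.3 / 100 ≈ 36.4 rather than 40.6.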
In this way, we do not rely on the heuristic merging of subtask predictions but directly generate the desired output internally, which results in improved accuracy in practice.

Finally, HPS + ISI significantly outperforms all other methods because it additionally leverages inter-task relations. Compared to Panoptic FPN, HPS + ISI improves PQ by +5% relative, from 41.9 to 44.0. Providing instance segmentation predictions as additional feature input for the semantic segmentation branch gives a segmentation prior. By exploiting this prior, the semantic segmentation branch can focus more on the prediction of stuff classes and boundaries between individual classes, which results in improved accuracy across all metrics. Additionally, our architectural advances only add a negligible computational overhead during both training and inference compared to Panoptic FPN.

This quantitative improvement is also reflected qualitatively, as shown in Figure 4. We observe that HPS + ISI handles occlusions more accurately (1st row) and resolves overlapping issues on its own while being less sensitive to speckle noise in semantically coherent regions (2nd row). Thanks to our end-to-end training and inter-task relations, we predict more accurate semantic label transitions (3rd row) and reduce confusion between classes with similar semantic meaning like bus and car (4th row).

5. Conclusion

Panoptic segmentation is a challenging but important problem of high practical relevance. As approaching panoptic segmentation by independently addressing semantic and instance segmentation has several limitations, we propose a single end-to-end trainable network architecture that directly optimizes for the final objective. Moreover, we present a way to share mutual information between the tasks by providing instance segmentation predictions as additional feature input for our semantic segmentation branch. This inter-task link allows us to exploit a segmentation prior and improves the overall panoptic quality.
In this way, our work is a first step towards fully entangled panoptic segmentation.

Acknowledgment. This work was partially supported by the Christian Doppler Laboratory for Semantic 3D Computer Vision, funded in part by Qualcomm Inc.


Title: Joint Austrian Computer Vision and Robotics Workshop 2020
Editor: Graz University of Technology
Location: Graz
Date: 2020
Language: English
License: CC BY 4.0
ISBN: 978-3-85125-752-6
Size: 21.0 x 29.7 cm
Pages: 188
Categories: Informatik; Technik