Page 101 in Joint Austrian Computer Vision and Robotics Workshop 2020
Figure 4: The problem of multiple detections. Ground truth is shown in green. Left: the state of the art yields two bounding boxes for the same, single person. Middle: two persons are visible. Detection yields two bounding boxes which are difficult to associate. Right: an even harder case with three persons.
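The left case in Figure 4 can be illustrated with a small matching sketch. It assumes a common evaluation convention that this page does not confirm for the paper: greedy one-to-one matching of detections to ground truth at an IoU threshold of 0.5. The function names and all coordinates are hypothetical. Under this convention, the second box on a single person counts as a false positive even though it lies on the person.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(ground_truth, detections, threshold=0.5):
    """Greedy one-to-one matching (assumed protocol, not taken from the paper).

    Detections are assumed to be sorted by descending confidence.
    Returns (true_positives, false_positives, false_negatives).
    """
    unmatched = list(range(len(ground_truth)))
    tp = fp = 0
    for det in detections:
        # Pick the still-unmatched ground-truth box with the highest overlap.
        best = max(unmatched, key=lambda i: iou(ground_truth[i], det), default=None)
        if best is not None and iou(ground_truth[best], det) >= threshold:
            unmatched.remove(best)
            tp += 1
        else:
            fp += 1
    return tp, fp, len(unmatched)


# Hypothetical coordinates: one person, reported twice by the detector.
person = [(10, 10, 50, 110)]
two_boxes = [(10, 10, 50, 100), (12, 30, 50, 110)]
print(match_detections(person, two_boxes))  # (1, 1, 0)
```

With two or three persons and fragmented boxes, as in the middle and right cases, it is no longer clear which ground-truth box a detection should be compared against, which is the association problem the caption describes.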
Figure 5: This ROC plot shows results of Faster R-CNN (green), YOLO (blue), Mask R-CNN (red) and our method (purple) for all occlusion levels.
but also on the evaluation and on the detector which
we leave open for future research.
5. Conclusion
This paper formulates a new scientific question on object detection under fragmented occlusion, which differs from partial occlusion. We show in a study that current object detectors fail in this case. We generated and labelled a new dataset showing people behind trees in a forestry environment. Such scenes occur frequently in border surveillance, which has become very important in EU security policies. We try to tackle the occlusion challenge by augmenting Microsoft COCO, including its pixel-wise segmentation masks, to capture the occlusion problem. We show that Mask R-CNN trained on this data improves on fragmented occlusion. However, we also observe a severe loss of spatial, structural information, and that the bounding box itself is not an appropriate description for coping with fragmented occlusion. This has severe implications for the detection approach itself, but also for dataset labelling and evaluation. A potential solution is left open for future work.
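The remark about the bounding box can be made concrete with a toy example. The sketch below is not the authors' augmentation procedure, whose details are not given on this page. It assumes a binary person mask and a hypothetical occluder made of regular vertical strips standing in for tree trunks. All function names and parameters are illustrative.

```python
def bounding_box(mask):
    """Tight inclusive box (x1, y1, x2, y2) around nonzero pixels, or None."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    if not ys:
        return None
    return min(xs), min(ys), max(xs), max(ys)


def occlude_with_strips(mask, strip_width, gap):
    """Zero out vertical strips (hypothetical stand-in for tree trunks)."""
    period = strip_width + gap
    return [[v if (x % period) >= strip_width else 0 for x, v in enumerate(row)]
            for row in mask]


def visible_fraction(original, occluded):
    """Share of the original object pixels that remain visible."""
    total = sum(map(sum, original))
    return sum(map(sum, occluded)) / total if total else 0.0


def box_fill(mask):
    """Share of the tight bounding box that is covered by object pixels."""
    box = bounding_box(mask)
    if box is None:
        return 0.0
    x1, y1, x2, y2 = box
    return sum(map(sum, mask)) / ((x2 - x1 + 1) * (y2 - y1 + 1))


# Toy 12x8 "person" mask: a filled rectangle over columns 2..9, rows 1..6.
person = [[1 if 2 <= x <= 9 and 1 <= y <= 6 else 0 for x in range(12)]
          for y in range(8)]
occluded = occlude_with_strips(person, strip_width=3, gap=1)

print(bounding_box(person))                # (2, 1, 9, 6)
print(bounding_box(occluded))              # (3, 1, 7, 6)
print(visible_fraction(person, occluded))  # 0.25
print(box_fill(occluded))                  # 0.4
```

In this toy case only a quarter of the person remains visible, yet the box around the visible fragments still spans most of the original extent and consists mostly of occluder pixels. A box alone therefore says little about what is actually visible.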
Acknowledgments
This research was supported by the European Union H2020 programme under grant agreement FOLDOUT-787021. We thank all our students on internship for labelling the new dataset.
References
[1] M. Black and P. Anandan. The robust estimation of multiple motions: Parametric and piecewise-smooth flow fields. Computer Vision and Image Understanding, 63:75–104, Jan. 1996.
[2] R. Girshick. Fast R-CNN. In CVPR, pages 1440–1448, 2015.
[3] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, pages 580–587, 2014.
[4] K. He, G. Gkioxari, P. Dollár, and R. Girshick. Mask R-CNN. In CVPR, pages 2961–2969, 2017.
[5] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.
[6] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. SSD: Single shot multibox detector. In ECCV, pages 21–37. Springer, 2016.
[7] G. Nebehay and R. Pflugfelder. Clustering of static-adaptive correspondences for deformable object tracking. In CVPR, June 2015.
[8] J. Redmon and A. Farhadi. YOLOv3: An incremental improvement. arXiv, 2018.
[9] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS, pages 91–99, 2015.
[10] S. Ullman, L. Assif, E. Fetaya, and D. Harari. Atoms of recognition in human and computer vision. PNAS, 113(10):2744–2749, 2016.
- Title: Joint Austrian Computer Vision and Robotics Workshop 2020
- Publisher: Graz University of Technology
- Place: Graz
- Date: 2020
- Language: English
- License: CC BY 4.0
- ISBN: 978-3-85125-752-6
- Dimensions: 21.0 x 29.7 cm
- Pages: 188
- Categories: Computer Science, Technology