Fig. 3. Sample images and segmentation masks generated by the GAN
trained on the full dataset when training is stopped too early
is trained without generating segmentation masks. We also
experience a mild form of mode collapse [3], as some of the generated images look very similar. While the images obtained by the fully trained GAN shown in Fig. 2 are of high quality, Fig. 3 illustrates that, if the training time for the GAN is too short, the generated images are unusable for later supervised training, as their quality is too low.
Finding a suitable stopping point for GAN training is still an active topic of current research, as a lower GAN loss during training typically does not indicate higher quality of the generated images. However, recent modifications to the GAN learning process show that it is possible to correlate the GAN loss with image quality [2], which makes it possible to stop GAN training once the loss falls below a certain threshold.
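To make this concrete, the following minimal sketch (our illustration, not the training procedure used in this work) monitors a smoothed Wasserstein-style critic loss and stops once it stays below a threshold; the train_step callable, the threshold, and the smoothing constant are hypothetical placeholders.

# Minimal sketch of a loss-based stopping rule, assuming a Wasserstein-style
# critic whose loss estimate correlates with sample quality [2].
def train_with_early_stopping(train_step, threshold=0.05,
                              patience=10, max_steps=100_000):
    """Stop once the smoothed critic loss stays below `threshold`
    for `patience` consecutive steps."""
    ema = None  # exponential moving average smooths the noisy estimate
    below = 0
    for step in range(max_steps):
        critic_loss = train_step()  # placeholder: one generator/critic update
        ema = critic_loss if ema is None else 0.99 * ema + 0.01 * critic_loss
        below = below + 1 if abs(ema) < threshold else 0
        if below >= patience:
            print(f"stopping at step {step}: smoothed critic loss {ema:.4f}")
            break
    return ema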
The results of the quantitative evaluation on the full dataset
shown in Table I indicate that the GAN images are not
sufficient to replace the real images in this case. Using
a combination of real and synthetic images to train our
segmentation network, the Dice score and Hausdorff distance
results are comparable to the results obtained by training on
real images only. When only synthetic images obtained by
the GAN are used to train the segmentation network, the
performance is worse. For the reduced dataset evaluation,
the results shown in Table II are not as conclusive. The
network with the best Dice score was trained exclusively
on real images, while the network with the lowest Hausdorff
distance was trained on a combination of real and synthetic
images. A very interesting point, however, is that for the
reduced dataset, the network trained exclusively on generated GAN images performed almost as well as the network trained
on real images, showing significant potential of GANs for
training data generation. It is also worth mentioning that the
U-Net trained exclusively on generated GAN images from
the reduced dataset performed better than the U-Net trained
exclusively on generated GAN images from the full dataset.
We suspect that this is because the GAN converges more easily to generating high-quality images for the reduced dataset than for the full dataset, resulting in higher quality of the generated images.
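For reference, the Dice score and Hausdorff distance reported in Tables I and II can be computed as in the following generic pixel-level sketch of their standard definitions; the evaluation in this work may differ in details such as physical voxel spacing or percentile variants of the Hausdorff distance.

import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_score(pred, gt):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom else 1.0

def hausdorff_distance(pred, gt):
    """Symmetric Hausdorff distance between the foreground pixel sets,
    in pixel units (no spacing information)."""
    p, g = np.argwhere(pred), np.argwhere(gt)
    return max(directed_hausdorff(p, g)[0], directed_hausdorff(g, p)[0])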
The quantitative results still leave room for improvement. As a further outlook, it would be interesting to incorporate data augmentation into the GAN by using elastic deformations to induce variance in the GAN’s training data, which may lead to a greater variety of generated GAN images; a sketch of such a deformation step is given below.
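The following minimal sketch follows the classic random-displacement-field approach to elastic deformation; the alpha and sigma parameters are illustrative placeholders, and the same warp is applied to image and mask so the training pair stays consistent.

import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_deform(image, mask, alpha=34.0, sigma=4.0, rng=None):
    """Apply one random elastic deformation to a 2-D image and its
    segmentation mask. `alpha` scales the displacement field,
    `sigma` controls its smoothness; both are illustrative defaults."""
    rng = rng if rng is not None else np.random.default_rng()
    shape = image.shape
    # Smooth random displacement fields for the row and column directions.
    dy = gaussian_filter(rng.uniform(-1, 1, shape), sigma) * alpha
    dx = gaussian_filter(rng.uniform(-1, 1, shape), sigma) * alpha
    rows, cols = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]),
                             indexing="ij")
    coords = np.array([rows + dy, cols + dx])
    # Bilinear interpolation for the image, nearest-neighbour for the mask
    # so that segmentation labels are preserved.
    warped_image = map_coordinates(image, coords, order=1, mode="reflect")
    warped_mask = map_coordinates(mask, coords, order=0, mode="reflect")
    return warped_image, warped_mask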
Overall, we demonstrated that GANs have significant potential for the synthesis of medical training data for supervised tasks by learning to generate segmentation masks in addition to artificial image data.
REFERENCES
[1] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin,
S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg,
R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker,
V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng,
“TensorFlow: A System for Large-scale Machine Learning,” in
Proceedings of the 12th USENIX Conference on Operating Systems
Design and Implementation, ser. OSDI’16. Berkeley, CA, USA:
USENIX Association, 2016, pp. 265–283. [Online]. Available:
http://dl.acm.org/citation.cfm?id=3026877.3026899
[2] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein GAN,” ArXiv
e-prints, Jan. 2017.
[3] I. Goodfellow, “NIPS 2016 Tutorial: Generative Adversarial Net-
works,” ArXiv e-prints, Dec. 2016.
[4] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley,
S. Ozair, A. Courville, and Y. Bengio, “Generative Adversarial Nets,”
in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
[5] K. He, X. Zhang, S. Ren, and J. Sun, “Delving Deep into Rectifiers:
Surpassing Human-Level Performance on ImageNet Classification,”
in Proceedings of the 2015 IEEE International Conference on
Computer Vision (ICCV), ser. ICCV ’15. Washington, DC, USA:
IEEE Computer Society, 2015, pp. 1026–1034. [Online]. Available:
http://dx.doi.org/10.1109/ICCV.2015.123
[6] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-Image
Translation with Conditional Adversarial Networks,” ArXiv e-prints,
Nov. 2016.
[7] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick,
S. Guadarrama, and T. Darrell, “Caffe: Convolutional Architecture
for Fast Feature Embedding,” in Proceedings of the 22Nd ACM
International Conference on Multimedia, ser. MM ’14. New
York, NY, USA: ACM, 2014, pp. 675–678. [Online]. Available:
http://doi.acm.org/10.1145/2647868.2654889
[8] K. Kamnitsas, C. Baumgartner, C. Ledig, V. F. Newcombe,
J. P. Simpson, A. D. Kane, D. K. Menon, A. Nori,
A. Criminisi, D. Rueckert, and B. Glocker, “Unsupervised
domain adaptation in brain lesion segmentation with adversarial
networks,” in Information Processing in Medical Imaging (IPMI),
June 2017. [Online]. Available: https://www.microsoft.com/en-us/research/publication/unsupervised-domain-adaptation-brain-lesion-segmentation-adversarial-networks/
[9] D. P. Kingma and J. Ba, “Adam: A Method for
Stochastic Optimization,” in International Conference on Learning
Representations, vol. abs/1412.6980, 2015. [Online]. Available:
http://arxiv.org/abs/1412.6980
[10] A. Krizhevsky and G. Hinton, “Learning multiple layers of features
from tiny images,” 2009.
[11] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet
Classification with Deep Convolutional Neural Networks,” in
Advances in Neural Information Processing Systems 25,
F. Pereira, C. J. C. Burges, L. Bottou, and K. Q.
Weinberger, Eds. Curran Associates, Inc., 2012, pp. 1097–
1105. [Online]. Available: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
[12] Y. LeCun, C. Cortes, and C. J. Burges, “The MNIST database of
handwritten digits,” 1998.
[13] A. M. Mharib, A. R. Ramli, S. Mashohor, and R. B. Mahmood,
“Survey on liver CT image segmentation methods,” Artificial
Intelligence Review, vol. 37, no. 2, pp. 83–95, 2012. [Online].
Available: http://dx.doi.org/10.1007/s10462-011-9220-3
[14] Y. Nesterov, “A method of solving a convex programming problem
with convergence rate O(1/k²),” in Soviet Mathematics Doklady,
vol. 27, 1983, pp. 372–376.
[15] M. A. Nielsen, Neural Networks and Deep Learning. Determination
Press, 2015, http://neuralnetworksanddeeplearning.com.
[16] S. J. Pan and Q. Yang, “A Survey on Transfer Learning,” IEEE
Transactions on Knowledge and Data Engineering, vol. 22, no. 10,
pp. 1345–1359, Oct 2010.