Fig. 3. Sample images and segmentation masks generated by the GAN
trained on the full dataset when training is stopped too early
is trained without generating segmentation masks. We also
experience a mild form of mode collapse [3], as some of the generated images look very similar. While the images obtained by the fully trained GAN shown in Fig. 2 are of high quality, Fig. 3 illustrates that, if the training time for the GAN is too short, the generated images are unusable for later supervised training, as their quality is too low.
Finding a suitable stopping point for GAN training is still an active topic of current research, as a lower GAN loss during training typically does not indicate higher quality of the generated images. However, recent modifications to the GAN learning process show that it is possible to correlate the GAN loss with image quality [2], which makes it possible to stop GAN training once the loss falls below a certain threshold.
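To make this concrete, the following minimal sketch (our illustration, not the training procedure used in this work) monitors a smoothed Wasserstein-style critic loss and stops once it stays below a threshold; the train_step callable, the threshold, and the smoothing constant are hypothetical placeholders.

# Minimal sketch of a loss-based stopping rule, assuming a Wasserstein-style
# critic whose loss estimate correlates with sample quality [2].
def train_with_early_stopping(train_step, threshold=0.05,
                              patience=10, max_steps=100_000):
    """Stop once the smoothed critic loss stays below `threshold`
    for `patience` consecutive steps."""
    ema = None  # exponential moving average smooths the noisy estimate
    below = 0
    for step in range(max_steps):
        critic_loss = train_step()  # placeholder: one generator/critic update
        ema = critic_loss if ema is None else 0.99 * ema + 0.01 * critic_loss
        below = below + 1 if abs(ema) < threshold else 0
        if below >= patience:
            print(f"stopping at step {step}: smoothed critic loss {ema:.4f}")
            break
    return ema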
The results of the quantitative evaluation on the full dataset
shown in Table I indicate that the GAN images are not
sufficient to replace the real images in this case. Using
a combination of real and synthetic images to train our
segmentation network, the Dice score and Hausdorff distance
results are comparable to the results obtained by training on
real images only. When only synthetic images obtained by
the GAN are used to train the segmentation network, the
performance is worse. For the reduced dataset evaluation,
the results shown in Table II are not as conclusive. The
network with the best Dice score was trained exclusively
on real images, while the network with the lowest Hausdorff
distance was trained on a combination of real and synthetic
images. A very interesting point, however, is that for the
reduced dataset, the network trained exclusively on generated GAN images performed almost as well as the network trained
on real images, showing significant potential of GANs for
training data generation. It is also worth mentioning that the
U-Net trained exclusively on generated GAN images from
the reduced dataset performed better than the U-Net trained
exclusively on generated GAN images from the full dataset.
We suspect that this is because the GAN converges more easily to generating high-quality images for the reduced dataset than for the full dataset, resulting in higher quality of the generated images.
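For reference, the Dice score and Hausdorff distance reported in Tables I and II can be computed as in the following generic pixel-level sketch of their standard definitions; the evaluation in this work may differ in details such as physical voxel spacing or percentile variants of the Hausdorff distance.

import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_score(pred, gt):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom else 1.0

def hausdorff_distance(pred, gt):
    """Symmetric Hausdorff distance between the foreground pixel sets,
    in pixel units (no spacing information)."""
    p, g = np.argwhere(pred), np.argwhere(gt)
    return max(directed_hausdorff(p, g)[0], directed_hausdorff(g, p)[0])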
The quantitative results still leave room for improvement. As a further outlook, it would be interesting to incorporate data augmentation into the GAN by using elastic deformations to induce variance in the GAN’s training data, which may lead to a greater variety of generated GAN images; a sketch of such a deformation step is given below.
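The following minimal sketch follows the classic random-displacement-field approach to elastic deformation; the alpha and sigma parameters are illustrative placeholders, and the same warp is applied to image and mask so the training pair stays consistent.

import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_deform(image, mask, alpha=34.0, sigma=4.0, rng=None):
    """Apply one random elastic deformation to a 2-D image and its
    segmentation mask. `alpha` scales the displacement field,
    `sigma` controls its smoothness; both are illustrative defaults."""
    rng = rng if rng is not None else np.random.default_rng()
    shape = image.shape
    # Smooth random displacement fields for the row and column directions.
    dy = gaussian_filter(rng.uniform(-1, 1, shape), sigma) * alpha
    dx = gaussian_filter(rng.uniform(-1, 1, shape), sigma) * alpha
    rows, cols = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]),
                             indexing="ij")
    coords = np.array([rows + dy, cols + dx])
    # Bilinear interpolation for the image, nearest-neighbour for the mask
    # so that segmentation labels are preserved.
    warped_image = map_coordinates(image, coords, order=1, mode="reflect")
    warped_mask = map_coordinates(mask, coords, order=0, mode="reflect")
    return warped_image, warped_mask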
Overall, we demonstrated that GANs have significant potential for the synthesis of medical training data for supervised tasks by learning to generate segmentation masks in addition to artificial image data.
REFERENCES
[1] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin,
S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg,
R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker,
V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng,
“TensorFlow: A System for Large-scale Machine Learning,” in
Proceedings of the 12th USENIX Conference on Operating Systems
Design and Implementation, ser. OSDI’16. Berkeley, CA, USA:
USENIX Association, 2016, pp. 265–283. [Online]. Available:
http://dl.acm.org/citation.cfm?id=3026877.3026899
[2] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein GAN,” ArXiv
e-prints, Jan. 2017.
[3] I. Goodfellow, “NIPS 2016 Tutorial: Generative Adversarial Net-
works,” ArXiv e-prints, Dec. 2016.
[4] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley,
S. Ozair, A. Courville, and Y. Bengio, “Generative Adversarial Nets,”
in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
[5] K. He, X. Zhang, S. Ren, and J. Sun, “Delving Deep into Rectifiers:
Surpassing Human-Level Performance on ImageNet Classification,”
in Proceedings of the 2015 IEEE International Conference on
Computer Vision (ICCV), ser. ICCV ’15. Washington, DC, USA:
IEEE Computer Society, 2015, pp. 1026–1034. [Online]. Available:
http://dx.doi.org/10.1109/ICCV.2015.123
[6] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-Image
Translation with Conditional Adversarial Networks,” ArXiv e-prints,
Nov. 2016.
[7] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick,
S. Guadarrama, and T. Darrell, “Caffe: Convolutional Architecture
for Fast Feature Embedding,” in Proceedings of the 22Nd ACM
International Conference on Multimedia, ser. MM ’14. New
York, NY, USA: ACM, 2014, pp. 675–678. [Online]. Available:
http://doi.acm.org/10.1145/2647868.2654889
[8] K. Kamnitsas, C. Baumgartner, C. Ledig, V. F. Newcombe,
J. P. Simpson, A. D. Kane, D. K. Menon, A. Nori,
A. Criminisi, D. Rueckert, and B. Glocker, “Unsupervised
domain adaptation in brain lesion segmentation with adversarial
networks,” in Information Processing in Medical Imaging (IPMI),
June 2017. [Online]. Available: https://www.microsoft.com/en-us/research/publication/unsupervised-domain-adaptation-brain-lesion-segmentation-adversarial-networks/
[9] D. P. Kingma and J. Ba, “Adam: A Method for
Stochastic Optimization,” in International Conference on Learning
Representations, vol. abs/1412.6980, 2015. [Online]. Available:
http://arxiv.org/abs/1412.6980
[10] A. Krizhevsky and G. Hinton, “Learning multiple layers of features
from tiny images,” 2009.
[11] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet
Classification with Deep Convolutional Neural Networks,” in
Advances in Neural Information Processing Systems 25,
F. Pereira, C. J. C. Burges, L. Bottou, and K. Q.
Weinberger, Eds. Curran Associates, Inc., 2012, pp. 1097–
1105. [Online]. Available: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
[12] Y. LeCun, C. Cortes, and C. J. Burges, “The MNIST database of
handwritten digits,” 1998.
[13] A. M. Mharib, A. R. Ramli, S. Mashohor, and R. B. Mahmood,
“Survey on liver CT image segmentation methods,” Artificial
Intelligence Review, vol. 37, no. 2, pp. 83–95, 2012. [Online].
Available: http://dx.doi.org/10.1007/s10462-011-9220-3
[14] Y. Nesterov, “A method of solving a convex programming problem
with convergence rate O(1/k²),” in Soviet Mathematics Doklady,
vol. 27, 1983, pp. 372–376.
[15] M. A. Nielsen, Neural Networks and Deep Learning. Determination
Press, 2015, http://neuralnetworksanddeeplearning.com.
[16] S. J. Pan and Q. Yang, “A Survey on Transfer Learning,” IEEE
Transactions on Knowledge and Data Engineering, vol. 22, no. 10,
pp. 1345–1359, Oct 2010.