Proceedings of the OAGM&ARW Joint Workshop - Vision, Automation and Robotics
Page 141
Fig. 1. Proposed GAN architecture incorporating the segmentation mask in the real and synthetic image batches.

… are especially useful for biomedical segmentation, as they can provide realistic variations of the input data, similar to natural variations.

B. Transfer Learning

Transfer learning aims to improve the learning of a target task in a target domain, given the learned knowledge of a source task in a source domain [16]. Applied to neural networks, it describes the process of training a source network on a source dataset, followed by transferring the learned features to train a different target network on a target dataset [28]. In the context of small datasets, this can be applied in different ways. It is possible to train on a large dataset, e.g. ImageNet, remove the final layer of the network architecture and fine-tune to a smaller target dataset [19]. A different approach is taken by using Autoencoders, which compress a given image to a vector representation and reconstruct the image from this compressed representation. As an example, denoising Autoencoders [27] have been used to extract robust features with great success. However, transferring Autoencoder features typically requires a target network architecture very similar to the source architecture, which is rarely the case.

C. Image Generation

A novel approach to tackle the issue of small datasets for training deep learning methods is to synthesize new training data via image generation methods. Recent research has shown that it is possible to render realistic images using 3D models to alleviate the problem of small datasets [22]. This has the advantage of being able to create an unlimited amount of training data of various scenarios, as long as the images are realistic enough. Rendered images have also recently been used to improve the performance of anatomical landmark detection in medical applications by learning on a dataset of rendered 3D models and fine-tuning on medical data [20]. The disadvantage of using rendered images is that the virtual model and scene parameters need to be explicitly defined and tuned towards the application, which is time consuming.

Generative Adversarial Networks [4] represent a different approach to image generation. A generator and a discriminator network are trained to compete against each other. The goal of the discriminator is to decide if any given image is real or synthetic. The generator generates synthetic images in the hope of fooling the discriminator. Since the generator never directly sees the training data and only receives its gradients from the discriminator decision, GANs are also resistant to overfitting [3]. However, the training process of GANs is very sensitive to changes in hyperparameters. The problem of finding the Nash Equilibrium between the generator and the discriminator generally leads to an unstable training process, but recent architectures such as DCGAN [18] and WassersteinGAN [2] improved on this substantially.

III. METHOD AND ARCHITECTURE

Standard GANs either exclusively learn to generate images [4], or learn to perform image transformations [6]. However, in order to use the generated images for other supervised deep learning tasks, like image segmentation, it is also necessary to have a ground-truth solution for any given input image.
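As a reference point for the modification proposed below, the following minimal PyTorch sketch illustrates the plain adversarial setup just described: a convolutional generator maps noise to synthetic images, and a convolutional discriminator classifies images as real or synthetic. The 64x64 single-channel resolution, layer sizes and optimizer settings are illustrative assumptions, not the configuration used in this work.

```python
# Minimal sketch of adversarial training between a convolutional generator
# and discriminator (DCGAN-style). All sizes and hyperparameters here are
# illustrative assumptions.
import torch
import torch.nn as nn

LATENT_DIM = 100  # assumed noise dimension

class Generator(nn.Module):
    """Maps a noise vector to a synthetic 64x64 single-channel image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(LATENT_DIM, 128, 4, 1, 0), nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), LATENT_DIM, 1, 1))

class Discriminator(nn.Module):
    """Outputs a real/synthetic logit for a 64x64 single-channel image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(16, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 8, 1, 0),  # 8x8 feature map -> single logit
        )

    def forward(self, x):
        return self.net(x).view(-1)

def train_step(gen, disc, opt_g, opt_d, real_batch):
    """One adversarial update: discriminator first, then generator."""
    bce = nn.BCEWithLogitsLoss()
    b = real_batch.size(0)
    z = torch.randn(b, LATENT_DIM)

    # Discriminator: real images are labelled 1, synthetic images 0.
    opt_d.zero_grad()
    loss_d = bce(disc(real_batch), torch.ones(b)) + \
             bce(disc(gen(z).detach()), torch.zeros(b))
    loss_d.backward()
    opt_d.step()

    # Generator: try to fool the discriminator. Its gradient comes only
    # from the discriminator decision, never from the training data.
    opt_g.zero_grad()
    loss_g = bce(disc(gen(z)), torch.ones(b))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()

if __name__ == "__main__":
    gen, disc = Generator(), Discriminator()
    opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4, betas=(0.5, 0.999))
    dummy_real = torch.randn(8, 1, 64, 64)  # stands in for a real image batch
    print(train_step(gen, disc, opt_g, opt_d, dummy_real))
```

Note that the generator loss in this sketch depends only on the discriminator's decision on generated images, which is the property referred to above as making GANs resistant to overfitting.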
We propose a modification to the standard GAN architecture, which forces the generator to create segmentation masks in addition to the generated images. The discriminator then has to decide whether an observed image-segmentation-pair is real or synthetic. This forces both the discriminator and generator to implicitly learn about the structure of the ground-truth, making the resulting generated data useful for training in a supervised setup. While it is known that using ground-truth labels in the discriminator improves the image quality [24], this is the first time, to our knowledge, that the ground-truth is used to directly generate new image-segmentation-pairs. Fig. 1 illustrates this architecture.

As the foundation for our proposed architecture, we use the DCGAN [18] architecture, which has been shown to achieve good results while having increased training stability in many different applications, compared to previous GAN architectures. DCGAN uses a convolutional generator and discriminator, makes use of batch normalization, and replaces …
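The following sketch, building on the networks in the previous example, indicates how such a modification could be realised: the generator emits an image together with a segmentation mask, and the discriminator receives the concatenated image-mask pair. The single mask channel and all layer choices are assumptions for illustration, not the exact architecture described in this work.

```python
# Sketch of the proposed idea: the generator produces an
# image-segmentation-pair and the discriminator judges such pairs.
# The single mask channel and all layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

LATENT_DIM = 100  # assumed noise dimension

class PairGenerator(nn.Module):
    """Maps noise to two 64x64 channels: channel 0 = image, channel 1 = mask."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(LATENT_DIM, 128, 4, 1, 0), nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.ConvTranspose2d(16, 2, 4, 2, 1), nn.Tanh(),  # image + mask channel
        )

    def forward(self, z):
        out = self.net(z.view(z.size(0), LATENT_DIM, 1, 1))
        image, mask = out[:, 0:1], out[:, 1:2]
        return image, mask

class PairDiscriminator(nn.Module):
    """Decides whether an observed image-segmentation-pair is real or synthetic."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, 4, 2, 1), nn.LeakyReLU(0.2),  # 2 inputs: image + mask
            nn.Conv2d(16, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 8, 1, 0),
        )

    def forward(self, image, mask):
        return self.net(torch.cat([image, mask], dim=1)).view(-1)

# The adversarial loop stays the same as in the previous sketch, except that
# real batches are (image, ground-truth mask) pairs and synthetic batches are
# the generator's (image, mask) outputs, e.g.:
#   image, mask = gen(z)
#   loss_d = bce(disc(real_img, real_mask), ones) \
#          + bce(disc(image.detach(), mask.detach()), zeros)
```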
Title: Proceedings of the OAGM&ARW Joint Workshop
Subtitle: Vision, Automation and Robotics
Authors: Peter M. Roth, Markus Vincze, Wilfried Kubinger, Andreas Müller, Bernhard Blaschitz, Svorad Stolc
Publisher: Verlag der Technischen Universität Graz
Place: Wien
Date: 2017
Language: English
License: CC BY 4.0
ISBN: 978-3-85125-524-9
Dimensions: 21.0 x 29.7 cm
Pages: 188
Keywords: Conference proceedings (Tagungsband)
Categories: International, Conference proceedings (Tagungsbände)

Table of Contents

  1. Preface v
  2. Workshop Organization vi
  3. Program Committee OAGM vii
  4. Program Committee ARW viii
  5. Awards 2016 ix
  6. Index of Authors x
  7. Keynote Talks
  8. Austrian Robotics Workshop 4
  9. OAGM Workshop 86