Page 112 in Joint Austrian Computer Vision and Robotics Workshop 2020

Figure 4: View synthesis from SO(3) transformations unseen during training time. First row: reconstructed Lamp with varying azimuth from -43° to 43°. Second row: reconstructed Glue with elevation variation from -43° to 43°. Rows three to five: objects Benchvise, Camera and Cat reconstructed with azimuth/elevation range from (-43°, -43°) to (43°, 43°). Object poses outside the green box are samples out of the training distribution. Centered images, in the red box, mark the canonical poses.

[Figure 5 plot: MAE, MSE and DSSIM error curves over angle in degrees; x-axis from 0° to 175°, y-axis from 0.02 to 0.16.]

Figure 5: Error values and their variance over azimuth angle. The network was trained on its corresponding loss function with a spatial bottleneck dimension of 8×8×128. The vertical line shows the training set range.

…some of the synthesized views outside of the training range, it is visible that views can be predicted properly based on SO(3) transformations.

Figure 5 provides the reconstruction error and variance over an extended azimuth and elevation angle range of [0°, 180°]. The results in the figure are averaged over all objects. The training dataset contains images with azimuth angles up to 37°. A sharp rise in error and variance is observed at an azimuth angle of approximately 45°; for angles above this value, error and variance increase rapidly. As such, the network cannot properly reconstruct these views.

These results show that our formulation for creating equivariant feature spaces has the desired property of correlating spatial transformations with 2D views of the transformed object. Thus, the proposed Trilinear interpolation layer guides the network towards learning an equivariant feature space in SO(3).

5. Conclusion

We extend recent work on learning equivariant feature spaces for synthesizing object views in SO(3). The proposed extension of the Spatial Transformer Network [6], which we call the Trilinear interpolation layer, applies SO(3) transformations to feature maps derived from 2D data. The validity of the approach is demonstrated by training a simple encoder-decoder network architecture. Our experiments show that our formulation enables the prediction of views not only unseen during training time but also in a small range outside the training distribution.

The current formulation enables control over 5 DoF: SO(3) rotations and translations in image space. Future work will tackle adapting the proposed layer to create object view synthesis in all of SE(3). We then plan to integrate this into a pose refinement strategy to improve object pose estimation.
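The conclusion above hinges on applying an SO(3) transformation directly to a feature volume via trilinear interpolation. As a minimal sketch (not the authors' implementation), the following Python code shows one way such a layer can be realized with PyTorch's affine_grid and grid_sample, which perform trilinear resampling on 5D inputs; the function names and the 128-channel 8x8x8 volume shape are illustrative assumptions.

    import math
    import torch
    import torch.nn.functional as F

    def rotation_matrix_z(angle_rad: float) -> torch.Tensor:
        """3x3 rotation about the z-axis; any matrix in SO(3) works the same way."""
        c, s = math.cos(angle_rad), math.sin(angle_rad)
        return torch.tensor([[c, -s, 0.0],
                             [s,  c, 0.0],
                             [0.0, 0.0, 1.0]])

    def rotate_feature_volume(volume: torch.Tensor, R: torch.Tensor) -> torch.Tensor:
        """Resample an (N, C, D, H, W) feature volume under a rotation R in SO(3).

        affine_grid turns the 3x4 matrix [R | 0] into normalized sampling
        coordinates; grid_sample with mode='bilinear' on a 5D input performs
        trilinear interpolation, keeping the operation differentiable.
        """
        n = volume.shape[0]
        theta = torch.cat([R, torch.zeros(3, 1)], dim=1)   # [R | t] with t = 0
        theta = theta.unsqueeze(0).expand(n, -1, -1)       # one matrix per sample
        grid = F.affine_grid(theta, list(volume.shape), align_corners=False)
        return F.grid_sample(volume, grid, mode='bilinear',
                             padding_mode='zeros', align_corners=False)

    # Example: rotate a hypothetical 128-channel 8x8x8 bottleneck volume by 30 deg.
    feats = torch.randn(1, 128, 8, 8, 8)
    rotated = rotate_feature_volume(feats, rotation_matrix_z(math.pi / 6))
    print(rotated.shape)  # torch.Size([1, 128, 8, 8, 8])

In this sketch the rotation enters only through the sampling grid, so gradients flow through the interpolation weights back to the feature volume, which is what allows an encoder-decoder to learn an equivariant bottleneck.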
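Similarly, Figure 5 reports MAE, MSE and DSSIM between synthesized and ground-truth views. As a reference point, here is a hedged sketch of how such per-view errors could be computed: the helper view_errors is hypothetical, scikit-image (0.19 or newer for the channel_axis argument) is assumed for SSIM, and DSSIM is taken as the common convention (1 - SSIM)/2, which this page does not itself define.

    import numpy as np
    from skimage.metrics import structural_similarity

    def view_errors(pred: np.ndarray, target: np.ndarray) -> dict:
        """MAE, MSE and DSSIM between a synthesized view and its ground truth.

        Expects float images in [0, 1], shaped (H, W) or (H, W, C).
        """
        mae = float(np.abs(pred - target).mean())
        mse = float(((pred - target) ** 2).mean())
        channel_axis = -1 if pred.ndim == 3 else None
        ssim = structural_similarity(pred, target, data_range=1.0,
                                     channel_axis=channel_axis)
        # DSSIM = (1 - SSIM) / 2 is an assumption, not defined on this page.
        return {"MAE": mae, "MSE": mse, "DSSIM": (1.0 - ssim) / 2.0}

    # Example with random stand-ins for a ground-truth view and a noisy rendering.
    rng = np.random.default_rng(0)
    gt = rng.random((64, 64, 3))
    noisy = np.clip(gt + 0.05 * rng.standard_normal(gt.shape), 0.0, 1.0)
    print(view_errors(noisy, gt))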
Title: Joint Austrian Computer Vision and Robotics Workshop 2020
Editor: Graz University of Technology
Location: Graz
Date: 2020
Language: English
License: CC BY 4.0
ISBN: 978-3-85125-752-6
Size: 21.0 x 29.7 cm
Pages: 188
Categories: Informatik (Computer Science), Technik (Technology)