Page - 39 - in Proceedings - OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics"

(a) tattoo patches (b) background patches

Figure 3: Example extracted patches from our dataset (patch size 32×32).

4.1. Training the network

For training, we constructed a training set by randomly sampling a number of image patches of predefined size from each annotated tattoo image in our dataset. This procedure was done both for positive and negative samples, i.e. for patches that do and do not contain tattoos. Examples of extracted patches can be seen in Fig. 3.

The network was trained by minimizing the mean squared error loss, using stochastic gradient descent with momentum. We used mini-batches of 32 samples, a momentum of 0.9 and a learning rate of 0.1. Training ran for at most 40 epochs, with early stopping based on the validation loss. The duration of the training varied greatly with the size of the patches, in our case from 10 minutes for the smallest patches to 13 hours for the largest.

4.2. Performance evaluation

The set of all extracted patches totalled 22700 images (11359 positive and 11341 negative samples). This set was divided into sets for training (containing 15134 samples, out of which 7573 positive and 7560 negative), and validation and testing (both of the same size of 3783 samples, out of which 1893 positive and 1890 negative). We ensured that all patches extracted from the same image end up in only one of the sets (either training, validation or testing), in order to avoid mixing training and testing data.

We trained and evaluated the network for different patch sizes (8×8, 12×12, 16×16, 24×24, 32×32 and 48×48) to determine the optimal patch size. Larger patches presumably provide more information about context, but the network that uses them is slower to train and test.

The test set was used for evaluation. The results are summarized in Table 1. The accuracy was calculated as one minus the error rate, where the error rate is the total number of misclassifications (false positives and false negatives) divided by the test set size.
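The image-level split and the accuracy measure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `split_by_image` and `accuracy`, the per-patch image IDs, and the 2/3, 1/6, 1/6 split fractions (chosen only because they roughly match the reported 15134/3783/3783 sizes) are all hypothetical. Misclassifications divided by the test set size is the error rate, so the sketch assumes accuracy is meant as its complement.

```python
import random
from collections import defaultdict


def split_by_image(patch_image_ids, fractions=(2 / 3, 1 / 6, 1 / 6), seed=0):
    """Split patch indices into train/val/test at the image level.

    patch_image_ids[i] is the (hypothetical) ID of the source image that
    patch i was cut from. All patches of one image end up in one subset.
    """
    # Group patch indices by the image they came from.
    by_image = defaultdict(list)
    for idx, image_id in enumerate(patch_image_ids):
        by_image[image_id].append(idx)

    # Shuffle the images (not the patches) reproducibly.
    image_ids = sorted(by_image)
    random.Random(seed).shuffle(image_ids)

    # Cut the shuffled image list into three consecutive parts.
    n_images = len(image_ids)
    n_train = round(fractions[0] * n_images)
    n_val = round(fractions[1] * n_images)
    groups = {
        "train": image_ids[:n_train],
        "val": image_ids[n_train:n_train + n_val],
        "test": image_ids[n_train + n_val:],
    }

    # Expand each group of images back into its patch indices.
    return {
        name: [i for image_id in ids for i in by_image[image_id]]
        for name, ids in groups.items()
    }


def accuracy(false_positives, false_negatives, n_samples):
    """Accuracy as one minus the error rate (an assumed definition)."""
    error_rate = (false_positives + false_negatives) / n_samples
    return 1.0 - error_rate


if __name__ == "__main__":
    # Toy data: 12 images with 5 patches each.
    ids = [i // 5 for i in range(60)]
    subsets = split_by_image(ids)
    print({name: len(idx) for name, idx in subsets.items()})
    print(accuracy(false_positives=3, false_negatives=2, n_samples=100))
```

Splitting at the image level means the subset sizes in patches only approximate the requested fractions when images contribute different numbers of patches. That is the price of keeping patches from one image out of both training and test data.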
As we can see, the results improve in terms of accuracy with the increase in image patch size, up to the largest considered size (48×48), which gives slightly worse results than most of the smaller patch sizes. The difference in accuracy is not very pronounced; i.e. we can say that results for all the patch sizes are similar. The other thing that can be noticed is that the improvement in performance with the increase in patch size comes mainly from reducing the number of false positives, while at the same time the number of false negatives rises.

We have done a preliminary qualitative evaluation of the performance of the network in a sliding window setting. Some results are shown in Fig. 4. These examples are relatively simple, with homo-

Title: Proceedings
Subtitle: OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics"
Authors: Peter M. Roth; Kurt Niel
Publisher: Verlag der Technischen Universität Graz
Location: Wels
Date: 2017
Language: English
License: CC BY 4.0
ISBN: 978-3-85125-527-0
Size: 21.0 x 29.7 cm
Pages: 248
Keywords: Tagungsband
Categories: International; Tagungsbände
