Page 39 in Proceedings - OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics"
(a) tattoo patches (b) background patches
Figure 3: Example extracted patches from our dataset (patch size 32×32).
4.1. Training the network
For training, we constructed a training set by randomly sampling a number of image patches of
predefined size from each annotated tattoo image in our dataset. This procedure was done both
for positive and negative samples, i.e. for patches that do and do not contain tattoos. Examples of
extracted patches can be seen in Fig. 3.
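The sampling procedure described above could be sketched as follows. This is only an illustration under stated assumptions: the function name `sample_patches`, the boolean annotation mask, and the labelling rule (a patch is positive when at least half of its pixels are tattoo pixels, negative when it contains none) are hypothetical and are not specified in the text.

```python
import numpy as np

def sample_patches(image, mask, patch_size, n_per_class, rng, min_tattoo_frac=0.5):
    """Randomly sample positive (tattoo) and negative (background) patches.

    image: H x W x C array; mask: H x W boolean array, True on tattoo pixels.
    Assumed labelling rule (not from the paper): a patch is positive when at
    least `min_tattoo_frac` of its pixels are tattoo pixels, and negative when
    it contains no tattoo pixels at all. Mixed patches are skipped.
    """
    h, w = mask.shape
    positives, negatives = [], []
    max_tries = 1000 * n_per_class  # guard against images lacking one class
    for _ in range(max_tries):
        if len(positives) >= n_per_class and len(negatives) >= n_per_class:
            break
        # Top-left corner chosen uniformly so the patch fits inside the image.
        y = int(rng.integers(0, h - patch_size + 1))
        x = int(rng.integers(0, w - patch_size + 1))
        frac = mask[y:y + patch_size, x:x + patch_size].mean()
        patch = image[y:y + patch_size, x:x + patch_size]
        if frac >= min_tattoo_frac and len(positives) < n_per_class:
            positives.append(patch)
        elif frac == 0.0 and len(negatives) < n_per_class:
            negatives.append(patch)
    return positives, negatives

# Synthetic stand-in for one annotated image (not data from the paper).
rng = np.random.default_rng(0)
image = rng.random((128, 128, 3))
mask = np.zeros((128, 128), dtype=bool)
mask[32:96, 32:96] = True  # square "tattoo" region
pos, neg = sample_patches(image, mask, patch_size=32, n_per_class=5, rng=rng)
```

Skipping patches that only partly overlap the annotation keeps the two classes clearly separated; other labelling rules are equally plausible.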
The training of the network was carried out by optimizing the mean squared error loss function,
using stochastic gradient descent with momentum. We used mini-batches of 32 samples, a
momentum of 0.9 and a learning rate of 0.1. The training was performed for at most
40 epochs, with early stopping based on the validation loss. The duration of the training varied greatly
with the size of the patches, in our case from 10 minutes for the smallest patches to 13 hours for the
largest.
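The training schedule above (mean squared error, stochastic gradient descent with momentum 0.9, learning rate 0.1, mini-batches of 32, at most 40 epochs, early stopping on the validation loss) can be illustrated with a minimal NumPy sketch. A linear model stands in for the network, and the `patience` value and the synthetic data are assumptions made only for this example; none of them come from the paper.

```python
import numpy as np

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

def train(x_train, y_train, x_val, y_val, lr=0.1, momentum=0.9,
          batch_size=32, max_epochs=40, patience=5, seed=0):
    """SGD with momentum on an MSE loss, with early stopping.

    The linear model is a stand-in for the network; `patience` is assumed.
    """
    rng = np.random.default_rng(seed)
    w, b = np.zeros(x_train.shape[1]), 0.0
    vw, vb = np.zeros_like(w), 0.0            # momentum (velocity) buffers
    best = (mse(x_val @ w + b, y_val), w.copy(), b)
    history = [best[0]]                        # validation loss before training
    stale = 0                                  # epochs without improvement
    for _ in range(max_epochs):
        order = rng.permutation(len(x_train))  # reshuffle every epoch
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            err = x_train[idx] @ w + b - y_train[idx]
            grad_w = 2.0 * x_train[idx].T @ err / len(idx)
            grad_b = 2.0 * err.mean()
            vw = momentum * vw - lr * grad_w
            vb = momentum * vb - lr * grad_b
            w, b = w + vw, b + vb
        val_loss = mse(x_val @ w + b, y_val)
        history.append(val_loss)
        if val_loss < best[0]:
            best, stale = (val_loss, w.copy(), b), 0
        else:
            stale += 1
            if stale >= patience:              # early stopping
                break
    return best, history

# Synthetic binary-labelled data, purely for illustration.
rng = np.random.default_rng(1)
x = rng.normal(size=(600, 8))
y = (x @ rng.normal(size=8) > 0).astype(float)
(best_loss, w, b), history = train(x[:400], y[:400], x[400:], y[400:])
```

Returning the parameters with the lowest validation loss, rather than the final ones, is one common way to realise early stopping.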
4.2. Performance evaluation
The set of all extracted patches totalled 22700 images (11359 positive and 11341 negative samples).
This set was divided into sets for training (containing 15134 samples, out of which 7573 were positive and
7560 negative), and validation and testing (both of the same size of 3783 samples, out of which 1893 were
positive and 1890 negative). We ensured that all patches extracted from the same image end up
in only one of the sets (either training, validation or testing), in order to avoid mixing training and
testing data.
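One way to keep all patches of an image in a single set is to split at the image level and only then collect the patches. The sketch below assumes each patch carries the identifier of its source image; the function name `split_by_image`, the identifiers, and the 4:1:1 ratio (which only approximates the set sizes reported above) are illustrative choices.

```python
import random

def split_by_image(patch_image_ids, fractions=(2 / 3, 1 / 6, 1 / 6), seed=0):
    """Assign patch indices to train/val/test so that no image spans two sets.

    patch_image_ids[i] is the (hypothetical) source-image id of patch i.
    """
    image_ids = sorted(set(patch_image_ids))
    random.Random(seed).shuffle(image_ids)     # shuffle images, not patches
    n_train = round(fractions[0] * len(image_ids))
    n_val = round(fractions[1] * len(image_ids))
    groups = {
        "train": set(image_ids[:n_train]),
        "val": set(image_ids[n_train:n_train + n_val]),
        "test": set(image_ids[n_train + n_val:]),
    }
    return {name: [i for i, img in enumerate(patch_image_ids) if img in ids]
            for name, ids in groups.items()}

# Toy example: 60 images with 10 patches each.
patch_image_ids = [f"img{i // 10:03d}" for i in range(600)]
splits = split_by_image(patch_image_ids)
```

Because the split is made over images, the resulting patch counts match the target ratio only approximately when images contribute different numbers of patches.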
We trained and evaluated the network for different patch sizes (8×8, 12×12, 16×16, 24×24,
32×32 and 48×48) to determine the optimal patch size. The larger patches presumably provide
more information about context, but the network that utilizes them is slower to train and test.
The test set was used for evaluation. The results are summarized in Table 1. The accuracy was
calculated from the misclassification rate, i.e. the total number of misclassifications (false positives
and false negatives) divided by the test set size. As we can see, the accuracy improves as the image patch
size increases, except for the largest considered size (48×48), which gives slightly worse results than
most of the smaller patch sizes. The difference in accuracy is not very pronounced; i.e. the results for all
patch sizes are similar. It can also be noticed that the improvement in performance
with increasing patch size comes mainly from a reduction in the number of false positives, while at the
same time the number of false negatives rises.
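The quantities in the preceding paragraph can be made concrete with a short sketch. Misclassifications divided by the test set size is the misclassification (error) rate, and accuracy in the usual sense is its complement. The false positive and false negative counts below are hypothetical and are not taken from Table 1; only the test set size of 3783 comes from the text.

```python
def error_rate(false_positives, false_negatives, n_samples):
    """Fraction of misclassified test samples."""
    return (false_positives + false_negatives) / n_samples

def accuracy(false_positives, false_negatives, n_samples):
    """Fraction of correctly classified test samples (1 - error rate)."""
    return 1.0 - error_rate(false_positives, false_negatives, n_samples)

# Hypothetical counts for illustration only (not results from the paper).
n_test = 3783
err = error_rate(false_positives=100, false_negatives=89, n_samples=n_test)
acc = accuracy(false_positives=100, false_negatives=89, n_samples=n_test)
```

Reporting false positives and false negatives separately, as the text does, shows trade-offs that a single accuracy figure hides.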
We have done a preliminary qualitative evaluation of the performance of the network in a sliding
window setting. Some results are shown in Fig. 4. These examples are relatively simple, with homo-
- Title: Proceedings
- Subtitle: OAGM & ARW Joint Workshop 2016 on "Computer Vision and Robotics"
- Authors: Peter M. Roth, Kurt Niel
- Publisher: Verlag der Technischen Universität Graz
- Location: Wels
- Date: 2017
- Language: English
- License: CC BY 4.0
- ISBN: 978-3-85125-527-0
- Dimensions: 21.0 x 29.7 cm
- Pages: 248
- Keywords: Conference proceedings
- Categories: International, Conference proceedings