between the models, the entire preprocessing is done
separately for each model. First of all, the dataset
is resampled to the respective voxel spacing using
a third-order B-spline interpolation for the scans
and a label-linear interpolation for the ground truth.
Next, the intensity values are clipped to the 0.5th and
99.5th percentile over the entire training dataset of
the fold. Furthermore, the scans are normalized by
subtracting the mean and dividing by the standard deviation
of the clipped training dataset.
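As an illustration, the following Python sketch mirrors this preprocessing pipeline using SciPy; the function name preprocess_case, the argument layout, and the assumption that the clipping percentiles and normalization statistics are precomputed per fold are ours, not part of the original implementation.

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_case(scan, label, spacing, target_spacing,
                    clip_low, clip_high, mean, std):
    # Resample to the model-specific voxel spacing: third-order
    # B-spline interpolation for the scan, linear interpolation
    # for the ground-truth labels.
    factors = np.asarray(spacing, float) / np.asarray(target_spacing, float)
    scan = zoom(scan.astype(np.float32), factors, order=3)
    label = zoom(label.astype(np.float32), factors, order=1)

    # Clip intensities to the 0.5th / 99.5th percentile of the
    # training data of the fold.
    scan = np.clip(scan, clip_low, clip_high)

    # Normalize with mean and standard deviation of the clipped
    # training data.
    scan = (scan - mean) / std
    return scan, np.rint(label).astype(np.uint8)
```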
5.2. Architecture
We use the architecture described by
Isensee et al. [9] and implemented in the GitHub
project 3DUnetCNN [5] as a basis for our experi-
ments. We adjusted the following model parameters:
input size, model-depth (number of layers), number
of segmentation levels (used for deep supervision)
and base-filters (number of filters in the first convolution
layer). For M1 (input size of 192 × 192 × 128)
we selected a model-depth of 5 with 3 segmentation
levels and base-filters set to 8. For M2 on the other
hand (input size of 160 × 160 × 128), we chose
an increased model-depth of 6 with 4 segmentation
levels and base-filters set to 16. The changes to M2
were made in order to account for the larger patch
size (compared to 128³ used by Isensee et al.) and
to increase the receptive field of the model. These
changes were omitted for M1, which encompasses
a simpler segmentation task, creating only a coarse
segmentation of the blood lumen label, while M2
segments both the blood lumen and the stent-graft
wire frame.
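For clarity, the two configurations can be summarized as follows; the dictionary keys are illustrative shorthand and do not correspond to the actual 3DUnetCNN configuration format.

```python
# Illustrative summary of the two model configurations
# (key names are placeholders, not the 3DUnetCNN config schema).
M1_CONFIG = {
    "input_size": (192, 192, 128),   # low-resolution patch
    "model_depth": 5,                # number of resolution levels
    "segmentation_levels": 3,        # outputs used for deep supervision
    "base_filters": 8,               # filters in the first convolution
    "labels": ["blood_lumen"],
}

M2_CONFIG = {
    "input_size": (160, 160, 128),   # high-resolution patch
    "model_depth": 6,                # deeper to enlarge the receptive field
    "segmentation_levels": 4,
    "base_filters": 16,
    "labels": ["blood_lumen", "stent_graft"],
}
```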
5.3. Training
We trained both models using a weighted multi-
class Dice loss [9] in combination with an Adam op-
timizer. The initial learning rate was set to η₀ =
5 · 10⁻⁴ with a learning rate drop criterion and early
stopping after 50 epochs. The training ran for 70
to 120 epochs with 200 training samples per epoch.
Due to the 5-fold cross validation used for evaluation,
the following statistics are averaged over all folds,
where for each fold both models M1 and M2 were
trained as follows. M1 was trained first for blood
lumen segmentation on the low resolution large re-
gions. The training reached a DSC of 0.978 and
0.898, on average, for the training and validation
items, respectively. M1 was then used to create the
blood lumen segmentations for centerline extraction.
The resulting centerline graphs were subsequently used during the training of M2,
as the high resolution patches were extracted at random positions along the
graph. The average training and validation DSCs for
the blood lumen are 0.954 and 0.943, respectively,
and 0.843 and 0.841 for the stent-graft.
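A possible formulation of such a weighted multi-class Dice loss is sketched below in PyTorch; it is an illustrative re-implementation rather than the code used in our experiments, and the tensor layout and class weights are assumptions.

```python
import torch

def weighted_dice_loss(logits, target_onehot, class_weights, eps=1e-5):
    """Weighted multi-class soft Dice loss (illustrative sketch).

    logits:         (B, C, D, H, W) raw network outputs
    target_onehot:  (B, C, D, H, W) one-hot encoded ground truth
    class_weights:  (C,) tensor with the relative importance of each label
    """
    probs = torch.softmax(logits, dim=1)
    dims = (0, 2, 3, 4)                       # sum over batch and space
    intersection = (probs * target_onehot).sum(dim=dims)
    denominator = probs.sum(dim=dims) + target_onehot.sum(dim=dims)
    dice_per_class = (2.0 * intersection + eps) / (denominator + eps)

    # Weighted average of the per-class Dice scores, turned into a loss.
    weights = class_weights / class_weights.sum()
    return 1.0 - (weights * dice_per_class).sum()
```

Combining this loss with torch.optim.Adam and an initial learning rate of 5 · 10⁻⁴ corresponds to the optimizer setup described above.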
6. Evaluation
Having trained two models M1 and M2 for each
fold, we use our method to create high resolution
segmentations. Just like during training, M1 is used
to segment the blood lumen used for centerline
extraction. The resulting centerline graph is again
used to place patches, however, not at random
but rather at equally distributed positions along the
entire span of the graph, as described in Section
4.2. In a post-processing step, the largest connected
region of non-background voxels was selected.
To compare the results to the ground truth, the
segmentations were furthermore resampled to their
original voxel spacing. The last step may be skipped
when using the results for further processing rather
than evaluation (e.g., mesh generation for blood-flow
simulations). Using our method, the cross validation
yields an average DSC of 0.961 for the blood lumen
and 0.841 for the stent-graft label. Two examples
are shown in Figure 5.
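The post-processing step of keeping only the largest connected non-background region can be implemented, for instance, with scipy.ndimage; the helper below is a minimal sketch and its name is ours.

```python
import numpy as np
from scipy import ndimage

def keep_largest_component(segmentation):
    """Zero out all but the largest connected non-background region."""
    foreground = segmentation > 0
    labeled, num_components = ndimage.label(foreground)
    if num_components == 0:
        return segmentation

    # Size of every connected component (index 0 is the background).
    sizes = np.bincount(labeled.ravel())
    sizes[0] = 0
    largest = sizes.argmax()

    cleaned = segmentation.copy()
    cleaned[labeled != largest] = 0
    return cleaned
```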
To evaluate the effectiveness of our patch extraction
method, we further conducted an experiment using
only M2, which was trained using a traditional patch
extraction method (see Isensee et al. [10]). Rather
than placing the patches along the aorta centerlines,
they were placed in a sliding-window fashion,
where the patches are aligned in a regular grid of
overlapping tiles. The overlap was set to 32 voxels
in each dimension (corresponding to 11.2 mm
frontal/sagittal and 24 mm longitudinal). While
this technique was used both during training and
inference, the remaining setup (including pre- and
post-processing) was left unchanged. We evaluated
Figure 5. Evaluation results for the two scans shown in Figure 3.
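For reference, the sliding-window baseline places patch origins on a regular grid of overlapping tiles; one way to compute such a grid with the 32-voxel overlap used here is sketched below (the function name and argument layout are illustrative, not the implementation used in our experiments).

```python
import numpy as np

def sliding_window_origins(volume_shape, patch_size, overlap=32):
    """Regular grid of patch origins with a fixed overlap in voxels."""
    steps = [max(p - overlap, 1) for p in patch_size]
    origins = []
    for size, patch, step in zip(volume_shape, patch_size, steps):
        # Start positions along this axis, clamped so every patch fits.
        last = max(size - patch, 0)
        starts = list(range(0, last + 1, step))
        if starts[-1] != last:
            starts.append(last)
        origins.append(starts)
    # Cartesian product of the per-axis start positions.
    grid = np.stack(np.meshgrid(*origins, indexing="ij"), axis=-1)
    return grid.reshape(-1, len(patch_size))
```

These origins would be used both for extracting training patches and for tiling the volume at inference, whereas our method samples positions along the centerline graph instead.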