Attention Guided 3D U-Net for KiTS19
Authors
Zhong, Zhusi
Zhang, Zhenxi
Jiao, Zhicheng
Issue Date
2019
Publisher
University of Minnesota Libraries Publishing
Type
Article
Abstract
We use a two-stage 3D U-Net model to predict multi-channel segmentations from coarse to fine, with the second stage guided by the predictions of the first stage.

1 Method

We propose a two-stage method that segments CT images from coarse to fine. The two stages are trained with different learning scopes and are assigned different learning missions.

1.1 Stage 1 – Coarse stage

Data preprocessing. First, we downscale the training data to a uniform shape so that the model can take a whole image at once. All images and segmentations are downscaled to 128×128×32 (height×width×depth). The segmentation files are transformed into 3-channel arrays, in which the channel-wise pixel values represent the kidneys, the tumors, and the background (neither kidneys nor tumors), in that order (see the preprocessing sketch below).

Training. We train a standard 3D U-Net followed by a softmax layer. During training, we apply data augmentation to the training data, including normalization, random contrast, random flipping, and random rotation. We input all 210 training cases and train the model to regress the multi-channel segmentations. We implement the model in PyTorch; the learning rate starts at 0.1 and is decayed by a factor of 10 at 300,000 and 500,000 epochs. We use binary cross-entropy as the loss function (see the training sketch below).

Predicting. The 90 test images are preprocessed in the same way as the training images and then fed to the trained model. The channel-wise predictions are scaled back to the original shape. We take the first two channels of the predictions, which represent the kidney and tumor segmentations, and package them as .nii.gz files (see the inference sketch below).
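The paper does not publish its preprocessing code; the following is a minimal sketch of the downscaling and channel encoding described above, using NumPy and SciPy and assuming the KiTS19 label convention (0 = background, 1 = kidney, 2 = tumor).

    import numpy as np
    from scipy.ndimage import zoom

    TARGET = (128, 128, 32)  # height, width, depth, as stated in the paper

    def downscale(volume, order):
        # Resample a 3D array to TARGET; order=1 (trilinear) for images,
        # order=0 (nearest neighbour) for label maps, to avoid mixing labels.
        factors = [t / s for t, s in zip(TARGET, volume.shape)]
        return zoom(volume, factors, order=order)

    def to_channels(seg):
        # Split a KiTS19 label map into a 3-channel array ordered
        # (kidney, tumor, background), matching the channel order in the text.
        return np.stack([(seg == 1), (seg == 2), (seg == 0)]).astype(np.float32)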
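For the training step, a minimal PyTorch sketch is given below. UNet3D, train_loader, and NUM_EPOCHS are placeholders, and the choice of SGD is an assumption; only the initial learning rate of 0.1, the decay by a factor of 10 at 300,000 and 500,000 epochs, the softmax output, and the binary cross-entropy loss are stated in the text.

    import torch
    import torch.nn as nn

    # UNet3D is a placeholder for the standard 3D U-Net; its final layer
    # outputs 3 channels (kidney, tumor, background) per voxel.
    model = UNet3D(in_channels=1, out_channels=3)
    criterion = nn.BCELoss()  # binary cross-entropy on the softmaxed channels
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # optimizer is an assumption
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[300_000, 500_000], gamma=0.1)

    for epoch in range(NUM_EPOCHS):
        for image, target in train_loader:  # augmented 128x128x32 volumes
            prob = torch.softmax(model(image), dim=1)  # 3-channel prediction
            loss = criterion(prob, target)             # target: 3-channel array
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()  # milestones are counted in epochs, as in the paper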
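For the prediction step, the sketch below shows one way to scale the first two channels back to the original shape and write a .nii.gz file with nibabel. The function predict_case is illustrative, and the 0.5 threshold used to binarize the channel probabilities is an assumption not stated in the paper.

    import numpy as np
    import nibabel as nib
    import torch
    from scipy.ndimage import zoom

    def predict_case(model, image, original_shape, affine, out_path):
        # `image` is a preprocessed 128x128x32 test volume; `affine` comes
        # from the case's original NIfTI header.
        model.eval()
        with torch.no_grad():
            x = torch.from_numpy(image[None, None]).float()   # batch/channel dims
            prob = torch.softmax(model(x), dim=1)[0].numpy()  # (3, 128, 128, 32)
        factors = [o / s for o, s in zip(original_shape, prob.shape[1:])]
        # Keep only the first two channels (kidney, tumor), scale them back
        # to the original shape, and merge into a single label map.
        kidney = zoom(prob[0], factors, order=1) > 0.5  # threshold is an assumption
        tumor = zoom(prob[1], factors, order=1) > 0.5
        labels = np.zeros(original_shape, dtype=np.uint8)
        labels[kidney] = 1
        labels[tumor] = 2
        nib.save(nib.Nifti1Image(labels, affine), out_path)  # writes .nii.gz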
Identifiers
doi: 10.24926/548719.064