Abstract:
Recent state-of-the-art deep learning frameworks require large, fully annotated training datasets that are, depending on the objective, time-consuming to generate.
While in most fields these labelling tasks can be massively parallelized or even outsourced, this is not the case for medical images.
Usually, only a highly trained expert is able to generate these datasets.
However, since additional manual annotation, especially for segmentation or tracking, is typically not part of a radiologist's workflow, large, fully annotated datasets remain scarce.
In this context, a variety of frameworks are proposed in this work to solve the problems that arise due to the lack of annotated training data across different medical imaging tasks and modalities.
The first contribution of this thesis was to investigate weakly supervised learning on PET/CT data for the task of lesion segmentation.
Using only class labels (tumor vs. no tumor), a classifier was first trained and subsequently used to generate Class Activation Maps highlighting regions with lesions.
Based on these region proposals, the final tumor segmentation could be performed with high accuracy in terms of clinically relevant metrics.
This drastically simplifies the process of training data generation, as only class labels have to be assigned to each slice of a scan instead of a full pixel-wise segmentation.
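To make the idea concrete, the following is a minimal PyTorch sketch of CAM-based localization; the tiny network and all names are illustrative assumptions, not the architecture used in this work.

    # Minimal sketch of Class Activation Maps for weak lesion localization.
    import torch
    import torch.nn as nn

    class CAMClassifier(nn.Module):
        # Tiny CNN: conv features -> global average pooling -> linear head.
        def __init__(self, num_classes: int = 2):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            )
            self.head = nn.Linear(64, num_classes)  # weights reused for the CAM

        def forward(self, x):
            f = self.features(x)                      # (B, 64, H, W)
            return self.head(f.mean(dim=(2, 3))), f   # logits, feature maps

        def cam(self, x, class_idx: int):
            # Weight the feature maps by the class weights -> activation map.
            _, f = self.forward(x)
            w = self.head.weight[class_idx]           # (64,)
            return torch.einsum("c,bchw->bhw", w, f)  # (B, H, W) heat map

    model = CAMClassifier()
    slices = torch.randn(4, 1, 128, 128)      # stand-in for PET/CT slices
    heatmap = model.cam(slices, class_idx=1)  # "tumor" activation map
    # Thresholding the heat map yields the region proposals for segmentation.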
To further reduce the time required to prepare training data, two self-supervised methods were investigated for the task of anatomical tissue segmentation and landmark detection.
To this end, as a second contribution, a state-of-the-art tracking framework based on contrastive random walks was transferred to the medical imaging domain, where it was adapted and extended.
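As a rough illustration of the underlying principle (not the adapted framework itself): a contrastive random walk trains frame embeddings so that a walker stepping between frames via softmax affinities returns to its starting node. A minimal sketch, with random tensors standing in for learned node features:

    import torch
    import torch.nn.functional as F

    def cycle_loss(f1: torch.Tensor, f2: torch.Tensor, tau: float = 0.07):
        # f1, f2: (N, D) node features of two frames; a walker steps
        # 1 -> 2 -> 1 and should return to its starting node.
        f1, f2 = F.normalize(f1, dim=1), F.normalize(f2, dim=1)
        A12 = F.softmax(f1 @ f2.t() / tau, dim=1)  # transition probs 1 -> 2
        A21 = F.softmax(f2 @ f1.t() / tau, dim=1)  # transition probs 2 -> 1
        round_trip = A12 @ A21                     # (N, N) return probabilities
        targets = torch.arange(f1.shape[0])        # each node is its own label
        return F.nll_loss(round_trip.clamp_min(1e-8).log(), targets)

    loss = cycle_loss(torch.randn(64, 128), torch.randn(64, 128))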
As contrastive learning often lacks real-time capability, a self-supervised template matching network was developed to address the task of real-time anatomical tissue tracking, yielding the third contribution of this work.
Both methods have in common that the object or region of interest is defined only at inference time, which reduces the number of required labels to as few as one and allows adaptation to different tasks without re-training or access to the original training data.
Despite the limited amount of labelled data, good results were achieved both for tracking organs across subjects and for tracking tissue within time series.
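To illustrate how the target region can be specified only at inference time, here is a minimal template-matching sketch via cross-correlation; in the actual self-supervised network, learned feature embeddings would replace the raw intensities used here, and all names are placeholders.

    import torch
    import torch.nn.functional as F

    def match_template(frame: torch.Tensor, template: torch.Tensor):
        # Locate `template` (1, h, w) inside `frame` (1, H, W): conv2d with
        # the template as kernel computes a cross-correlation surface.
        score = F.conv2d(frame.unsqueeze(0), template.unsqueeze(0))
        idx = int(score.flatten().argmax())
        w = score.shape[-1]
        return idx // w, idx % w                 # top-left corner of best match

    frame = torch.randn(1, 256, 256)             # one frame of a time series
    template = frame[:, 100:116, 120:136]        # ROI, defined at inference time
    print(match_template(frame, template))       # -> (100, 120)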
State-of-the-art self-supervised learning in medical imaging is usually performed on 2D slices due to the lack of training data and limited computational resources.
To exploit the three-dimensional structure of this type of data, self-supervised contrastive learning was performed on entire volumes using over 40,000 whole-body MRI scans, forming the fourth contribution.
Building on this pre-training, a large number of downstream tasks could be addressed successfully using only limited labelled data.
Furthermore, the learned representations allow the entire dataset to be visualized in a two-dimensional view.
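For orientation, a minimal sketch of the kind of contrastive objective involved in such volume-level pre-training (a SimCLR-style NT-Xent loss on embeddings of two augmented views of the same volume); the batch size, embedding dimension, and the 3D encoder are assumptions, with random tensors standing in for encoder outputs:

    import torch
    import torch.nn.functional as F

    def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1):
        # z1, z2: (B, D) projections of two augmented views per volume.
        z = F.normalize(torch.cat([z1, z2]), dim=1)  # (2B, D), unit norm
        sim = z @ z.t() / tau                        # cosine similarities
        sim.fill_diagonal_(float("-inf"))            # exclude self-similarity
        B = z1.shape[0]
        targets = torch.cat([torch.arange(B) + B, torch.arange(B)])
        return F.cross_entropy(sim, targets)         # positives: i <-> i + B

    loss = nt_xent(torch.randn(8, 128), torch.randn(8, 128))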
To encourage research in the field of automated lesion segmentation in PET/CT image data, the autoPET challenge was organized, which represents the fifth contribution.