Özgür Güler: Explaining CNN-Based Active Tuberculosis Detection in Chest X-Rays through Saliency Mapping Techniques

Posted on Fri 01 September 2023 in theses

In this figure, we show (top row) saliency maps for the unbalanced model M_U featuring the five cases from the TBX11K test set with the lowest Proportional Energy scores; (bottom row) respective predictions of our best balanced model M_B_B. Human-annotated ground-truth regions including radiological signs are indicated by bright magenta bounding boxes. The heatmaps (ranging from red to blue) indicate the contribution of different regions to the models' decision-making, with non-colored areas having no significant contribution.

Tuberculosis (TB) is an infectious disease caused by the bacterium Mycobacterium tuberculosis, and it is one of the leading causes of death worldwide. Various Deep Convolutional Neural Network models have gained popularity as an aid in the TB screening process, detecting patients with active Tuberculosis from their Chest X-Rays. To advance this research further, a new publicly available dataset, TBX11K, has been used to increase the number of samples during training for existing replicable state-of-the-art models. In the first step, the models' performance was evaluated to see if an improvement through the addition of more TB-related data was observable. It was shown that state-of-the-art replicable binary classifier models could be further improved through the inclusion of more data. Further, there is a lack of focus on generating and evaluating explanations for such models. The currently preferred methods are saliency mapping techniques such as Grad-CAM, which generate visual explanations of the model's decision-making process by overlaying heatmaps on the Chest X-Rays. The selected TBX11K dataset includes ground truth bounding box labels, which makes it possible to evaluate whether the visualisations are correct. There are various metrics to evaluate the faithfulness and localisation performance of saliency mapping techniques against ground truth labels. Two of them have been identified as useful, namely RemOve and Debias, and Proportional Energy. RemOve and Debias was used to observe whether there is one universal saliency mapping technique that performs well for all models on the task of active Tuberculosis detection. Further, based on these two metrics, a new metric was proposed, ROAD-Normalised PropEng Average, to identify the overall best-performing combination of model and saliency mapping technique.
From the evaluation with RemOve and Debias, it was concluded that there does not seem to be a universal saliency mapping technique that performs well on all model architectures for the detection of active Tuberculosis. Thus, it is recommended to always consider the underlying model before choosing a saliency mapping technique. Further, through the use of the ROAD-Normalised PropEng Average, it was concluded that one model in combination with a saliency mapping technique offered the best trade-off between faithfulness and correctness of the visualisations. This was the multi-label DenseNet-121 model with Eigen-CAM. To obtain accurate classifications of active Tuberculosis with explainable and correct visualisations, it is recommended to use this combination of model and visualisation technique.
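As an illustration of the localisation metric mentioned above, Proportional Energy is commonly defined as the share of a saliency map's total energy that falls inside the ground-truth bounding boxes. The following is a minimal sketch under that common definition, not the implementation used in the thesis. The function name `proportional_energy` and the box layout `(x_min, y_min, x_max, y_max)` with exclusive upper bounds are assumptions made for this example.

```python
def proportional_energy(saliency, boxes):
    """Fraction of saliency energy located inside ground-truth boxes.

    saliency: 2D list of non-negative floats (rows of pixel values).
    boxes: list of (x_min, y_min, x_max, y_max) tuples in pixel
        coordinates, upper bounds exclusive (an assumed convention).
    """
    total = 0.0
    inside = 0.0
    for y, row in enumerate(saliency):
        for x, value in enumerate(row):
            total += value
            # A pixel covered by several overlapping boxes counts once.
            if any(x0 <= x < x1 and y0 <= y < y1 for x0, y0, x1, y1 in boxes):
                inside += value
    # An all-zero map has no energy to localise; report 0.0 by convention.
    return inside / total if total > 0 else 0.0


# A uniform 4x4 map with one 2x2 box: 4 of 16 pixels lie inside the box.
uniform_map = [[1.0] * 4 for _ in range(4)]
score = proportional_energy(uniform_map, [(0, 0, 2, 2)])
```

A score close to 1 means that nearly all of the highlighted area coincides with the annotated radiological signs, while a score close to 0 means the model's evidence lies elsewhere in the image.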

Reproducibility Checklist

Thesis report

Software is based on the open-source mednet library. N.B.: Software leading to these results was integrated into the Medical AI Group software stack.

All databases are publicly available

An article with results from this thesis was published at a conference [1]

Bibliography

[1] Özgür Güler, Manuel Günther, and André Anjos. Refining tuberculosis detection in CXR imaging: addressing bias in deep neural networks via interpretability. In Proceedings of the 12th European Workshop on Visual Information Processing. September 2024.

@inproceedings{euvip-2024,
    author = {G{\"{u}}ler, {\"{O}}zg{\"{u}}r and G{\"{u}}nther, Manuel and Anjos, Andr{\'{e}}},
    month = "September",
    title = "Refining Tuberculosis Detection in CXR Imaging: Addressing Bias in Deep Neural Networks via Interpretability",
    booktitle = "Proceedings of the 12th European Workshop on Visual Information Processing",
    year = "2024",
    abstract = "Automatic classification of active tuberculosis from chest X-ray images has the potential to save lives, especially in low- and mid-income countries where skilled human experts can be scarce. Given the lack of available labeled data to train such systems and the unbalanced nature of publicly available datasets, we argue that the reliability of deep learning models is limited, even if they can be shown to obtain perfect classification accuracy on the test data. One way of evaluating the reliability of such systems is to ensure that models use the same regions of input images for predictions as medical experts would. In this paper, we show that pre-training a deep neural network on a large-scale proxy task, as well as using mixed objective optimization network (MOON), a technique to balance different classes during pre-training and fine-tuning, can improve the alignment of decision foundations between models and experts, as compared to a model directly trained on the target dataset. At the same time, these approaches keep perfect classification accuracy according to the area under the receiver operating characteristic curve (AUROC) on the test set, and improve generalization on an independent, unseen dataset. For the purpose of reproducibility, our source code is made available online.",
    pdf = "https://publications.idiap.ch/attachments/papers/2024/Guler_EUVIP24_2024.pdf"
}