Self-Supervised Vision Transformers for Medical Image Segmentation with Limited Annotations

Priya Patel; Xiaofeng Liu; Thomas Müller

doi:10.55001/faids.v1i1.44

Review Article

Published: 2026-05-01. DOI: https://doi.org/10.55001/faids.v1i1.44. Vol. 1 No. 1 (2026)

Abstract

Annotating medical images for segmentation is expensive and requires domain expertise. We propose MedSSL-ViT, a self-supervised pre-training framework for Vision Transformers (ViT) tailored to medical imaging. MedSSL-ViT combines masked image modeling with anatomy-aware contrastive learning, exploiting the structured nature of medical images. Pre-trained on 850K unlabeled chest X-rays and CT slices, the model achieves state-of-the-art segmentation performance on four downstream tasks using only 10% of the annotations: lung segmentation (Dice: 97.2%), cardiac chamber segmentation (Dice: 93.5%), liver tumor segmentation (Dice: 78.8%), and retinal vessel segmentation (Dice: 82.1%). With just 1% of the labels, MedSSL-ViT still outperforms fully supervised baselines trained on 100% of the labels by 2-5% in Dice score.
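
All results in the abstract are reported as Dice scores. For readers unfamiliar with the metric, the following is a minimal sketch of the standard Dice coefficient for binary masks, 2|A ∩ B| / (|A| + |B|). The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the abstract does not state how the authors compute or average Dice.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape.

    Hypothetical helper for illustration only, not the authors' evaluation code.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    # Number of pixels labeled foreground in both masks.
    intersection = np.logical_and(pred, target).sum()
    # Sum of the foreground pixel counts of each mask.
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)


# Toy 2x2 example: the prediction has two foreground pixels,
# the reference has one, and they overlap in one pixel.
example_pred = [[1, 1], [0, 0]]
example_target = [[1, 0], [0, 0]]
example_dice = dice_score(example_pred, example_target)  # 2 * 1 / (2 + 1)
```

In the toy example the score is 2/3, so a reported Dice of 97.2% corresponds to a value of 0.972 on this scale.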

Author Biographies

Priya Patel Department of Biomedical Informatics, Stanford University, Stanford, CA 94305, USA

Priya Patel is an associate professor in the Department of Biomedical Informatics, Stanford University, Stanford, CA 94305, USA. Their research focuses on biomedical engineering, and they have over 31 publications in peer-reviewed journals.
Xiaofeng Liu School of Computer Science, Fudan University, Shanghai 200433, China

Xiaofeng Liu is an associate professor in the School of Computer Science, Fudan University, Shanghai 200433, China. Their research focuses on energy systems, and they have over 70 publications in peer-reviewed journals.
Thomas Müller German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany

Thomas Müller is a research fellow at the German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany. Their research focuses on energy systems, and they have over 71 publications in peer-reviewed journals.

Self-Supervised Vision Transformers for Medical Image Segmentation with Limited Annotations. (2026). Frontiers of Artificial Intelligence and Data Science (人工智能与数据科学前沿), 1(1). https://doi.org/10.55001/faids.v1i1.44
