Whole slide image-level classification of malignant effusion cytology using clustering-constrained attention multiple instance learning
Pleural fluid cytology plays a critical role in the early detection and diagnosis of lung cancer, but its effectiveness is often hindered by low diagnostic accuracy and significant interobserver variability. While artificial intelligence (AI)-based approaches have been introduced to overcome these challenges, most existing models operate at the image-patch level rather than the whole-slide image (WSI) level. To address this gap, this study developed a WSI-level classification model for malignant effusions using a large, quality-controlled, nationwide dataset consisting of 576 benign and 309 cancer WSIs from pleural fluids. A clustering-constrained attention multiple-instance learning (CLAM) framework was implemented so that the model could be trained from slide-level labels alone.
The proposed CLAM model achieved an accuracy of 97% and an area under the curve (AUC) of 0.97, a 13% improvement over WSI classification built on a conventional image-patch classification model. The model also significantly reduced analysis time and computational resource requirements compared to traditional patch-level methods and full-scale heatmap generation. These results indicate that WSI-level classification using the CLAM framework can accurately differentiate malignant effusions. The approach could enhance the precision of cytopathological diagnostics, reduce variability due to human subjectivity, and streamline clinical workflows.
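To make the slide-level idea concrete, the following is a minimal sketch of attention-based multiple-instance pooling, the general mechanism that CLAM-style models build on. It is not the authors' implementation. It omits CLAM's gated attention branch and instance-level clustering loss, and it uses random weights where a real model would use trained ones. The function name `attention_mil_pool`, the projection `V`, the scoring vector `w`, and all dimensions are illustrative assumptions.

```python
import math
import random

def attention_mil_pool(patches, V, w):
    """Pool patch feature vectors into one slide embedding via attention.

    patches: list of n feature vectors, each of length d (one per image patch)
    V:       hypothetical h x d projection matrix (list of h rows)
    w:       hypothetical length-h scoring vector
    """
    # One scalar attention score per patch: w . tanh(V x)
    scores = []
    for x in patches:
        hidden = [math.tanh(sum(v * xj for v, xj in zip(row, x))) for row in V]
        scores.append(sum(wk * hk for wk, hk in zip(w, hidden)))

    # Softmax over patches (shifted by the max score for numerical stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Slide embedding = attention-weighted average of the patch features
    d = len(patches[0])
    slide = [sum(a * x[j] for a, x in zip(attn, patches)) for j in range(d)]
    return slide, attn

# Toy data standing in for the encoded patches of a single slide (assumption)
random.seed(0)
n, d, h = 50, 8, 4
patches = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
V = [[random.gauss(0, 0.5) for _ in range(d)] for _ in range(h)]
w = [random.gauss(0, 0.5) for _ in range(h)]

slide_embedding, attention = attention_mil_pool(patches, V, w)
```

In a full model, the slide embedding would feed a benign-versus-malignant classifier trained only on slide-level labels, and the attention weights would indicate which patches most influenced the prediction.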
Lung Cancer (2025) 108552; https://doi.org/10.1016/j.lungcan.2025.108552