The rapid, widespread dissemination of medical misinformation on social media platforms poses a significant threat to public health, contributing to treatment delays, vaccine hesitancy, and the adoption of harmful practices. When fine-tuning pre-trained language models to detect Chinese microblog health rumors, addressing overfitting and prediction instability is crucial, especially given the short, emotionally charged, and noisy nature of public health surveillance data.

This paper proposes a simple yet effective training strategy to enhance model generalization and robustness in detecting Chinese microblog health rumors. By integrating R-Drop consistency regularization with robust loss functions, this study seeks to establish a stronger, more reliable performance baseline for identifying and mitigating the impact of harmful health-related rumors, thereby contributing to a healthier online information ecosystem.

We fine-tuned two strong pre-trained Chinese language models, RoBERTa-wwm-ext and MacBERT, on the public Chinese Emergency Corpus (CED) dataset. Our approach introduces R-Drop consistency regularization, which performs two forward passes of each input with dropout enabled and minimizes the symmetric Kullback-Leibler (KL) divergence between the resulting predictions. To handle noisy labels and class imbalance, we compared standard Cross-Entropy loss against Focal Loss and Symmetric Cross-Entropy (SCE), using a stable learning-rate schedule and early stopping to prevent overfitting.

MacBERT consistently outperformed RoBERTa-wwm-ext in health rumor detection. R-Drop consistency regularization yielded stable, significant improvements in both accuracy and F1-score across all configurations. The optimal configuration, pairing MacBERT with standard Cross-Entropy loss and R-Drop, achieved a peak test accuracy of 0.9198.
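The R-Drop objective described above can be sketched per example as an average cross-entropy over the two dropout passes plus a symmetric KL penalty on their disagreement. The following pure-Python sketch is illustrative only (function names and the weighting coefficient `alpha` are our assumptions, not taken from the paper's code):

```python
import math

def _kl(p, q):
    # KL(p || q) for two discrete probability distributions given as lists
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def r_drop_loss(p1, p2, label, alpha=1.0):
    """Illustrative per-example R-Drop objective.

    p1, p2: class-probability vectors from two stochastic (dropout-enabled)
    forward passes of the same input; label: gold class index.
    """
    # Average cross-entropy over the two passes
    ce = -0.5 * (math.log(p1[label]) + math.log(p2[label]))
    # Symmetric KL divergence penalizes disagreement between the two passes
    sym_kl = 0.5 * (_kl(p1, p2) + _kl(p2, p1))
    return ce + alpha * sym_kl
```

When the two passes agree exactly, the penalty vanishes and the objective reduces to ordinary cross-entropy; any disagreement between the dropout sub-models strictly increases the loss, which is what drives the consistency effect.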
Notably, a well-regularized standard Cross-Entropy loss proved more effective for this dataset and task than the specialized robust loss functions.

This research establishes that R-Drop consistency regularization offers a computationally efficient, easily implemented, and highly effective training paradigm for advancing Chinese microblog health rumor detection. The proposed methodology provides a strong, reproducible, and high-performing baseline, offering valuable guidance for public health authorities and social media platforms in combating medical misinformation and enhancing automated systems for safeguarding public health.
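For reference, the two robust loss functions compared against Cross-Entropy can be sketched per example as follows. This is a minimal illustration; the hyperparameter defaults (`gamma`, `alpha`, `beta`, and the clipped log value) are common choices from the Focal Loss and SCE literature, not values reported by this study:

```python
import math

def focal_loss(p, label, gamma=2.0):
    # Focal Loss: -(1 - p_t)^gamma * log(p_t), down-weighting easy examples
    pt = p[label]
    return -((1.0 - pt) ** gamma) * math.log(pt)

def symmetric_ce(p, label, alpha=0.1, beta=1.0, clip=-4.0):
    # SCE = alpha * CE + beta * RCE; RCE swaps prediction and one-hot label,
    # with log(0) on the label side clipped to `clip` for numerical stability
    ce = -math.log(p[label])
    rce = -sum(p[k] * (0.0 if k == label else clip) for k in range(len(p)))
    return alpha * ce + beta * rce
```

With `gamma = 0`, Focal Loss reduces to standard cross-entropy, which makes the comparison in the results above a controlled one: the finding is that the extra robustness machinery did not pay off once R-Drop and careful regularization were in place.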
