*Result*: FANS: A framework for automatic assessment of nutritional status based on free-text clinical notes.
*Further Information*
*Background: Malnutrition is common in hospitalized patients. Timely and efficient nutrition support can improve clinical outcomes and reduce medical care costs. However, nutritional status is currently assessed through one-by-one bedside patient interviews, which are labor-intensive and time-consuming. Machine learning has shown promise in clinical decision support, but in real clinical settings it is constrained by the limited number of labeled samples.
Objective: We aimed to develop and validate an approach that automatically identifies malnutrition risk factors from free-text clinical notes and rapidly assesses the nutritional status of hospitalized patients when labeled data are limited.
Methods: The clinical notes of 1,469 patients used in this study were collected from the Peking Union Medical College Hospital. Of these, 495 hospitalized patients were recruited on admission and 974 patients were collected retrospectively. The proposed Framework for automatic Assessment of Nutritional Status (FANS) consists of two components: a risk factor identification component and a nutritional status assessment component. For risk factor identification, the notes were annotated according to the criteria of the Global Leadership Initiative on Malnutrition (GLIM), the Subjective Global Assessment (SGA), and the European Society for Clinical Nutrition and Metabolism (ESPEN) guidelines for nutrition screening 2002 (NRS2002). Six natural language processing models were evaluated and compared. For nutritional status assessment, a semi-supervised learning (SSL) mechanism based on the self-learning algorithm (SLA) was applied. The Shapley Additive exPlanations (SHAP) method was further employed to explore interpretability.
Results: For risk factor identification, all pre-trained models outperformed the traditional machine learning models, and the best model achieved an F1 score of 0.9142. For nutritional status assessment, the SLA-based SSL improved performance by introducing high-confidence samples, and the area under the receiver operating characteristic curve (AUROC) of the optimized model reached 0.8234. The SHAP analysis identified unintentional weight loss and decreased food intake as the most impactful features recorded in the clinical notes.
Conclusion: This study presented a feasible NLP-based approach for nutritional risk assessment, and demonstrated the potential of machine learning for high-performance healthcare artificial intelligence services.
(Copyright © 2025 The Authors. Published by Elsevier B.V. All rights reserved.)*
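The self-learning step summarized in the Methods can be illustrated with a minimal sketch of generic self-training: fit on the labeled set, pseudo-label the unlabeled samples whose predicted confidence clears a threshold, add them to the training set, and repeat. The abstract does not state the base classifier, the confidence threshold, or the stopping rule, so everything below is an assumption for illustration only: the nearest-centroid classifier, the inverse-distance confidence score, the `threshold=0.9` default, and the function names `fit_centroids`, `predict_proba`, and `self_train` are hypothetical and are not the authors' implementation. The numbers are toy values, not study data.

```python
import math
from statistics import mean


def fit_centroids(X, y):
    """Fit a toy nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict_proba(centroids, x):
    """Turn inverse distances to each centroid into normalized confidences."""
    inv = {lab: 1.0 / (math.dist(x, c) + 1e-9) for lab, c in centroids.items()}
    total = sum(inv.values())
    return {lab: v / total for lab, v in inv.items()}


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Generic self-training loop (assumed form, not the paper's exact SLA)."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        centroids = fit_centroids(X_lab, y_lab)
        remaining, added = [], 0
        for x in pool:
            proba = predict_proba(centroids, x)
            label, conf = max(proba.items(), key=lambda kv: kv[1])
            if conf >= threshold:
                # High-confidence sample: accept the pseudo-label.
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:
                # Low-confidence sample: keep it unlabeled for the next round.
                remaining.append(x)
        pool = remaining
        if added == 0 or not pool:
            break
    return fit_centroids(X_lab, y_lab), X_lab, y_lab, pool


# Toy usage: four labeled points, three unlabeled points.
model, X_all, y_all, leftover = self_train(
    [[0.0], [1.0], [9.0], [10.0]], [0, 0, 1, 1],
    [[0.5], [9.5], [5.0]],
)
```

In this toy run the two unlabeled points near a class centroid are pseudo-labeled, while the point midway between the classes stays in `leftover` because its confidence never reaches the threshold. That is the sense in which self-training introduces only high-confidence samples.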
*Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.*