*Result*: Reproducing real-world clinical prediction models using the DIVE platform: A comparative validation study across three chronic diseases.

Authors:
Lapi F; Genomedics Srl, Florence, Italy. Electronic address: francesco.lapi@genomedics.it., Marconi E; Genomedics Srl, Florence, Italy., Gorini M; Innovation Unit & Business Excellence, AstraZeneca, Milan, Italy., Nuti L; Genomedics Srl, Florence, Italy., Medea G; Genomedics Srl, Florence, Italy., Cricelli I; Genomedics Srl, Florence, Italy.
Source:
International journal of medical informatics [Int J Med Inform] 2026 Apr 15; Vol. 210, pp. 106303. Date of Electronic Publication: 2026 Jan 22.
Publication Type:
Journal Article; Comparative Study; Validation Study
Language:
English
Journal Info:
Publisher: Elsevier Science Ireland Ltd Country of Publication: Ireland NLM ID: 9711057 Publication Model: Print-Electronic Cited Medium: Internet ISSN: 1872-8243 (Electronic) Linking ISSN: 13865056 NLM ISO Abbreviation: Int J Med Inform Subsets: MEDLINE
Imprint Name(s):
Original Publication: Shannon, Co. Clare, Ireland : Elsevier Science Ireland Ltd., c1997-
Contributed Indexing:
Keywords: Analyses platform; Data analysis; Real-world data; Real-world evidence
Entry Date(s):
Date Created: 20260127 Date Completed: 20260223 Latest Revision: 20260223
Update Code:
20260223
DOI:
10.1016/j.ijmedinf.2026.106303
PMID:
41592418
Database:
MEDLINE

*Further Information*

*Objectives: The aim of this analysis is to evaluate the performance and reproducibility of the Data Insight Validation Engine (DIVE), a modular analytics interface implemented in Python to facilitate real-world evidence (RWE) generation from clinical (e.g., primary care) data. The platform was used to replicate three previously published studies on chronic kidney disease (CKD), chronic obstructive pulmonary disease (COPD), and severe asthma, each originally conducted using conventional statistical environments.
Methods: Using a primary care data source, DIVE was employed to replicate three studies on the development and validation of prediction scores using machine learning (ML) and traditional inferential analyses: an ML-based Generalized Additive<sup>2</sup> Model (GA<sup>2</sup>M) predicting CKD, and two Cox-based regression models for COPD exacerbations (CEX-HScore) and severe asthma (AS-HScore). The data covered over one million patients under the care of approximately 800 general practitioners (GPs) in Italy. Although the initial studies were carried out between 2013 and 2021, the DIVE-based investigations extended from 2013 to 2022, thereby also demonstrating "external" temporal validation. Results obtained via DIVE were compared with the original published findings.
Results: DIVE demonstrated high fidelity in replicating published results. Using GA<sup>2</sup>M, the CKD model achieved largely consistent discrimination (AUC: 89.2% vs. 89.3%) and average precision (22.1% vs. 22.4%). The COPD model showed an AUC of 65.5%, a pseudo-R<sup>2</sup> of 12.7%, and a calibration slope of 1.01 (p = 0.317), which were consistent with the original CEX-HScore (AUC: 66%; pseudo-R<sup>2</sup>: 13%; calibration slope: 1.03 (p = 0.345)). For severe asthma, the prediction model exhibited an AUC of 71.9%, a pseudo-R<sup>2</sup> of 17.6%, and a calibration slope of 1.09 (p = 0.211), again aligned with the original AS-HScore (AUC: 72.5%; pseudo-R<sup>2</sup>: 18%; calibration slope: 1.12 (p = 0.182)).
Conclusion: DIVE represents a reliable, scalable, and interoperable solution for RWE analytics, demonstrating equivalence with traditional analytic methods and aligning with best practices in data reproducibility. Continued development toward integrating federated (multi-database) analysis protocols and broader interoperability might expand its utility across several clinical domains.
(Copyright © 2026 Elsevier B.V. All rights reserved.)*
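
The agreement reported in the Results can be checked with simple arithmetic. The following is a minimal sketch, not part of the published analysis or of DIVE itself: it tabulates the metric pairs quoted in the abstract and computes the absolute gap for each. The names `REPORTED`, `absolute_gaps`, `gaps`, and `largest_key` are hypothetical, and the ordering of each pair as (DIVE, original) is an assumption for the CKD values, which the abstract gives only as "x vs. y".

```python
# Metric pairs quoted in the abstract, stored as (DIVE replication, original).
# Assumption: for CKD the abstract does not label which value is which;
# absolute differences do not depend on that ordering.
REPORTED = {
    ("CKD", "AUC (%)"): (89.2, 89.3),
    ("CKD", "Average precision (%)"): (22.1, 22.4),
    ("COPD", "AUC (%)"): (65.5, 66.0),
    ("COPD", "Pseudo-R2 (%)"): (12.7, 13.0),
    ("COPD", "Calibration slope"): (1.01, 1.03),
    ("Severe asthma", "AUC (%)"): (71.9, 72.5),
    ("Severe asthma", "Pseudo-R2 (%)"): (17.6, 18.0),
    ("Severe asthma", "Calibration slope"): (1.09, 1.12),
}


def absolute_gaps(pairs):
    """Return the absolute difference for each metric pair, rounded to 2 decimals."""
    return {key: round(abs(a - b), 2) for key, (a, b) in pairs.items()}


gaps = absolute_gaps(REPORTED)

# Largest gap among the metrics expressed in percent (percentage points).
largest_key = max((k for k in gaps if k[1].endswith("(%)")), key=gaps.get)

for (model, metric), gap in gaps.items():
    print(f"{model:14s} {metric:22s} gap = {gap}")
print("Largest percentage-point gap:", largest_key, gaps[largest_key])
```

Under these assumptions, every percentage metric differs by less than one percentage point and the calibration slopes differ by at most 0.03, which matches the abstract's description of the replication as consistent with the original findings.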

*Declaration of competing interest: The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: FL, EM, and IC provided consultations in protocol preparation for epidemiological studies and data analyses for AstraZeneca, Boehringer Ingelheim, GSK, and Chiesi. GM provided clinical consultancies for AstraZeneca, Boehringer Ingelheim, Novo Nordisk, GSK, and Chiesi. MG is an employee of AstraZeneca.*