TY - JOUR
T1 - Artificial Intelligence for early detection of lung cancer in General Practitioners' clinical notes
T2 - a retrospective observational cohort study
AU - Schut, Martijn C.
AU - Luik, Torec T.
AU - Vagliano, Iacopo
AU - Rios, Miguel
AU - Helsper, Charles W.
AU - van Asselt, Kristel M.
AU - de Wit, Niek
AU - Abu-Hanna, Ameen
AU - van Weert, Henk C.P.M.
N1 - Publisher Copyright: © 2025 Royal College of General Practitioners. All rights reserved.
PY - 2025/5
Y1 - 2025/5
N2 - Background: The journey of >80% of patients diagnosed with lung cancer starts in general practice. About 75% of patients are diagnosed when the cancer is at an advanced stage (3 or 4), leading to >80% mortality within 1 year at present. The long-term data in GP records might contain hidden information that could be used for earlier case finding of patients with cancer. Aim: To develop new prediction tools that improve the risk assessment for lung cancer. Design and setting: Text analysis of electronic patient data using natural language processing and machine learning in the general practice files of four networks in the Netherlands. Method: Files of 525 526 patients were analysed, of whom 2386 were diagnosed with lung cancer. Diagnoses were validated by using the Dutch cancer registry, and both structured and free-text data were used to predict the diagnosis of lung cancer 5 months before diagnosis (4 months before referral). Results: The algorithm could facilitate earlier detection of lung cancer using routine general practice data. Discrimination, calibration, sensitivity, and specificity were established under various cut-off points of the prediction 5 months before diagnosis. Internal validation of the best model demonstrated an area under the curve of 0.88 (95% confidence interval [CI] = 0.86 to 0.89), which shrank to 0.79 (95% CI = 0.78 to 0.80) during external validation. The desired sensitivity determines the number of patients to be referred to detect one patient with lung cancer. Conclusion: Artificial intelligence-based support enables earlier detection of lung cancer in general practice using readily available text in the patient files of GPs, but needs additional prospective clinical evaluation.
KW - early detection
KW - general practice
KW - lung cancer
KW - machine learning
KW - natural language processing
KW - oncology
UR - http://www.scopus.com/inward/record.url?scp=105004377121&partnerID=8YFLogxK
U2 - 10.3399/BJGP.2023.0489
DO - 10.3399/BJGP.2023.0489
M3 - Article
C2 - 40044183
SN - 0960-1643
VL - 75
SP - e316-e322
JO - The British journal of general practice : the journal of the Royal College of General Practitioners
JF - The British journal of general practice : the journal of the Royal College of General Practitioners
IS - 754
M1 - doi.org/10.3399/BJGP.2023.0489
ER -