Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9: a systematic review and individual participant data meta-analysis

Yin Wu, Brooke Levis, Kira E Riehm, Nazanin Saadat, Alexander W Levis, Marleine Azar, Danielle B Rice, Jill Boruff, Pim Cuijpers, Simon Gilbody, John P A Ioannidis, Lorie A Kloda, Dean McMillan, Scott B Patten, Ian Shrier, Roy C Ziegelstein, Dickens H Akena, Bruce Arroll, Liat Ayalon, Hamid R Baradaran, Murray Baron, Charles H Bombardier, Peter Butterworth, Gregory Carter, Marcos H Chagas, Juliana C N Chan, Rushina Cholera, Yeates Conwell, Janneke M de Man-van Ginkel, Jesse R Fann, Felix H Fischer, Daniel Fung, Bizu Gelaye, Felicity Goodyear-Smith, Catherine G Greeno, Brian J Hall, Patricia A Harrison, Martin Härter, Ulrich Hegerl, Leanne Hides, Stevan E Hobfoll, Marie Hudson, Thomas Hyphantis, M D Inagaki, Nathalie Jetté, Mohammad E Khamseh, Kim M Kiely, Yunxin Kwan, Femke Lamers, Shen-Ing Liu, Manote Lotrakul, Sonia R Loureiro, Bernd Löwe, Anthony McGuire, Sherina Mohd-Sidik, Tiago N Munhoz, Kumiko Muramatsu, Flávia L Osório, Vikram Patel, Brian W Pence, Philippe Persoons, Angelo Picardi, Katrin Reuter, Alasdair G Rooney, Iná S Santos, Juwita Shaaban, Abbey Sidebottom, Adam Simning, M D Stafford, Sharon Sung, Pei Lin Lynnette Tan, Alyna Turner, Henk C van Weert, Jennifer White, Mary A Whooley, Kirsty Winkley, Mitsuhiko Yamada, Andrea Benedetti, Brett D Thombs

Research output: Contribution to journal, Review article, peer-review

Abstract

BACKGROUND: Item 9 of the Patient Health Questionnaire-9 (PHQ-9) asks about thoughts of death and self-harm, but not suicidality. Although it is sometimes used to assess suicide risk, most positive responses are not associated with suicidality. The PHQ-8, which omits Item 9, is thus increasingly used in research. We assessed the correlation between PHQ-8 and PHQ-9 total scores and the equivalency of their diagnostic accuracy for detecting major depression.

METHODS: We conducted an individual participant data meta-analysis. We fit bivariate random-effects models to assess diagnostic accuracy.

RESULTS: 16 742 participants (2097 major depression cases) from 54 studies were included. The correlation between PHQ-8 and PHQ-9 scores was 0.996 (95% confidence interval 0.996 to 0.996). The standard cutoff score of 10 for the PHQ-9 maximized sensitivity + specificity for the PHQ-8 among studies that used a semi-structured diagnostic interview reference standard (N = 27). At cutoff 10, the PHQ-8 was less sensitive by 0.02 (-0.06 to 0.00) and more specific by 0.01 (0.00 to 0.01) among those studies (N = 27), with similar results for studies that used other types of interviews (N = 27). For all 54 primary studies combined, across all cutoffs, the PHQ-8 was less sensitive than the PHQ-9 by 0.00 to 0.05 (0.03 at cutoff 10), and specificity was within 0.01 for all cutoffs (0.00 to 0.01).

CONCLUSIONS: PHQ-8 and PHQ-9 total scores were similar. Sensitivity may be minimally reduced with the PHQ-8, but specificity is similar.
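The comparison described in the abstract can be illustrated with a minimal Python sketch. It shows how a PHQ-8 total is obtained by dropping Item 9 from a PHQ-9 response, and how sensitivity and specificity at cutoff 10 are computed for each version. The function names and the six participants are hypothetical, invented only for illustration; the sketch computes raw proportions in a single toy sample and does not implement the bivariate random-effects models used in the study.

```python
from typing import Sequence, Tuple


def phq9_total(items: Sequence[int]) -> int:
    """Sum all nine item scores (each item is scored 0 to 3)."""
    if len(items) != 9 or any(not 0 <= x <= 3 for x in items):
        raise ValueError("expected nine item scores, each between 0 and 3")
    return sum(items)


def phq8_total(items: Sequence[int]) -> int:
    """Sum the first eight items, omitting Item 9."""
    if len(items) != 9 or any(not 0 <= x <= 3 for x in items):
        raise ValueError("expected nine item scores, each between 0 and 3")
    return sum(items[:8])


def sens_spec(scores: Sequence[int], is_case: Sequence[bool],
              cutoff: int = 10) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for 'score >= cutoff' as a positive screen."""
    tp = sum(1 for s, c in zip(scores, is_case) if c and s >= cutoff)
    fn = sum(1 for s, c in zip(scores, is_case) if c and s < cutoff)
    tn = sum(1 for s, c in zip(scores, is_case) if not c and s < cutoff)
    fp = sum(1 for s, c in zip(scores, is_case) if not c and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: (nine item scores, major depression per reference standard).
participants = [
    ([2, 2, 1, 2, 1, 1, 1, 0, 1], True),
    ([2, 1, 1, 1, 1, 1, 1, 1, 1], True),
    ([1, 1, 1, 0, 0, 1, 0, 0, 0], False),
    ([2, 2, 1, 1, 1, 1, 1, 0, 1], False),
    ([3, 3, 2, 2, 2, 1, 1, 1, 0], True),
    ([0, 1, 0, 1, 0, 0, 0, 0, 0], False),
]

cases = [case for _, case in participants]
scores9 = [phq9_total(items) for items, _ in participants]
scores8 = [phq8_total(items) for items, _ in participants]

sens9, spec9 = sens_spec(scores9, cases, cutoff=10)
sens8, spec8 = sens_spec(scores8, cases, cutoff=10)

print(f"PHQ-9: sensitivity={sens9:.2f}, specificity={spec9:.2f}")
print(f"PHQ-8: sensitivity={sens8:.2f}, specificity={spec8:.2f}")
```

In this toy sample, two participants score exactly 10 on the PHQ-9 only because of a positive Item 9 response, so dropping that item moves them below the cutoff. That is the mechanism by which the PHQ-8 can be slightly less sensitive and slightly more specific at the same cutoff. The size of the effect here is an artifact of the invented data and says nothing about the magnitudes reported above.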

Original language: English
Pages (from-to): 1368-1380
Number of pages: 13
Journal: Psychological Medicine
Volume: 50
Issue number: 8
Early online date: 12 Jul 2019
DOIs
Publication status: Published - Jun 2020

Keywords

  • Depression
  • diagnostic accuracy
  • individual participant data meta-analysis
  • meta-analysis
  • PHQ-8
  • PHQ-9
  • screening
  • systematic review
