Opportunities of natural language processing for comparative judgment assessment of essays

Michiel De Vrindt*, Anaïs Tack, Wim Van den Noortgate, Marije Lesterhuis, Renske Bouwer

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Comparative judgment (CJ) is an assessment method commonly used for assessing essay quality, in which assessors compare pairs of essays and judge which essay in each pair is of higher quality. A psychometric model then converts these judgments into quality scores. Although CJ yields reliable and valid scores, its widespread implementation in educational practice is hindered by its inefficiency and limited feedback capabilities. This conceptual study explores how Natural Language Processing (NLP) can address these limitations, drawing upon existing NLP techniques and the very limited research on their integration within CJ. More specifically, we argue that, at the start of the assessment, initial essay quality scores could be predicted from the essay texts using NLP, mitigating the cold-start problem of CJ. During the CJ assessment, NLP-based selection rules could choose which pairs to present, efficiently increasing the reliability of the scores while sparing assessors comparisons that are too difficult. After the CJ assessment, NLP could automate feedback, helping to clarify how assessors arrived at their judgments and explaining the scores to assessees (students). To support future research, we give an overview of appropriate methods based on existing research and highlight important considerations for each opportunity. Ultimately, we contend that integrating NLP into CJ can significantly improve the efficiency and transparency of the assessment method while preserving the crucial role of human assessors in evaluating writing quality.
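
As a rough illustration of how pairwise judgments can be turned into quality scores, the sketch below fits a Bradley-Terry model with the standard minorization-maximization updates. The abstract does not name the psychometric model, so the choice of Bradley-Terry is an assumption, as are the function name `fit_bradley_terry`, the `pseudo` smoothing parameter, and the toy judgments. It is not the authors' implementation.

```python
import math


def fit_bradley_terry(n_items, judgments, n_iter=500, pseudo=0.1):
    """Estimate centered log-strength scores from (winner, loser) index pairs.

    Assumption: a Bradley-Terry model fitted with minorization-maximization
    (MM) updates. `pseudo` adds a small virtual win and loss against a phantom
    essay of strength 1, so essays that never win or never lose still get
    finite scores.
    """
    wins = [pseudo] * n_items  # virtual win against the phantom essay
    pair_counts = {}           # (i, j) with i < j -> number of comparisons
    for winner, loser in judgments:
        wins[winner] += 1.0
        key = (min(winner, loser), max(winner, loser))
        pair_counts[key] = pair_counts.get(key, 0) + 1

    strength = [1.0] * n_items
    for _ in range(n_iter):
        # Two virtual comparisons (one win, one loss) against the phantom.
        denom = [2.0 * pseudo / (s + 1.0) for s in strength]
        for (i, j), n in pair_counts.items():
            shared = n / (strength[i] + strength[j])
            denom[i] += shared
            denom[j] += shared
        strength = [w / d for w, d in zip(wins, denom)]

    # Report scores on the log scale, centered so they sum to zero.
    logits = [math.log(s) for s in strength]
    mean = sum(logits) / n_items
    return [x - mean for x in logits]


# Hypothetical judgments for three essays: each tuple is (winner, loser).
toy_judgments = [
    (0, 1), (0, 1), (1, 0),
    (1, 2), (1, 2), (2, 1),
    (0, 2), (0, 2), (0, 2),
]
scores = fit_bradley_terry(3, toy_judgments)
ranking = sorted(range(3), key=lambda i: scores[i], reverse=True)
```

In this toy data, essay 0 wins most of its comparisons and essay 2 loses most of them, so `ranking` orders the essays from 0 to 2. An NLP-predicted initial score, as proposed in the abstract, would most naturally enter such a model as prior information, but how to do that is left open here.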

Original language: English
Article number: 100414
Number of pages: 12
Journal: Computers and Education: Artificial Intelligence
Volume: 8
Publication status: Published - Jun 2025

Keywords

  • Automated essay scoring
  • Comparative judgment
  • Hybrid human-AI
  • Natural language processing
  • Partial-automation
