DHQ: Digital Humanities Quarterly
Editorial

Responsible AI and the Middle Ages: Detecting Historical Toxicity in Medieval Datasets

DOI: pending

Abstract

The increasing reliance on open-source datasets for training large language models has revealed a critical oversight in artificial intelligence development: the presence of historical toxicity embedded within canonical literary texts. This article examines the application of contemporary toxicity detection models to the Chanson de Roland, one of the foundational works of medieval French literature, exposing significant challenges in identifying hate speech, violence advocacy, and discriminatory content within historical documents. Using the multilingual Detoxify model on Joseph Bédier's modern French translation, I analyze 2,605 sentences to assess how toxicity detection models trained primarily on contemporary social media content perform when evaluating medieval literature. My findings reveal a troubling pattern: while the model successfully flags some explicit threats and insults, it systematically fails to detect the text's most problematic content, including religious misrepresentation, forced conversion narratives, and anti-Black racism. These “false negatives” represent a fundamental problem for AI models trained on historical open data, as such models risk perpetuating and amplifying centuries-old discriminatory frameworks while appearing objective. This research contributes to urgent debates about responsible AI development, arguing that without human-annotated ground truth datasets designed specifically for toxicity in historical texts, current LLMs risk distorting historical understanding and disseminating undetected violent and racist discrimination at an unprecedented scale. I advocate for interdisciplinary collaboration between computer scientists and humanities scholars to develop ethical frameworks for curating historical datasets that acknowledge their toxic content without either sanitizing history or amplifying historical harm.

