Articles
Do all politicians sound the same? Comparing model explanations to human responses
Otto Tarkka, University of Turku; Kimmo Elo; Filip Ginter; Veronika Laippala
Abstract
[en]
It is sometimes said that all politicians sound the same with their speeches mired
in political jargon full of clichés and false promises. To investigate how distinct
the plenary speeches of political parties truly are and what linguistic features make
them distinct, we trained a BERT classifier to predict the party affiliation of Finnish
members of parliament from their plenary speeches. We contrasted and compared model
performance to human responses to see how humans and the model differ in their ability
to distinguish between the parties. We used the model explainability method SHAP to
identify the linguistic cues that the model most relies on. We show that a deep learning
model can distinguish between parties much more accurately than the respondents to
the questionnaire. The SHAP explanations and questionnaire responses reveal that whereas
humans tend to rely on mostly topical cues, the model has learned to recognize other
cues as well, such as personal style and rhetoric.
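A minimal sketch of the kind of aggregation such a SHAP-based analysis involves: given per-token attribution scores for a text classifier, collect the tokens that most strongly push predictions toward one party. The tokens and scores below are invented for illustration, not drawn from the study.

```python
from collections import defaultdict

def top_cues(token_attributions, k=3):
    """Aggregate per-token attribution scores across speeches and return
    the k tokens that most strongly indicate the party in question.
    A positive score pushes the classifier toward that party."""
    totals = defaultdict(float)
    for token, score in token_attributions:
        totals[token] += score
    return sorted(totals, key=totals.get, reverse=True)[:k]

# Illustrative attributions for one party (invented values):
attrs = [("border", 0.8), ("taxes", 0.5), ("border", 0.6),
         ("climate", -0.4), ("welfare", 0.3)]
print(top_cues(attrs, k=2))  # ['border', 'taxes']
```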
The Eras Tour: Machine Learning for Dating
Historical Texts from Greco-Roman Egypt
Danushka Bandara, Fairfield University; Fatima Chowdhury, Fairfield University; John
Stow, Fairfield University; Adrian Gallant, Fairfield University; Habibul Huq, Fairfield
University; Giovanni Ruffini, Fairfield University
Abstract
[en]
Accurate dating of historical texts is essential for understanding cultural and
historical narratives. However, traditional methods, such as paleographic and
physical examination, can be subjective, costly, and potentially damaging to
manuscripts. This paper introduces a machine learning approach to predicting the
authorship dates of historical texts by using named entities — specifically,
person and place names — as temporal markers. Using a dataset from Trismegistos,
which includes metadata on the earliest and latest possible writing dates, we
apply regression models to estimate text origins. While linear models like Lasso
and Ridge Regression showed limited success, nonlinear models, including Random
Forest, XGBoost, and Neural Networks, performed significantly better, with
ensemble methods delivering the best results. The top-performing ensemble model
achieved a mean absolute error of 45.7 years, surpassing traditional techniques.
This study demonstrates the potential of named entities as temporal indicators
and the effectiveness of ensemble learning in capturing complex historical
patterns, offering a scalable, non-destructive alternative to traditional
methods.
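The evaluation metric reported above is straightforward to reproduce: with a ground-truth date taken as the midpoint of each text's earliest and latest possible writing dates, mean absolute error measures the average distance in years between prediction and target. A sketch with invented date ranges, not Trismegistos data:

```python
def midpoint_dates(ranges):
    """Ground-truth date per text: midpoint of its (earliest, latest)
    possible writing dates."""
    return [(lo + hi) / 2 for lo, hi in ranges]

def mean_absolute_error(y_true, y_pred):
    """Average absolute error in years, the metric reported for the models."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented attestation ranges (CE) and model predictions:
ranges = [(100, 200), (250, 350), (300, 400)]
y_true = midpoint_dates(ranges)             # [150.0, 300.0, 350.0]
y_pred = [180.0, 280.0, 340.0]
print(mean_absolute_error(y_true, y_pred))  # 20.0
```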
Open Tool Registries! Resolving the Directory Paradox with Wikidata
Till Grallert, Humboldt-Universität zu Berlin; Sophie Eckenstaler, Deutsches Dokumentationszentrum
für Kunstgeschichte – Bildarchiv Foto Marburg; Claus-Michael Schlesinger; Nicole Dresselhaus,
Humboldt-Universität zu Berlin; Isabell Trilling, Humboldt-Universität zu Berlin
Abstract
[en]
This paper introduces the conceptual framework for open and community-curated
tool registries, positing that such registries provide fundamental value to any
field of research by acting as curated knowledge bases about a community’s past
and current methodological practices as well as authority files for individual
tools. The modular framework of a basic data model, SPARQL queries, bash
scripts, and a prototypical web interface builds upon the well-established and
open infrastructures of Wikimedia, GitLab, and Zenodo for creating, maintaining,
sharing, curating, and archiving linked open data. We demonstrate the
feasibility of this framework by introducing our concrete implementation of a
tool registry for digital humanities, initially repurposing data from existing
silos, such as TAPoR and the SSH Open Marketplace, and retaining the established
TaDiRAH classification scheme while being open to communal editing in every
aspect.
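To give a flavor of the kind of SPARQL query such a Wikidata-backed registry relies on, here is a sketch that assembles a request for items of a given class with their English labels. The class QID passed in below is a placeholder chosen for illustration; the registry's actual data model and TaDiRAH mapping are defined by the paper's framework, not reproduced here.

```python
def build_tool_query(class_qid, limit=100):
    """Assemble a SPARQL query retrieving Wikidata items that are
    instances of (wdt:P31) the given class, with English labels."""
    return f"""
SELECT ?tool ?toolLabel WHERE {{
  ?tool wdt:P31 wd:{class_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()

# Q7397 ("software") stands in for whatever tool class the registry uses.
query = build_tool_query("Q7397")
print(query)
```

The query string can then be sent to the Wikidata Query Service endpoint; only the query construction is shown here.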
Polysemicolon; Novice Programmers and Java Keywords
Briana Bettin, Michigan Technological University
Abstract
[en]
Industry standard programming languages often leverage the English language for reserved
keywords – words interpreted as specific execution commands for the compiled program.
The Java programming language is no exception to using English reserved keywords,
and is used widely in industrial and educational settings.
The expert-novice programmer divide exemplifies an intriguing middle-ground for navigating
metaphor and highlighting polysemic interpretations of keywords. For experts, keywords
become “dead metaphor” (drawn from “career of metaphor” theory). That is, the expert
sees the keyword – often the entire grammatical construct with it – and derives programmatic
meaning near instantly. For the expert, there is rare consideration of alternative
English-language interpretations. Novices, however, in attempting to first navigate
programming, may use these English keywords as familiar landmarks in unfamiliar terrain.
Amidst a sea of semicolons, single letter variables, and math operators, they may
gravitate to familiar words such as “if” or “while” to derive meaning.
Attempts by novices to create meaning using these English definitions can, however,
result in potential misconceptions. The word “for” as a preposition has over a dozen
distinct definitions. Which definition should a novice programmer use to achieve understanding
in learning to program, and what misconceptions may they develop through alternatives
to the “correct” choice? For some keywords, there may be no completely “correct” definition.
While this ambiguity can create a myriad of interpretations for critical code studies,
it can provide pitfalls for those first learning to program. This essay samples several
keyword interpretations that novice programmers may derive from the Java language’s
keywords and how polysemous meaning may affect their interpretation. Through observation
of students in their CS1 class, the author began exploring how polysemy, linguistics,
and metaphoric interpretations may affect understanding in beginner courses. These
students are largely native English speakers, highlighting that understanding rifts
exist even for native and colloquial speakers.
Code snippets are explored with both compiled keyword meanings and potential understood
meanings. This provides insight into pathways for student reasoning and navigation
in the programming landscape. The myriad of potential conclusions or definitions are
contrasted against the compiler’s singular interpretation, and how the polysemic potential
of natural language falls to singular dead metaphor in experts. This stark difference
between natural linguistics, critical code analysis, and compiled code meaning highlights
contrasts between programming and natural languages, in addition to highlighting paradigm
shifts that may occur in pursuit of expertise.
Rhetorical Strategies of Naming Practices in Code
Kevin Brock, University of South Carolina
Abstract
[en]
The rhetorical significance of naming practices is widely understood, but it, like
many other rhetorical dimensions of language, is often overlooked in the domain
of software development, especially in regard to code languages and relevant practices
(as demonstrated in file names, functions, variables, and so on). While naming conventions
in code are typically recognized as inherently arbitrary, they are also tangled up
in numerous networks of community expectations, constraints, and mores, whether organizational
or interpersonally social in nature. Given Kenneth Burke's argument for the revealing
and concealing influences of terministic screens upon our engagement with the world
(by establishing ways of seeing and not seeing), naming conventions in code play an
important role in how meaningful invention occurs for human developers and readers
of code files. Despite the apparent triviality of such a component of software projects,
naming practices shine a light on the goals and values of a programmer in addition
to the functional intentions that they might have for the use of a given body of code.
The Epistemology of Code in the Age of Machine Learning
Evan Buswell
Abstract
[en]
Code is an epistemic system predicated on the repression of state, but with the rise
of global optimization and machine learning algorithms, code functions just as much
to obscure knowledge as to reveal it. Code is constructed in response to two characteristics
of the twentieth century episteme. First, knowledge is represented as a process. Second,
this representation must be sufficient, such that its meaning is constituted by the
representational form itself. In attempting to meet these requirements, process is
separated into an essential part, code, and an inessential part, state. Although code
has a relationship with state, in order to construct code as an epistemic object,
state is limited and suppressed. This construction begins with the first formation
of code in the 1940s and reaches its modern form in the structured programming movement
of the later 1960s. But now, with the increasing dominance of global optimization
and machine learning algorithms in computing, it has become apparent that state is
vitally important, and yet our tools for understanding state are inadequate. This
epistemic inadequacy nevertheless serves those who would act dangerously and shun
responsibility for the consequences.
Playing in the Gap: Analog Programming and the First Video Game Console
Zachary Horton, University of Pittsburgh; Levi Burner, University of Maryland
Reading Note G: Ada Lovelace and the Clerical Labor of Codework
Zachary Mann, University of Southern California
Abstract
[en]
This article examines human-machine co-authorship as it is represented by Ada Lovelace
in her famous translation of and appendices to L. F. Menabrea’s “Sketch of The Analytical
Engine Invented by Charles Babbage.” Lovelace's
translation notes and correspondence with Charles Babbage are read alongside the
Engine itself, as a platform, and the histories of calculating engines, software
development, and nineteenth-century clerical labor. Inspired by critical code
studies, the article performs a close reading of what is today referred to as the
“first computer program,” a sequence of steps that Lovelace
adds to her translation’s final “Note G” as an example of
something the Engine can do. Ultimately, the article argues that the gendered power
structures of collaborative work in Lovelace’s time — and the challenges women
authors faced in the nineteenth century more broadly — influenced understandings of
machine programming, and that Lovelace’s representation of the human-machine
relationship in the first programmable calculating machine complicates the
organizational structures of both clerical labor and software design.
Defactoring Pace of Change
Matt Burton, University of Pittsburgh; Joris Van Zundert, Huygens Institute of the
Royal Netherlands Academy of Arts and Sciences
Abstract
[en]
Code, the symbolic representation of computer instructions driving software, has long
been a part of research methods in literary scholarship. However, the bespoke code
of data- and computationally inflected Digital Humanities research is not always a part
of the final publication. We emphasize the need to elevate code from its generally
invisible status in scholarly publications and make it a visible research output. We
highlight the lack of conventions and practices for theorizing, critiquing, and peer
reviewing bespoke code in the humanities, as well as the insufficient support for the
dissemination and preservation of code in scholarly publishing. We introduce “defactoring”
as a method for analyzing and reading code used
in humanities research and present a case study of applying this technique to a
publication from literary studies. We explore the implications of code as methodology
made material, advocating for a more integrated and computationally informed mode of
interacting with scholarship. We conclude by posing questions about the potential
benefits and challenges of linking code and theoretical exposition to foster a more
robust scholarly dialogue.
Evaluating and Understanding the Geocoding of City
Directories of Paris (1787-1914): Data-Driven Geography of Urban Sprawl and
Densification
Julie Gravier, Laboratoire ThéMA UMR 6049, CNRS, Université Marie et Louis Pasteur;
Stéphane Baciocchi, Centre de Recherches Historiques, EHESS-CNRS UMR 8558; Pascal
Cristofoli, Centre de Recherches Historiques, EHESS-CNRS UMR 8558; Bertrand Duménieu,
Centre de Recherches Historiques, EHESS-CNRS UMR 8558; Edwin Carlinet, Laboratoire
de Recherche de l'EPITA; Joseph Chazalon, Laboratoire de Recherche de l'EPITA; Nathalie
Abadie, Université Gustave Eiffel, ENSG, IGN, LASTIG; Solenn Tual, Université Gustave
Eiffel, ENSG, IGN, LASTIG; Julien Perret, Université Gustave Eiffel, ENSG, IGN, LASTIG
Abstract
[en]
As in other western cities, the fast-paced urban, industrial, and commercial sprawl
of Paris during the 19th century provided the backdrop and driving force for the
publishing phenomenon of trade directories. We show how these collections of millions
of nominative entries associated with addresses can be turned into a serial dataset
whose massive, fine-grained, and geolocated nature opens up new possibilities for
quantitative and multi-scale analyses of the dynamics at play during one of the most
dramatic socio-spatial transformations of the city. We highlight the methodological
conditions of such data-driven analyses and emphasize the importance of understanding
source effects. The findings underscore the significance of data science in
critically evaluating digital sources and adhering to best practices in the
production of large historical datasets.
Assemblies of Points: Strategies for
Art-historical Human Pose Estimation and Retrieval
Stefanie Schneider, LMU Munich
Abstract
[en]
This paper attempts to construct a virtual space of possibilities for the
historical embedding of the human figure, and its posture, in the visual arts by
proposing a view-invariant approach to Human Pose Retrieval (HPR) that resolves
the ambiguity of projecting three-dimensional postures onto their
two-dimensional counterparts. In addition, we present a refined approach for
classifying human postures using a support set of 110 art-historical reference
postures. The method’s effectiveness on art-historical images was validated
through a two-stage approach of broad-scale filtering followed by a detailed
examination of individual postures: an aggregate-level analysis of
metadata-induced hotspots, and an individual-level analysis of topic-centered
query postures. As a case study, we examined depictions of the crucified, which
often adhere to a canonical form with little variation over time — making it an
ideal subject for testing the validity of Deep Learning (DL)-based methods.
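To give a flavor of the retrieval step, a minimal sketch: represent each posture as a vector of joint coordinates and rank a support set of reference postures by Euclidean distance to a query. The keypoint vectors below are invented, and the sketch omits the paper's central contribution of resolving 2D/3D projection ambiguity.

```python
import math

def rank_postures(query, references):
    """Rank reference postures by Euclidean distance between their
    keypoint vectors and the query posture (closest first)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return sorted(references, key=lambda name: dist(query, references[name]))

# Invented 4-value keypoint vectors (two joints' x/y coordinates):
refs = {"orant": [0.1, 0.9, 0.9, 0.9],      # arms raised
        "crucified": [0.0, 0.5, 1.0, 0.5]}  # arms outstretched
print(rank_postures([0.05, 0.55, 0.95, 0.5], refs))  # ['crucified', 'orant']
```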
A Critical Collection History of Nineteenth-century
Women’s Letters: Overcoming the Occluded Archive with Data-Driven
Methods
Ilona Pikkanen, The Finnish Literature Society; Matti La Mela, Uppsala University;
Hanna-Leena Paloposki, Independent Scholar; Jouni Tuominen, University of Helsinki
and Aalto University
Abstract
[en]
This paper presents a “virtual archive” of women’s epistolary exchange in
19th-century Finland. By harmonising metadata from over 1.2 million letters and over
100,000 correspondents across key cultural heritage organisations and leveraging
linked open data, we gain an unprecedented view of 19th-century epistolary
communication and 20th-century archival practices. Using quantitative analysis,
enriched metadata, and network visualisations, we explore the gendered nature of
these collections. Are women archival protagonists, or are their materials embedded
within the collections of male relatives? Do the data reveal overlooked women with
extensive archival networks absent from historical narratives? We introduce the
framework of “critical collection history,” which brings together theoretical debates
and research interests from critical archival studies and digital history and
combines them with contemporary digital methods. This approach underscores the
necessity for scholars using data-driven methods in historical research to critically
engage with digitised archives. Moreover, critical collection history highlights how
“big cultural heritage metadata” can expose archival biases and enhance our
understanding of source limitations – biases that digital scholarship may
unintentionally perpetuate.
Making Sense of the Emergence of Manslaughter in British Criminal Justice
Tim Hitchcock, Professor Emeritus of Digital History, University of Sussex; William
J. Turkel, Professor of History, The University of Western Ontario
Abstract
[en]
Manslaughter emerged as a new and distinct category of crime amongst those tried at
the Old Bailey in London in the first half of the nineteenth century. From being a
rare charge in 1800, manslaughter came to represent over 60% of all trials for ‘killing’
by the 1850s. This article describes the methodologies used by the authors to explore
this phenomenon via trials included in the Old Bailey Online. It details the use of
unsupervised clustering, embeddings and relevance measures to map the changing language
associated with the charge of manslaughter; and more importantly, describes the application
of top-down ‘sense making’ methodologies to the resulting analysis. Along the way
it argues for the importance of including qualitative judgements by subject specialists
in the process of developing quantitative analyses. Inter alia, it suggests that the
rise of manslaughter was the result of a complex set of forces including changing
statute law, the rise of a professional police, changes in the administration of coroners’
courts, and a growing public intolerance of violence.
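The headline trend above rests on a simple descriptive statistic: the share of trials for “killing” in which the charge was manslaughter, computed per period. A sketch with invented counts, not Old Bailey Online data:

```python
def manslaughter_share(counts):
    """Given {decade: (manslaughter_trials, all_killing_trials)},
    return the manslaughter share of 'killing' trials per decade."""
    return {decade: m / total for decade, (m, total) in counts.items()}

# Invented counts, illustrating the reported shift from a rare charge
# in 1800 to over 60% of 'killing' trials by the 1850s:
counts = {1800: (2, 40), 1850: (31, 50)}
print(manslaughter_share(counts))  # {1800: 0.05, 1850: 0.62}
```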
Image Reuse in Eighteenth-Century Book History: Large-Scale Data-Driven Study of Headpiece
Ornament Variants
Ruilin Wang, University of Helsinki; Enes Yılandiloğlu, University of Helsinki; Mikko
Tolonen, University of Helsinki; Lidia Pivovarova, University of Helsinki; Yann Ryan,
Leiden University
Abstract
[en]
This study uses large-scale computational analysis to trace the reuse of decorative
headpieces in eighteenth-century books. The results highlight how image variants reveal
complex networks of printers and publishers beyond simple one-to-one ownership.