DHQ: Digital Humanities Quarterly
Editorial

Paragraphs: A simple series of paragraphs.

The language under discussion here is called Comic Book Markup Language in part to highlight the book-ness and bookishness of these documents, their material properties and bibliographic characteristics. Graphic narratives typically manifest as “books,” stapled or otherwise bound leaves, perhaps thirty-six pages, with an interesting and complex structure, incorporating the graphic narrative — the sequential art and text or “comics” content — alongside a rich assortment of paratexts: advertisements, fan mail, and so on. The emphasis on both “comics” and “books” in the title of the language signals an awareness of the full range of content in the material artifact and the integration of comics content with related paratextual content. The material properties of the book — the codex form, the leaves and pages, the physical properties of the paper — are inseparable from the structure, pacing, and design of the narrative. Certainly less page-bound organizational and compositional frameworks are possible, such as the newspaper strips with long-running narrative arcs, in which the daily “strip” of three or four panels is the basic structural unit.
The traditional grouping of panels into deliberately composed groups (often corresponding to the physical page or the “strip” of a newspaper daily) is being challenged by changing publishing and reading technologies. iPhones and other smartphones have become popular devices for reading newly published comics as well as “reprints” of older comics. However, a full-page grid of panels is not easily readable on the smaller screen of the typical smartphone, so the software interfaces on such devices focus on a single panel at a time. The deliberate juxtaposition of graphic and textual elements in the original composition is shattered by the interface requirements and limitations of the reading device. As new comics work is increasingly targeted at such digital platforms, the traditional grouping of panels into compositional units resembling pages may be abandoned for new compositional strategies. Larger format devices like the iPad and other tablets are better able to represent full-page compositions of panels while also allowing zooming in to focus on individual panels. The isolation of a panel from its surrounding context is not easily achieved in print media. While the reader's eyes and attention may focus on a single panel at a time, other panels on the page remain in the reader's field of vision. The migration of comics content to digital reading devices, and the structural and aesthetic implications of that migration, call attention to the impact of the material characteristics of the comic book document.
Comic books are often very formally self-conscious documents and express a fascination with their own bibliographic identities: creators (comic book writers and artists) become characters in the narrative; editorial notes refer to episodes from prior issues; comic books and paratextual elements, such as advertisements, are parodied within the comic book narrative; and publication milestones (such as the first, fiftieth, or one hundredth issue) are highlighted and celebrated. These literary, rhetorical, and commercial moves point to a self-awareness of the comic book as document and bibliographic object.
CBML is intended primarily for representing, modeling, and analyzing twentieth- and twenty-first-century comic books, daily comic strips, longer narratives or “graphic novels,” and Web comics and other comics content published on digital platforms, such as smartphones and tablets. CBML may also serve as a possible solution for encoding certain documents we might not normally characterize as comics or comic books, but which share many formal characteristics with comics. In his influential Understanding Comics, Scott McCloud's definition of comics encompasses Hogarth's narrative picture series, the Bayeux Tapestry, and pre-Columbian picture writing as found in the Codex Zouche-Nuttall [McCloud 1993, 10–17].

Block quotes

This study provides an introduction and rationale for the development of Comic Book Markup Language, or CBML, an XML[1] vocabulary for encoding multiform documents that are variously called comics, comic books, and “graphic novels” [2] as well as other documents that integrate comics content[3] or that share formal features with comics content. A markup language is a set of machine-readable textual codes, or “tags,” that are used to identify structure, semantics, and other features of documents and data.[4] The application of these codes to a document is typically a necessary stage, often the most crucial and informative stage, of editing, analyzing, indexing, publishing, visualizing, and otherwise studying or manipulating texts in digital environments. The act of encoding a document is a form of discovery, or prospecting, in which the encoder maps a document's structure, identifies semantic elements of interest, and documents relationships internal and external to the document. Scholarly encoding is a form of both reading and writing. The reading, shaped by the constraints of a markup language, is inscribed upon and embedded within the digital text. As literary scholar and digital humanist Jerome McGann has noted, “When you mark up a text you are ipso facto reading and interpreting it. A … text marked up in TEI [a scholarly encoding language] has been subjected to a certain kind of interpretation” [McGann 2001, 143]. Sperberg-McQueen, Huitfeldt, and Renear assert that markup is constitutive of meaning, markup is interpretive, markup is performative, markup acknowledges or licenses inferences about the text:

Markup is inserted into textual material not at random, but to convey some meaning.

An author may supply markup as part of the act of composing a text; in this case the markup expresses the author’s intentions, e.g. as to the structure or appearance of the text. The author creates a section heading, for example, by creating an appropriate element in the document; the content of that element is a section heading because the author says so, and the markup is simply the method by which the author says so. The markup, that is, has performative significance.

In other cases, markup is supplied as part of the transcription in electronic form of pre-existing material. In such cases, markup reflects the understanding of the text held by the transcriber; we say that the markup expresses a claim about the text. The transcriber identifies a section heading in the pre-existing text by transcribing it and tagging it as a section heading; the content of that element is a section heading if the transcriber’s interpretation is correct, but other interpreters might disagree; it is plausible to imagine discussions over whether a given way of marking up a text is correct or incorrect.[5]

In the one case, markup is constitutive of the meaning; in the other, it is interpretive. In each case, the reader may legitimately use the markup to make inferences about the structure and properties of the text. For this reason, we say that markup licenses certain inferences about the text. [Sperberg-McQueen, Huitfeldt, and Renear 2000, 11]

Julia Flanders' discussion of scholarly text encoding privileges the role of the researcher/encoder and likens the encoded text to the scholarly article:

Perhaps we need to look to the pleasure of mutability. To recuperate XML, politically and aesthetically, we should be looking not to the paradigms of XML usage that arise from librarianship and from industry-level ideas of the separability of form and content, but rather to paradigms of performance of a different kind. By shifting our view we can understand XML as a way of expressing perspectival understandings of the text: not as a way of capturing what is timeless and essential, but as a way of inscribing our own changeable will on the text — in other words, as a form of reading. Seen this way, XML's presentational flexibility derives not from a separation of presentation and content, but rather from the shifting vantage points from which the text appears to us, the shifting relationships that constrain our understanding of it, the adaptability and strategic positioning of our own readerly motivations.

Ironically, this is a view which emerges most clearly at the margins of current digital text practice. It is not visible in the large digital library projects, whose workflow has come to resemble an industrial operation complete with offshore outsourcing, detailed division of labor, reliance on automation and robotics, and an emphasis, in the output, on uniformity and quantity (thankfully planned obsolescence has not yet become part of the strategy). But we can find it in the small projects designed by individual faculty, typically in conjunction with their teaching, to create digital versions of individual texts which serve as readings: often idiosyncratic, unscalable, representing private insight. They function more like an article than an archive, as a local, contingent expression of insight. [Flanders 2005, 60–61]

The development of a markup language that can support such scholarly reading and interpretation necessitates a careful study and analysis of the content, structure, and semantics of the class or classes of documents for which the language is designed — in this case, the comic book.
CBML is based on the Text Encoding Initiative P5: Guidelines for Electronic Text Encoding and Interchange. The TEI Guidelines are a mature conceptual model for digital representation of multitudinous and disparate document types: inscriptions and papyri; illuminated manuscripts; authorial holograph manuscripts; correspondence; printed books of prose, verse, and drama; critical and scholarly editions; born-digital documents; and more. The TEI Guidelines

make recommendations about suitable ways of representing those features of textual resources which need to be identified explicitly in order to facilitate processing by computer programs. In particular, they specify a set of markers (or tags) which may be inserted in the electronic representation of the text, in order to mark the text structure and other features of interest. [TEI 2010c]

The TEI Guidelines are widely used in the digital humanities and academic library communities and are maintained by the TEI Consortium, an international body. The Guidelines are organized into modules, including modules for general categories of documents, such as prose, drama, verse, and dictionaries.[6] The Guidelines also provide additional modules that address more specific textual features and metadata requirements, such as names and dates, manuscript description, linking, textual criticism, and so on. And in their most recent incarnation, the Guidelines provide elements and attributes for linking transcriptions to facsimile page images. This latter feature is especially useful for encoding comics and other graphics-intensive works. From these many available modules, one selects a subset that meets the needs of a particular document, project, collection, or analytical approach. The TEI Guidelines are extremely flexible, providing a vocabulary and mechanisms for encoding and describing a rich diversity of document types. However, recognizing that not every document type and representational requirement may be anticipated, the Guidelines provide a well-documented system for customizing and extending the provided tag set with new and modified elements and attributes.[7] TEI, as delivered by the TEI Consortium, is remarkably well-suited to encoding many aspects of comic books; nevertheless, conceptual clarity and practical benefits may be gained from some modest modifications and additions to the stock TEI Guidelines. Hence CBML, a TEI customization with elements and attributes for encoding many of the structures and features found in comic book documents.
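
To make the idea of a TEI customization more concrete, the following is a minimal sketch, in Python, of how a program might read a comics page encoded with added elements. The fragment is invented for illustration: the element names `cbml:panel`, `cbml:caption`, and `cbml:balloon`, the attribute values, and the CBML namespace URI are assumptions made for this sketch, not a statement of the actual CBML schema. Only the TEI namespace URI is the standard one.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the customization (hypothetical for this sketch).
CBML_NS = "http://www.cbml.org/ns/1.0"

# Invented sample: one page with two panels, using hypothetical CBML elements
# inside a standard TEI <div>.
SAMPLE = """\
<div xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:cbml="http://www.cbml.org/ns/1.0" type="page" n="1">
  <cbml:panel n="1">
    <cbml:caption>Meanwhile, across town.</cbml:caption>
    <cbml:balloon type="speech" who="#hero">We have to hurry!</cbml:balloon>
  </cbml:panel>
  <cbml:panel n="2">
    <cbml:balloon type="thought" who="#hero">Too late?</cbml:balloon>
  </cbml:panel>
</div>
"""


def balloons_by_panel(xml_text):
    """Map each panel number to a list of (balloon type, balloon text) pairs."""
    root = ET.fromstring(xml_text)
    ns = {"cbml": CBML_NS}
    result = {}
    for panel in root.findall("cbml:panel", ns):
        result[panel.get("n")] = [
            (balloon.get("type"), "".join(balloon.itertext()).strip())
            for balloon in panel.findall("cbml:balloon", ns)
        ]
    return result


print(balloons_by_panel(SAMPLE))
```

Because the added elements live in their own namespace, a generic TEI processor can ignore them while a comics-aware one can query panels and balloons directly, which is one practical benefit of extending the stock tag set rather than overloading existing elements.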

Tables

Tables with lists in cells:
Ten page paper
  • Clear thesis, correct citations, good use of primary and secondary sources and examples, good mechanics, well-organized.
Based on a travel narrative and course themes
  • Extensive and creative engagement with a narrative.
  • Citation and discussion of assigned readings and course themes.
Focused on geography
  • A thesis and analysis that reflects the traveler’s movement through a particular human and physical terrain at a particular time.
Table 1. 
153 persistent features in Male-authored documents: 1, a, abord, action, affaire, ajouta, amie, article, au, aura, auteur, autour, autre, aux, avons, bas, bouche, bras, c, capitaine, cent, chacun, chair, champ, charles, chez, christ, ciel, cinq, comment, comtesse, contre, corps, coup, coups, crime, côté, d', des, deux, diable, dis, docteur, doigts, dont, doute, droite, du, entre, est, face, fait, façon, femme, feu, fin, fit, fois, foule, gens, gros, haut, histoire, homme, hé, hôtel, ils, in, jacques, jean, juge, jusqu', la, laquelle, le, les, leurs, ligne, long, lorsque, main, mains, maîtresse, messieurs, mis, mit, moins, monseigneur, monsieur, montre, mot, même, nez, nom, nombre, nos, oeil, oeuvres, ordre, oreille, ou, oui, où, par, passage, pied, pieds, présente, président, prêtre, quatre, quelqu', quelque, quelques, question, qui, quoi, reprit, reste, rue, récit, saint, saints, salut, sang, second, seconde, selon, ses, seulement, simple, sire, soit, sous, sur, table, tirer, tour, toute, trente, trois, un, v, ventre, vers, vieux, village, vin, vingt, voici, y, yeux, à
192 persistent features in Female-authored documents: absence, admiration, afin, agréable, ai, aimable, aime, aimer, aller, amitié, amour, anglais, angleterre, auguste, auprès, aurais, avais, avait, avec, avez, avoir, beaucoup, belle, bien, bonheur, bonne, brillante, but, cacher, car, caractère, celle, chagrin, chercher, chère, coeur, comprendre, compte, comte, confiance, conserver, cour, crois, destinée, disant, donner, douceur, douleur, doux, elle, elles, empêcher, encore, enfance, enfant, enfants, entièrement, envie, esprit, espérance, estime, eût, faisait, fallait, faut, fièvre, fleurs, france, frère, fût, gloire, goût, grande, grandes, généreux, henri, hiver, ici, il, imagination, impossible, inquiétude, inspire, inspirer, instant, intérêt, jamais, jardin, jours, liberté, lui, lumières, m, ma, mais, malgré, manière, manières, me, moi, mon, montrer, mère, ne, ni, nécessaire, opinion, parce, parler, parlez, passion, pauvre, pays, personne, personnes, petite, peut, peuvent, plaire, plaisir, pleurs, plusieurs, possible, pourquoi, pourrais, pouvait, prince, princes, princesse, pu, puisque, puissance, père, quand, que, quitter, regarder, reine, repos, retrouver, revenir, roi, sais, sait, sans, savoir, secret, sentiment, sentir, seule, si, son, souffrir, souvenir, souvent, soyez, suis, supporter, surprise, tant, toi, toujours, tous, toutes, trop, trouva, trouver, très, tu, utile, veux, vie, vit, vivre, voir, vois, vos, votre, voulait, voulut, vous, voyage, voyant, véritable, âme, éducation, égard, égards, émotion, épouser, était, êtes
Table 2. 
Features appearing in the top 500 highest-weighted in both time range models
Within the male and female lists, it is possible to identify a number of interesting semantic groupings of words. Reassuringly, the female pronouns and negative polarity items and male quantifiers discussed earlier are still present. In addition, there are a number of other semantic categories of words that appear to cohere:
Enduring Male Terms
  • Quantifiers: quelqu', quelque(s)
  • Religiosity: christ, ciel, corps, diable, saint(s), saints, sang(?)
  • Numericality: 1, cinq, cent, deux, nombre, quatre, second(e), trois, trente, un, vingt
  • Anatomy: bouche, bras, chair, corps, doigts, face(?), main, nez, pied(s), oeil, oreille, sang, yeux, ventre
  • Authority: capitaine, docteur, juge, président, sire
  • Other notables: action, amie, femme, feu, histoire, homme, maîtresse, rue, salut, vieux, village, vin
Enduring Female Terms
  • Pronouns: me, moi, mon, vos, votre, vous
  • Spirituality: âme, chercher, coeur, destinée, espérance, esprit, imagination, inspire, inspirer, passion
  • Quantifiers: tous, toutes, (toujours)
  • Emotion: agréable, aimable, aime, aimer, amitié, amour, bonheur, douceur, douleur, doux, émotion, envie, espérance, plaire, plaisir, pleurs, sentiment, sentir, seule
  • Family: enfant(s), épouser, frère, mère, père
  • Nobility: prince(s), princesse, reine, roi
  • Negatives: impossible, ne, ni, pas, personne, sans
  • Other notables: éducation, impossible, inquiétude, gloire, liberté, lumières, opinion, pauvre, possible, puissance, quitter, sais, sait, savoir, secret, seule, souffrir, souvenir, supporter, surprise, vivre, voyage, voyant, voulait, voulut
Table 3. 
Subjective thematic groups among the persistent features
Airships
Explosions
Aircraft Accidents
Hindenburg (Airship)
1934-1956 (approx.)
Table 4. 
Professional Metadata for Figure 9. [A simple table]
| Folksonomic Metadata | Score (Voorbij and Kipp Scale) | Notes |
| Hindenburg (Airship) | 1 | exact match to “Hindenburg (Airship)” |
| Hindenburg | 2 | synonym for “Hindenburg (Airship)” |
| Accidents | 3 | broader term of “Aircraft accidents” |
| Zeppelin | 4 | narrower term of “Airships” |
| Flames | 5 | Present in photograph; related to “Explosions” |
| Painting | 6 | this is a photograph |
| omgreadlater | 7 | junk tag |
Table 5. 
Folksonomic Metadata and Scores from Voorbij and Kipp Scale for Figure 9.
| Folksonomic Metadata | Score (Voorbij and Kipp Scale) | Notes |
| Hindenburg (Airship) | 1 | exact match to “Hindenburg (Airship)” |
| Hindenburg | 2 | synonym for “Hindenburg (Airship)” |
| Accidents | 3 | broader term of “Aircraft accidents” |
| Zeppelin | 4 | narrower term of “Airships” |
| Flames | 5 | Present in photograph; related to “Explosions” |
| Painting | 6 | this is a photograph |
| omgreadlater | 7 | junk tag |
Table 6. 
Folksonomic Metadata and Scores from Voorbij and Kipp Scale for Figure 9. [with control over column widths]
This table illustrates column spanning:
| | Suffix Trie | Suffix Tree | DAWG | CDAWG | SCDAWG |
| abcbc$abcab$: 12 Bytes; 12 symbols |
| Nodes | 66 | 17 | 16 | 6 | 6 |
| Edges | 65 | 16 | 23 | 13 | 21 |
| OCR page: 7 KB; 6.945 symbols |
| Nodes | 21.418.626 | 7.355 | 11.995 | 1.163 | 1.163 |
| Edges | 21.418.625 | 7.354 | 14.915 | 4.083 | 8.094 |
| Excerpt EU-Corpus: 106 KB; ca. 106.000 symbols |
| Nodes | - | 161.001 | 165.962 | 25.273 | 25.273 |
| Edges | - | 161.000 | 227.515 | 86.826 | 170.358 |
| Small Corpus: 1.2 MB; ca. 1.250.000 symbols |
| Nodes | - | 1.921.704 | 1.922.811 | 366.070 | 366.070 |
| Edges | - | 1.921.703 | 2.730.597 | 1.173.856 | 2.355.669 |
Table 7. 
Comparing the size of distinct index structures for four input texts.
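
The first row of the table can be checked by hand or with a few lines of code. The sketch below builds an uncompressed suffix trie as nested dictionaries and counts its nodes and edges. It assumes the 12-symbol input in the first row is the string `abcbc$abcab$`, with `$` as a separator symbol; the function name is illustrative and is not taken from the study the table comes from.

```python
from typing import Tuple


def suffix_trie_size(text: str) -> Tuple[int, int]:
    """Return (nodes, edges) of the uncompressed suffix trie of `text`."""
    root: dict = {}
    nodes = 1  # count the root
    for start in range(len(text)):
        node = root
        # Insert the suffix text[start:] one symbol at a time.
        for symbol in text[start:]:
            if symbol not in node:
                node[symbol] = {}
                nodes += 1
            node = node[symbol]
    # A trie is a tree, so every node except the root has one incoming edge.
    return nodes, nodes - 1


print(suffix_trie_size("abcbc$abcab$"))  # (66, 65)
```

Each non-root node corresponds to one distinct substring of the input, which is why the suffix trie grows so quickly and why its column is left empty for the two larger inputs.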
This table illustrates row spanning and text alignment:
| | Suffix Trie | Suffix Tree | DAWG | CDAWG | SCDAWG |
| abcbc$abcab$: 12 Bytes; 12 symbols |
| Right-aligned | 66 | 17 | 16 | 6 | 6 |
| | 65 | 16 | 23 | 13 | 21 |
| OCR page: 7 KB; 6.945 symbols |
| Left-aligned | 21.418.626 | 7.355 | 11.995 | 1.163 | 1.163 |
| | 21.418.625 | 7.354 | 14.915 | 4.083 | 8.094 |
| Excerpt EU-Corpus: 106 KB; ca. 106.000 symbols |
| Center-aligned | - | 161.001 | 165.962 | 25.273 | 25.273 |
| | - | 161.000 | 227.515 | 86.826 | 170.358 |
| Small Corpus: 1.2 MB; ca. 1.250.000 symbols |
| Nodes (bottom) | - | 1.921.704 | 1.922.811 | 366.070 | 366.070 |
| | - | 1.921.703 | 2.730.597 | 1.173.856 | 2.355.669 |
Table 8. 
Comparing the size of distinct index structures for four input texts.

Lists

Unordered Lists

1.4 Do you work in teams to undertake your research?
  • Yes
  • No
1.5 How often do you work in teams to undertake your research?
  • I usually research with a team
  • I sometimes research with a team
  • I occasionally research with a team
  • I rarely research with a team
1.6 The teams that I research with consist of (Check all that apply)
  • Designers
  • Colleagues in my discipline
  • Colleagues from other disciplines
  • Software developers
  • Content specialists
  • Librarians
  • Computer Scientists
  • Students
  • Other (please list)

Ordered Lists

  1. Our first recommendation would be to secure all promised support in writing. Even though a current administrator is well disposed toward your innovation and understands the commitment it requires of all those involved, administrators change. A written commitment will ensure that new incumbents in any office understand that a commitment was made, by their office and not just by the previous occupant, to your ideas, your personnel, and the results you seek to achieve.
  2. Develop and articulate terms of evaluation that will enable your administration to see and measure your success. This can result in positive impacts on individual applications for renewal, tenure and promotion, as well as on the innovation itself. It will also provide you with material to convince new students of the value of adding your classes to their timetables.
  3. Make explicit your intention to let your innovation wither on the vine if your administration is not willing to commit, in writing, to its continuation beyond an initial trial period (probably that of an initial grant). This way, you do not fail to at least try to innovate, but you also do not commit yourself or your institutional “home” to support that neither you nor they can afford in the absence of administrative responsibility.
  4. Those who would get involved in innovation must know explicitly how their involvement will count toward renewal, tenure and promotion. A good model might be one in which key innovators have a written statement from whoever has the authority to offer it indicating that involvement in the innovation will count, for example, as the equivalent of a peer reviewed publication.
  5. We found in developing and implementing the multi-disciplinary course we placed at the core of the HHC experience that through administrative and financial imagination it is possible to overcome seemingly insurmountable professional and financial problems. We were able to implement our core course for less than the cost of a single replacement hire on a contractually limited term basis, despite the fact that we had seven university employees from six different campus units involved.
  6. Technical support and information literacy are now integral parts of a meaningful humanities experience. By fully integrating technical support and information literacy into our courses, we have been able to offer students marketable computer skills and to instill in them a healthier and more informed attitude to new forms of media production and consumption.
  7. Students should be able to see that real scholarship is available to them to produce, and they can become participants in a scholarly field through the production of well-composed and thoroughly researched work. AhHa!, our data management system, made it easy for us to track students’ progress through the “program,” and to let them see how they compared to their peers. The perennially asked “but what do you want?” questions evaporated when students could see how others had addressed themselves to assignments, and in fact the quality of student work improved as subsequent classes could see what had been done before. Students were no longer working in the artificial vacuum of individualistic student-scholarship, where “real” scholarship is defined as that produced by unknown names followed by the three mysterious letters, PhD. Instead, students were able to see that real scholarship was available to them to produce as much as to read, and that they could become participants in a scholarly field not through some unexplained and occult rite that their professors had undergone at some specific place definable only as “not here,” but through the production of well-composed and thoroughly researched work such as that of which they were demonstrably capable.
  1. We represent textual variation with the <app> tag and indicate the sequence of the author’s corrections with the varseq attribute;
  2. We contain the textual variation within the element <seg type="l">, representing the reconstruction of the verse unit;
  3. We avoid the redundancy of empty elements by specifying which unit undergoes textual variation.
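
The first two conventions in the list above can be illustrated with a short Python sketch. The verse and its two readings are invented for this example, the fragment omits the TEI namespace for brevity, and the function name is hypothetical. TEI P5 spells the attribute `varSeq` and places it on `<rdg>`; the sketch follows that spelling and assumes every `<rdg>` carries it.

```python
import xml.etree.ElementTree as ET

# Invented verse line: one verse unit containing one point of authorial variation.
VERSE = (
    '<seg type="l">The '
    '<app><rdg varSeq="1">quiet</rdg><rdg varSeq="2">silent</rdg></app>'
    ' river runs</seg>'
)


def reading_at_stage(seg_xml, stage):
    """Reconstruct the verse as it stood after correction number `stage`."""
    seg = ET.fromstring(seg_xml)
    parts = [seg.text or ""]
    for child in seg:
        if child.tag == "app":
            # Keep the latest reading made at or before the requested stage.
            candidates = [
                rdg for rdg in child.findall("rdg")
                if int(rdg.get("varSeq")) <= stage
            ]
            if candidates:
                chosen = max(candidates, key=lambda rdg: int(rdg.get("varSeq")))
                parts.append("".join(chosen.itertext()))
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts)


print(reading_at_stage(VERSE, 1))  # The quiet river runs
print(reading_at_stage(VERSE, 2))  # The silent river runs
```

Keeping the variation inside the verse-level `<seg>` means the whole line can be rebuilt for any stage of correction without repeating the unchanged words in each reading.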
  1. Who are your collaborators?
    1. What community is your research accountable to beyond your academic community?
    2. How will you demonstrate your desire to be accountable to them?
    3. Are there people you can talk to about the impact of your research beyond the IRB?
  2. How does everyone benefit from the research?
    1. What questions does the community want answered?
    2. Can people be compensated in ways that honor their time and skills?
Create
  1. What tools and/or methods encourage multidirectional collaboration?
    1. What mechanism of accountability can you create?
    2. Are there ways that collaborators can use the research process to their own ends?
  2. What kind of process can you create for your research?
    1. Is there room for collaborators to give and rescind consent at different times during the research process?
    2. Does the pace of the project meet your needs and your collaborators' needs?
Transform
  1. How will you take care of yourself in the research process?
    1. What do you and your collaborators need to stay sustained while conducting the research?
    2. What happens after the research product is complete?
  2. How will you be transformed?
    1. Will the research strengthen your connection to your collaborators?
    2. Did you and your collaborators come to new understandings?

Gloss Lists

Apocalyptic: A term that is used almost as loosely within biblical scholarship as outside it; etymologically the word is derived from the Greek for “revelation” or “unveiling” and some scholars would wish to restrict its use to a type of literature in which heavenly secrets are revealed to the seer by a vision, a heavenly journey, or the words of an angel (or some combination of all three). All too often, however, the term is used as if it meant eschatological, and for want of a better term it is often used to describe a particular type of eschatology in which ordinary human history is expected to be interrupted by a catastrophic divine intervention in the near future (the reason being that such eschatology is frequently expressed through the medium of apocalyptic in the first sense).

Byte Code: The name often given to the intermediate product of compilation; a byte code file can typically be run on an interpreter or virtual machine rather than directly by a computer’s operating system.

Code: Without further qualification this normally refers to “Source Code”, the set of instructions the author/programmer writes in whichever programming language he or she is writing to tell his or her program (in this case, work of IF) how to behave.

Compiler: A piece of software to translate human readable source code into machine readable form. Compilation may be to a native executable (a file that can be executed by a particular operating system without further ado) or, more usually with Interactive Fiction, to some intermediate (byte code) form that can be run by different interpreters on different operating systems.

Containment Hierarchy: The system of containment that determines which objects are within which other objects. In this context “within” includes inside, on top of, underneath or behind. At the top of the containment hierarchy stand the rooms; rooms are regarded as not being contained by anything but as directly or indirectly containing everything else (except for objects temporarily out of play and so outside the map altogether).

Debugger: A tool which helps a programmer (or in this case, game author) track down programming errors or “bugs”. For example TADS 3 Workbench for Windows incorporates a highly sophisticated debugger that allows the programmer to step through his code line by line to see what it is doing each step of the way (and so determine where something is going wrong if it is going wrong).

Domain-Specific Language: A computer programming language (such as Inform or TADS) specifically tailored to a particular type of task (such as writing Interactive Fiction).

Eschatology: Anything that deals with the last things, the end of the age or the consummation of history (from the Greek eschatos, meaning “last”). In Christian theology eschatology traditionally deals with matters such as Heaven, Hell, the Second Coming and the Last Judgement; in a first-century context it is better thought of in terms of the coming of the kingdom of God, the age to come in which the evils of the present age would be overcome.

IDE (Integrated Development Environment): A single piece of software containing all (or at least most) of the tools a game author needs in order to write a game, including an editor, a compiler, a debugger and an interpreter on which the game may be tested.

Inform: The most popular language/system for authoring Interactive Fiction, written and maintained by Graham Nelson. Two versions of Inform are currently in use. Inform 6 (the older version) resembles a conventional programming language. Inform 7 adopts a more “natural language” style of programming with the aim of making the process of writing Interactive Fiction more like writing prose and less like conventional programming.

Implicit Action: An action automatically carried out by the parser in order to facilitate the command the player actually typed; for example if the player types GO THROUGH RED DOOR when the red door is locked but the player has the key, a well-behaved parser would carry out implicit UNLOCK RED DOOR and OPEN RED DOOR actions rather than forcing the player to type these commands explicitly.

Inventory: The items currently carried by the Player Character.

Interpreter: A piece of software that (typically) implements a Virtual Machine and so allows a game compiled to byte code to be run on the player’s computer; one generally needs an interpreter to run a work of Interactive Fiction in the same way that one would need a copy of MS-Word to read a Word file, or a program such as Windows Media Player to watch a DVD on a computer. The advantage is that an interpreter needs to be written only once for each operating system (Windows, Mac OS, Linux, etc.) and can then play all the games written for that interpreter.

Library: A set of supporting routines supplied with an IF authoring system such as TADS or Inform that handles most of the tasks common to most works of Interactive Fiction; a library will typically implement a parser and at least a basic world model, and will generally be written in the same language as that supplied with the authoring system to allow ready customization by game authors.

Map: The totality of rooms within a particular work of Interactive Fiction, together with the interconnections between them.

Menu: A set of options presented to a computer user from which he or she is meant to select one.

Non Player Character (or NPC): Any animate character (human, animal, alien or robotic) that appears in a work of Interactive Fiction apart from the Player Character; characters whose actions are not controlled by the Player Character. Usually, the term is restricted to characters who are actually implemented and can be interacted with rather than characters who are only mentioned or who never remain on stage long enough for the Player Character to interact with.

Object: Something physical that the Player Character can interact with (or at least examine) in the course of the game, such as a table, chair, pen, or book (or, indeed, another character). Some objects may be quasi-physical (such as smoke or sunlight). In many IF authoring systems based on OOP (object-oriented programming) design, “object” may also refer to a programming object; in such systems a game object (in the first sense) will usually be represented by a programming object (the second sense), but programming objects may also be used for other purposes (if you don’t know anything about OOP, this second sense of “object” is probably irrelevant to you).

Parser: That part of the software in a work of Interactive Fiction that interprets the player’s command and translates it into an action that the program can perform (or else complains if translation is impossible or ambiguous).

Player: The flesh-and-blood human being typing the commands at a keyboard and reading the game’s responses on screen.

Player Character (or PC): The protagonist of a work of Interactive Fiction, whose actions are controlled (or at least guided) by the player, and through whose eyes and ears the fictional world is described to the player.

Room: A (normally discrete) unit of space within the game’s map. The usual convention is that the Player Character can interact with anything that is in the same room as himself or herself (unless it is deliberately concealed), and so a room is effectively that volume of space that is immediately present to the Player Character. A room may be a room in the ordinary sense of a room within a building, but it may also be a section of space outdoors, such as a meadow, the top of a hill, or a section of street.

Scope: Roughly speaking, the set of objects with which the Player Character can currently interact, given the nature of the action he or she is meant to be carrying out. Normally this will be restricted to the objects the PC can see or touch, but this is not always the case. If the action is conversational, e.g. ASK ACTOR ABOUT TOPIC, then any object (or topic) is potentially in scope. If the author has implemented a GO TO ROOM command to take the Player Character to anywhere on the map, then any (known) room will be in scope.

Spoiler: A piece of information (typically in a review) which reveals too much to someone who has yet to play the game, and so spoils the experience of playing it for the first time.

TADS: An acronym for Text Adventure Development System, one of the two major languages/systems in which amateur Interactive Fiction is currently written. TADS 2 is still in use, but the latest incarnation (in which “All Hope Abandon” was written) is TADS 3. TADS was written by and is maintained by Mike Roberts.

Virtual Machine (or VM): Technically, a computing device implemented in software rather than as a piece of hardware (hence “virtual”); a piece of software that can execute byte code written for it, usually implemented as an interpreter for Interactive Fiction.

World Model: That part of a work of IF that defines the behaviour of the story-world in which the action takes place (such as movement around the map, the containment hierarchy, and some basic physics); typically, much of a game’s world model is supplied by the library of the authoring system used.

Z-Machine: The Virtual Machine for which the Infocom games were written, and to which games written in Inform compile.
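To make the glossary entries for Parser, Room, Map, and Scope more concrete, here is a minimal Python sketch of how they might fit together. It is a toy illustration only and does not reflect how the TADS or Inform libraries are actually implemented; the names `Room`, `VERBS`, `in_scope`, and `parse` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    """A discrete unit of space on the map (hypothetical structure)."""
    name: str
    exits: dict = field(default_factory=dict)     # direction -> name of adjoining room
    contents: set = field(default_factory=set)    # names of objects present in the room


# A tiny verb table; real libraries recognise far richer grammars.
VERBS = {"take": "TAKE", "get": "TAKE", "examine": "EXAMINE", "x": "EXAMINE", "go": "GO"}


def in_scope(room, inventory):
    """Objects the Player Character can currently interact with."""
    return room.contents | inventory


def parse(command, room, inventory):
    """Translate a typed command into an (action, argument) pair, or complain."""
    words = command.lower().split()
    if not words or words[0] not in VERBS:
        return ("ERROR", "I don't understand that verb.")
    action = VERBS[words[0]]
    noun = " ".join(words[1:])
    if action == "GO":
        if noun in room.exits:
            return ("GO", room.exits[noun])
        return ("ERROR", "You can't go that way.")
    if noun not in in_scope(room, inventory):
        return ("ERROR", "You see no such thing here.")
    return (action, noun)


hall = Room("Hall", exits={"north": "Library"}, contents={"lamp"})
print(parse("TAKE lamp", hall, set()))
print(parse("go north", hall, set()))
print(parse("take sword", hall, set()))
```

In this sketch the scope check is what separates an understood command from a performable one: "take sword" uses a known verb but fails because no sword is in the room or in the inventory.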

Hardware obsolescence: The original console or computing platform used to run the game may cease to be supported or even available in the aftermarket.
Software obsolescence: The original software needed to run the game — operating system, drivers, frameworks — may lose support, cease development, and become incapable of running on future hardware/software configurations.
Scarcity: Some video games are produced in limited quantities, and are subject to the dangers of media decay.[8]
3rd party software dependence: Once a game platform becomes obsolete, emulation may be the best method of providing access. Currently, however, most emulators are developed by the game community and are of questionable legality. They are also typically created without the benefit of the original specifications, and are themselves at risk of becoming obsolete.
Complex, proprietary code and an associated lack of documentation: Videogames are generally released as compiled binaries with no documentation of the compiling process, or even the programming languages used. Not having access to the source code or language specifications makes migrating or emulating software far more difficult. It’s a magnification of the format identification issue for stand-alone, static files.
Authenticity: The elephant in the digital preservation room — proving that a digital object is what it claims to be, free from tampering or corruption. Video games enjoy many versions between the first prototype, the official release (on multiple platforms), and cracked or otherwise altered unauthorized editions. Especially for older games, the only extant copy may exist in a fan-run web repository, making the authenticity impossible to establish.[9]
Intellectual property rights: The game development industry is highly creative and competitive, leading developers to be very jealous of their intellectual property. Most have instituted extremely draconian shrink-wrap licenses reflecting this. And yet, once a game is no longer actively marketable, they are unlikely to respond to inquiries about licensing for it.[10]
Significant properties: What are the significant properties of a game that must be maintained with each transformation/preservation action? How important are font size and color palette? What about the speed of text scrolling or sprite movement? How faithful must we stay to the original code? Is it enough to save a video of gameplay, or must we save the interactivity? Do we need BOTH? These are essential to define, as they play a major role in determining authenticity.[11]
Context: Although this isn’t an immediate threat to the preservation of games, building contextuality is important to creating understanding for future users. This is truer for video games than many other record types because, as technology advances, game players who have only been exposed to the latest and greatest may be apt to play an older game and say, “so what?” even though the game might have been revolutionary for its time.[12]
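One narrow, mechanical part of the authenticity problem in the list above is fixity: checking that a stored copy is bit-for-bit identical to a copy recorded earlier. The following Python sketch, which assumes a trusted reference checksum is available (often not the case for fan-archived games), shows the idea. The function names are hypothetical, and a matching checksum says nothing about provenance or about which version of a game the file represents.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reference(path, reference_hex):
    """True if the file's digest equals a previously recorded reference digest."""
    return sha256_of(path) == reference_hex.lower()
```

A repository could record `sha256_of(path)` at ingest and call `matches_reference` on a schedule to detect corruption or tampering after that point.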

Figures

Figure 1. 
The 10th Century Venetus A MS of Homer: U4 (Allen): Marcianus Graecus Z. 458 (= 841) - the back (verso) of folio 15 (available under a Creative Commons license from Harvard’s Center for Hellenic Studies: http://chs.harvard.edu/chs/manuscript_images). The knowledge-based OCR project recommended in this report would allow us to work with manuscripts as well as printed materials.
Figure 2. 
Detail of the Venetus A showing scholia and text
Figure 3. 
Two pages from The Fantastic Four #51. This example contains many of the familiar elements of comics: panels, pictures, narrative captions, word balloons, sound effects, motion lines, etc. [Lee and Kirby 1966a]

Code Examples

<eg>

This first example of CBML uses <eg> with internal character-level escaping of angle brackets:
<cbml:panel ana="#action-to-action" characters="#cap #anon_man" n="5" xml:id="eg_000"> 
    <cbml:caption>
        Cap acts quickly to tranquilize the gun-happy pedestrian... 
    </cbml:caption> 
    <cbml:balloon type="speech" who="#cap" xml:id="eg_007"> 
        A little <emph rendition="#b">sleep</emph> will do wonders for you!
    </cbml:balloon> 
    <sound> SPLAT! </sound> 
    <cbml:balloon type="speech" who="#anon_man"> Ugh! </cbml:balloon>
</cbml:panel> 
This second example also uses character-level escaping:
Rule 7 introduces the lookup strategy, by which particular information is obtained by reference to a special-purpose node. It directs a query such as f a c to the AEE node to convert the a to ē, so the perfective stem of facere becomes fēc. Here is that lookup node:
AEE: 
    1 <$letter#1 a $letter#2> == $letter#1 ē $letter#2      
                         
This node depends on a definition (not shown) that defines what atoms are in the category “letter.” Rule 1 says that any query beginning with a letter, then the atom a, then any letter, should evaluate to the two letters surrounding ē instead.
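The behaviour of Rule 1 can be imitated outside DATR. The Python sketch below is only an analogy, not DATR semantics: the `LETTERS` set stands in for the category definition that is not shown, and the function name `aee` simply mirrors the node name.

```python
# Assumed stand-in for the (unshown) definition of the category "letter".
LETTERS = set("abcdefghijklmnopqrstuvwxyz")


def aee(query):
    """Imitate rule 1 of the AEE node: <$letter#1 a $letter#2> == $letter#1 ē $letter#2.

    `query` is a list of atoms. If it begins with a letter, the atom "a", and
    another letter, return the two letters surrounding "ē"; otherwise return
    None to signal that the rule does not apply.
    """
    if (len(query) >= 3
            and query[0] in LETTERS
            and query[1] == "a"
            and query[2] in LETTERS):
        return [query[0], "ē", query[2]]
    return None


print("".join(aee(["f", "a", "c"])))
```

Applied to the query f a c, the sketch yields the perfective stem fēc described above.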
This example includes a CDATA marked section within <eg> to escape an entire XML code sample:
<journal vol="15" issue="4">
      <!-- Preview date: November 11, 2021
           Published date: February 2, 2022
      -->
      <title>2021</title>
      <list id="articles">
         <item id="000574"/>
         <!-- Chastang; OJS 1056 -->
         <item id="000566"/>
         <!-- Weidman and Pastor; OJS 1018 -->
         <item id="000578"/>
         <!-- Lee; OJS 1120; -->
         <item id="000579"/>
         <!-- Ketzan; OJS 1065; -->
         <item id="000581"/>
         <!-- Grieco; OJS 1169; -->
         <item id="000582"/>
         <!-- Figueras; OJS 1218 -->
         <item id="000597"/>
         <!-- Bonn; OJS 975 -->
      </list>
      <list id="reviews">
         <item id="000575"/>
         <!-- Barnett; OJS 1282; -->
      </list>


</journal>    
This example uses <eg> to achieve formatting control, without a CDATA section but with <hi> for formatting details such as bold:

働	く	べ	き	に	働	か	う	と	す	る
hatara	ku	be	ki	ni	hatarah	ko	u	to	su	ru
    
candidate rule #1: う → こう   
candidate rule #2: 働 → 働    
"其	れ	で	寧	ろ	小	黨*	分	立	で	行	く	所	ま	で	行	く	が*	よ*	い*"
so	re	de	mushi	ro	shou	tou*	bun	ritsu	de	i	ku	tokoro	ma	de	i	ku	ga*	yo*	i*
                     
黨 →  党 (global rule), がよい →  が良い (hiragana + * rule)
{'黨': [(6, 6)], 'がよい': [(17, 19)]}
                     
其   れ   で   寧  ろ  小  党*  分  立  で  行  く  所  ま  で  行  く  が*  良*  い*
so re de mushi ro shou  tou*   bun   ritsu de i  ku tokoro   ma de i  ku ga* yo* i*
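The span dictionary shown in this sample can be reproduced with a short Python sketch. It assumes that every token holds exactly one character plus an optional trailing `*` marker, and that each rule's replacement has the same number of characters as its source; the function names are hypothetical and this is not presented as the original authors' code.

```python
def find_rule_spans(tokens, rules):
    """Map each rule's source string to inclusive (start, end) token-index spans."""
    chars = [token.rstrip("*") for token in tokens]   # drop the * markers for matching
    spans = {}
    for source in rules:
        width = len(source)
        for start in range(len(chars) - width + 1):
            if "".join(chars[start:start + width]) == source:
                spans.setdefault(source, []).append((start, start + width - 1))
    return spans


def apply_rules(tokens, rules):
    """Rewrite matched tokens character by character, preserving any * marker."""
    out = list(tokens)
    for source, spans in find_rule_spans(tokens, rules).items():
        target = rules[source]
        for start, _end in spans:
            for offset, char in enumerate(target):
                marker = "*" if out[start + offset].endswith("*") else ""
                out[start + offset] = char + marker
    return out


tokens = "其 れ で 寧 ろ 小 黨* 分 立 で 行 く 所 ま で 行 く が* よ* い*".split()
rules = {"黨": "党", "がよい": "が良い"}
print(find_rule_spans(tokens, rules))
print(" ".join(apply_rules(tokens, rules)))
```

With the sample sentence, `find_rule_spans` returns the same positions as the dictionary above: index 6 for 黨 and indices 17 to 19 for がよい.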
                     
                     
                     
                     
This example uses <eg> to control the formatting of a code sample but without escaping any characters:
var svg = d3.select('#graph') .append('svg') .style('width', 2000)
                         .style('height',1000);

<code>

  1. We represent textual variation with the <app> tag and indicate the sequence of the author’s corrections with the varseq attribute;
  2. We contain the textual variation within the element <seg type="l">, representing the reconstruction of the verse unit;
  3. We avoid the redundancy of empty elements by specifying which unit undergoes textual variation.
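The three encoding decisions listed above can be illustrated by reading such markup back in sequence order. The Python sketch below uses an invented verse fragment; the wording, the use of <rdg> inside <app>, and the omission of the TEI namespace are all assumptions made for brevity. The attribute is spelled `varSeq` here, as in TEI P5, whereas the list above writes it in lower case.

```python
import xml.etree.ElementTree as ET

# Hypothetical verse unit: only the word that varies is wrapped in <app>.
SAMPLE = """
<seg type="l">The <app>
  <rdg varSeq="2">silver</rdg>
  <rdg varSeq="1">golden</rdg>
</app> moon</seg>
"""


def readings_in_order(seg):
    """Return the readings of the first <app> in a verse unit, sorted by varSeq."""
    app = seg.find("app")
    readings = sorted(app.findall("rdg"), key=lambda rdg: int(rdg.get("varSeq")))
    return [rdg.text for rdg in readings]


verse = ET.fromstring(SAMPLE.strip())
print(readings_in_order(verse))
```

Because the variation is confined to one <app> inside the <seg>, the unchanged words of the verse are stated once rather than repeated for every stage of correction.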

Audio-visual media

[Audio files are currently provided using a simple link encoded as <ref> (which downloads an MP3 file).] Click for the accompanying audio interview.
[Our preferred future encoding uses the <media> element which would play the audio file from the browser, and also supports an alt text for accessibility.] Click for the accompanying interview:
[Video files are embedded using the <media> element.] Simon Buckingham Shum, Open University, and member of the Memetic VRE project, demonstrates Compendium as a dialogue mapping tool for use in the creative and performing arts. At the 14-minute mark, Ale Fernandez of Orchestra Cube speaks. He introduces Orchestra Cube and his conducting exercise, which uses a range of notational symbols streamed over the network. He discusses how dancer Kyra Norman, who is located in the second grid location (shown by videos with the 2cam suffix), may respond to both the streamed notation and the Orchestra's performance. Ale then introduces a series of exercises in which performers in each grid location participate. The documentation ends with all participants being thanked.
Figure 4. 
Locating Grid Technologies: Performativity, Place, Space: Challenging the Institutionalized Spaces of e-Science (2cam1). This camera points towards the rear of the Physics INSORS grid node. The camera is placed at the centre of the room space.
Figure 5. 
Locating Grid Technologies: Performativity, Place, Space: Challenging the Institutionalized Spaces of e-Science (2cam2). This camera points towards the rear of the Physics INSORS grid node. The camera is placed to the right of the room space.
Figure 6. 
Locating Grid Technologies: Performativity, Place, Space: Challenging the Institutionalized Spaces of e-Science (2cam3). This camera points towards the front of the Physics INSORS grid node. The camera is ceiling mounted at the centre of the room. The projected windows of the 4 camera views and streamed screen from the Graduate School of Education node are visible.
Figure 7. 
Locating Grid Technologies: Performativity, Place, Space: Challenging the Institutionalized Spaces of e-Science (2cam4). This camera points towards the rear of the Physics INSORS grid node. The camera is placed to the left of the room space.
Figure 8. 
Locating Grid Technologies: Performativity, Place, Space: Challenging the Institutionalized Spaces of e-Science (cam3). This camera points towards the front of the Graduate School of Education INSORS grid node. The camera is ceiling mounted at the centre of the room.
Figure 9. 
Locating Grid Technologies: Performativity, Place, Space: Challenging the Institutionalized Spaces of e-Science (cam1). This camera points towards the rear of the Graduate School of Education INSORS grid node. It is the centrally placed camera in the room.
Figure 10. 
Locating Grid Technologies: Performativity, Place, Space: Challenging the Institutionalized Spaces of e-Science (cam2). This camera points towards the rear of the Graduate School of Education INSORS grid node. The camera is placed to the left of the room space.
Figure 11. 
Locating Grid Technologies: Performativity, Place, Space: Challenging the Institutionalized Spaces of e-Science (cam4). This camera points towards the rear of the Graduate School of Education INSORS grid node. The camera is placed to the right of the room space.
Figure 12. 
Locating Grid Technologies: Performativity, Place, Space: Challenging the Institutionalized Spaces of e-Science. The video is produced through the Screen Streamer function within Memetic and shows Compendium being used.

Embedded "pass-through" code

["Pass-through" code is HTML or other code embedded in the TEI file that is passed through unchanged to the browser, for instance to allow for embedded visualizations or data portals.]
[This example comes from John Bradley, “A Prosopography as Linked Open Data: Some Implications from DPRR”.] What is SPARQL? Wikipedia starts its article about SPARQL by saying “SPARQL allows users to write queries against [...] data that follow the RDF specification of the W3C”. It works by allowing the SPARQL query creator to specify a pattern to look for in the RDF graph, and to display parts of the selected bits that match the pattern as results. This certainly is not the place to provide a tutorial on SPARQL, but here is an example of a query in it:

(A screen shot of what a browser shows when this is submitted is shown in the appendix as Figure 10.)
The query looks for graph patterns in the DPRR RDF data that show women who are also recorded as holding offices, and displays the woman's name and the name of the office. It is expressed in the SPARQL language, and the reader can doubtless see that it is not a trivial matter to learn to create queries of this kind, particularly for those without knowledge of related query languages such as XQuery for XML #xquery2018, or SQL for relational databases #sql2018. However, once it has been learned, it provides a powerful way to explore a complex set of RDF data, such as that found in DPRR.
[The following example comes from Rockwell and Sinclair, “Tremendous Mechanical Labor: Father Busa’s Algorithm”.] It is also worth noting that Busa’s project was large and different enough that they had their own non-standard cards, including some with bubbles for manual pencil marks. Figure 2 shows an example card from the Busa Archive with the areas and labels related to “Philological Analysis.” These printed zones could have been used by scholars to manually add annotations as needed for human sorting as part of hybrid machine/human processing. Or they could have been used for easily reading the punched data. Either way, it was common for punched cards in those days to have to be handled by both humans and by machines #casey1951. The Index Thomisticus project was no different; there would have been a tremendous amount of human labor associated with handling the stacks of cards. The human labor would have gone into preparing cards, moving them, checking them, annotating them and even hand sorting them. It would be interesting to try to recreate the flow of Busa’s hybrid processes, but that is for another project.
[The following example is a duplicate version of the Rockwell/Sinclair example above, but embedded in a <dhq:example> with an added heading to test how that behaves.]
Example 1. 
Example Punch-card Emulator

Downloadable code and data

DHQ allows authors to link to code and data files. The links should use <ref>. Here is a link to a Python notebook and a link to the accompanying data set. If we ever implement the ability to run code notebooks directly within DHQ, the data files are also included in the data directory for this article.

Mathematical Formulas with MathML, ASCIIMath, and TeX

ASCIIMath (currently deprecated)

Samples of ASCIIMath encoding. When `a != 0`, there are two solutions to `ax^2 + bx + c = 0` and they are `x = (-b +- sqrt(b^2-4ac))/(2a) .`

TeX

Sample of TeX encoding: When \(a \ne 0\), there are two solutions to \(ax^2 + bx + c = 0\), and they are $$x = {-b \pm \sqrt{b^2-4ac} \over 2a}.$$ Furthermore, Einstein proved decisively that the relationship between energy and mass involves the speed of light, following the formula \(E=mc^2\). I have no idea what these next examples prove, but I'm sure it's important: $$ {1 \over 10} + {1 \over 100} + {1 \over 1000} + {1 \over 10,\!000} + \dots $$ I also think we should not have periods following block-level formulae. This one seems especially interesting: $$\matrix{0 & 1\cr<0&>1}$$

MathML

MathML is not yet supported in the DHQ ODD, so the following examples are not valid. However, they should display correctly.
Inline sample 1 of MathML encoding: \(V = \frac{4}{3} \pi r^3\)
Inline sample 2 of MathML encoding: \(E = mc^2\)
Another inline sample. When \(a \ne 0\), there are two solutions to \(ax^2 + bx + c = 0\) and they are \(x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\).
For our study, we used the Granger causality test as follows. To identify a shaping relation, we test if variation in a specific word frequency for newspaper discourse (\(Y\)) at time \(t\) is predicted by variation in the frequency for the same word in advertisement discourse (\(X\)) at earlier time steps \(t-1, \dots, t-k\). We test for ‘\(X\) Granger-cause \(Y\)’ by comparing the performance of the ‘newspaper discourse only’ model:
$$y_t = \beta_0 + \beta_1 y_{t-1} + \dots + \beta_k y_{t-k} + \epsilon$$
with the full newspaper and advertisement discourses model:
$$y_t = \beta_0 + \beta_1 y_{t-1} + \dots + \beta_k y_{t-k} + \alpha_1 x_{t-1} + \dots + \alpha_m x_{t-m} + \epsilon$$
to identify which one does the better job at explaining word frequency (\(y_t\)) based on the residuals. The zero-model for the hypothesis is \(H_0: \alpha_i = 0\) for each \(i \in [1, m]\), with the alternative hypothesis being \(H_1: \alpha_i \ne 0\) for at least one \(i \in [1, m]\). We applied the test bi-directionally such that a shaping relation finds support if we can confirm that ‘\(X\) Granger-cause \(Y\)’, and in the case of a reflecting relationship we can reject ‘\(Y\) Granger-cause \(X\)’. Finally, if both ‘\(X\) Granger-cause \(Y\)’ and ‘\(Y\) Granger-cause \(X\)’ find support, we viewed this as indicative of a more complex relationship between the two time series.
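The comparison of the two models above can be sketched in plain Python. This is a minimal illustration, not the study's implementation: it assumes the residual comparison is made with the standard F statistic for nested least-squares models, it fits each model through the normal equations, and it runs on simulated series. All function names are hypothetical.

```python
import random


def solve(matrix, vector):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def rss(rows, targets):
    """Residual sum of squares of an ordinary least-squares fit."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f(x, y, k, m):
    """F statistic for H0: alpha_1 = ... = alpha_m = 0 (x does not help predict y)."""
    times = range(max(k, m), len(y))
    targets = [y[t] for t in times]
    # Restricted model: intercept plus k lags of y.
    restricted = [[1.0] + [y[t - i] for i in range(1, k + 1)] for t in times]
    # Full model: the same columns plus m lags of x.
    full = [row + [x[t - j] for j in range(1, m + 1)]
            for row, t in zip(restricted, times)]
    rss_restricted, rss_full = rss(restricted, targets), rss(full, targets)
    dof = len(targets) - (1 + k + m)
    return ((rss_restricted - rss_full) / m) / (rss_full / dof)


# Simulated data in which x drives y with a one-step delay.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.0]
for t in range(1, 300):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.gauss(0, 0.5))

print("F for X -> Y:", granger_f(x, y, k=1, m=1))
print("F for Y -> X:", granger_f(y, x, k=1, m=1))
```

A large F statistic in one direction and a small one in the other corresponds to the one-directional case described above; in practice the statistic would be compared against an F distribution to obtain a p-value, which this sketch leaves out.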

Notes

[1] XML, or eXtensible Markup Language, is a widely-adopted metalanguage that specifies a set of rules for encoding documents and data and for creating application- and domain-specific markup languages for encoding documents and data. XML is a recommendation of the World Wide Web Consortium. See [Bray 2008].
[2] I share with many comics creators and scholars a dissatisfaction with the term graphic novel, finding it unnecessary, misleading and perpetuating of a false distinction. Wikipedia's article on Graphic novel includes a useful summary of many of the criticisms of the term.
[3] An example is Peter David's Mascot to the Rescue! [David 2008], a superhero-themed children’s novel that integrates comics content with more traditional narrative prose. Brian Selznick’s The Invention of Hugo Cabret [Selznick 2007] is an illustrated novel, with many of the illustrations subdivided into juxtaposed images reminiscent of comics panels.
[4] Readers from the scholarly markup and digital humanities communities will be familiar with many of the general issues about text encoding discussed here. I am also hopeful that this essay will attract readers from the comics scholarship community who may not be as familiar with text encoding, and so I go into more detail about general issues of text encoding than I might otherwise.
[5] The reader wishing to test whether or not it is indeed “plausible to imagine discussions over whether a given way of marking up a text is correct or incorrect” is invited to browse the TEI Mailing List (TEI-L) Archive, where she will find many lively and energetic debates on markup practices.
[6] Hundreds of scholarly projects are based upon underlying TEI-encoded texts and data. Examples include my own projects: The Algernon Charles Swinburne Project and Chymistry of Isaac Newton, a collaboration with William R. Newman, Professor of History of Science at Indiana University. While not an exhaustive list, many other TEI-based projects may be found at http://www.tei-c.org/Activities/Projects/.
[7] See chapter 23 “Using the TEI” of the TEI Guidelines [TEI 2010b] and Roma, an online tool for “generating validators for the TEI.”
[8] A quantitative study of the scarcity of videogames was conducted in the UK in 2008. The study took a sample of videogame consoles, as well as 50 video game cartridges for the Atari 2600, and examined their availability in archives, on internet sale sites, and as illegal ROMs for download and play on emulators. The study determined that pirated ROMs present the most available resource for games, and that all of the archives found to contain the selected games were located in the United States. The recent opening of the UK National Videogame Archive may change this dispersal problem, but the rarity of authentic, original game copies represents a challenge to those who seek to preserve them. Paul Gooding and Melissa Terras, “Grand Theft Archive: A Quantitative Analysis of the State of Computer Game Preservation,” International Journal of Digital Curation 3 (2008): http://www.ijdc.net/index.php/ijdc/article/view/85.
[9] In this paper we propose alternate conceptions of “authenticity” for video games that may align better with their complex version histories. See the “Preservation Through Adaptation” and “Patterns of Interaction Across Communities” sections, as well as “Recommendations.”
[10] For a discussion of intellectual property rights as they relate to the economic value chain of the video game industry, see James Conley, et al., “Use of a Game Over: Emulation and the Video Game Industry,” Northwestern Journal of Technology and Intellectual Property 2 (2004): http://www.law.northwestern.edu/journals/njtip/v2/n2/3/. For an opposing viewpoint (that hacking and piracy benefit society), see Will Jordan, “From Rule-Breaking to ROM-Hacking: Theorizing the Computer Game-as-Commodity,” Situated Play: Proceedings of DiGRA 2007 Conference (2007): 708–713. For a discussion of what archivists and librarians can learn from video game players about IP reform, see Kari Kraus, “‘A Counter-Friction to the Machine’: What Game Scholars, Librarians, and Archivists Can Learn from Machinimists about User Activism.” Journal of Visual Culture (special Machinima-themed issue guest-edited by the Stanford Humanities Lab) 10 (2011): 100–112.
[11] For a detailed example of the essential characteristics of videogames, and a preservation model using PLANETS, see M. Guttenbrunner, C. Becker, and A. Rauber, “Evaluating Strategies for the Preservation of Console Video Games,” Proceedings of the Fifth International Conference on Preservation of Digital Objects (iPRES 2008), London, UK, September 29–30, 2008 http://www.ifs.tuwien.ac.at/~guttenbr/pubs/guttenbrunner_ipres2008.pdf. The IMLS-funded project directed by Jerry McDonough, “Preserving Virtual Worlds II: Methods for Evaluating and Preserving Significant Properties of Educational Games and Complex Interactive Environments,” with which Kraus and Donahue are both involved, aims to address related issues.
[12]  As an example, Sierra’s Mystery House was not just a first for the company — it was the first text adventure to incorporate graphics. An amazing breakthrough at the time, it seems crude in comparison to today’s realistic 3D environments.

Works Cited

Bray 2008 Bray, Tim, et al. Extensible Markup Language (XML) 1.0. 5th ed. World Wide Web Consortium, 2008. Web. 11 May 2012. http://www.w3.org/TR/REC-xml/.
David 2008 David, Peter. Mascot to the Rescue!. New York: Harper Collins, 2008. Print.
Flanders 2005 Flanders, Julia. “Digital Humanities and the Politics of Scholarly Work.” Diss. Brown U, 2005. Web. http://dev.stg.brown.edu/staff/Julia_Flanders/pubs/flanders_dissertation.xhtml.
Lee and Kirby 1966a Lee, Stan and Jack Kirby. Fantastic Four #51 (June 1966), Canam Publishers [Marvel Comics]. Print.
McCloud 1993 McCloud, Scott. Understanding Comics: The Invisible Art. Northampton: Tundra, 1993. Print.
McGann 2001 McGann, Jerome. “Rethinking Textuality.” Radiant Textuality: Literature After the World Wide Web. New York: Palgrave, 2001. 137-160. Print.
Selznick 2007 Selznick, Brian. The Invention of Hugo Cabret. New York: Scholastic, 2007.
Sperberg-McQueen, Huitfeldt, and Renear 2000 Sperberg-McQueen, C. M., Claus Huitfeldt, and Allen Renear. “Meaning and Interpretation of Markup.” Markup Languages: Theory & Practice 2.3 (2000): 215-34.
TEI 2010b  TEI Consortium, ed. “Using the TEI.” TEI P5: Guidelines for Electronic Text Encoding and Interchange. Vers. 1.6.0. TEI Consortium, 20 Feb. 2010. Web. 4 Apr. 2010. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/USE.html.
TEI 2010c  TEI Consortium, ed. TEI P5: Guidelines for Electronic Text Encoding and Interchange. Vers. 1.6.0. TEI Consortium, 20 Feb. 2010. Web. 4 Apr. 2010. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/.
CC0
To the extent possible under law, the author(s) have waived all copyright and related or neighboring rights to this work.