DHQ: Digital Humanities Quarterly
Editorial
Counting Feeling: Affect Theory and Sentiment Analysis in TextBlob
Introduction
Sentiment analysis encompasses a range of computational techniques for detecting and
quantifying the presence of affect and emotion in written texts [Liu 2015]. Already, in this first sentence, we find sites for definitional, methodological,
and even moral contestation. Across fields such as psychology, philosophy, literary
studies, computer science, to say nothing of their countless interdisciplinary sub-specializations,
the questions “what is emotion?”, “how is emotion inhered within or otherwise expressed
through writing?”, and “how might computers extract or otherwise represent such emotion
in informatic forms?” would seem to have no easy answers; for some fields, even attempting
to answer them at all is a category error. For precisely these reasons, I want to
offer sentiment analysis as a provocation to the digital humanities (DH). DH has long
defined itself by the application of computational technologies to humanistic inquiry,
and, to a lesser extent, vice versa: the application of humanities questions to the
study of technology itself.
[1]
Sentiment analysis, by appealing simultaneously to the humanistic idea of emotion
and the computational idea of its quantification — all mediated through the messy,
muddled, miraculous middle of text — stages some of the contradictions inherent to
DH as a field while offering generative ground from which to project its possible
futures, particularly in the current artificial intelligence moment.
A subfield of natural language processing (NLP), itself a field of computer science
research interested in the machine address of human languages, sentiment analysis
uses statistical models to attempt to measure such quantities as a text’s “polarity”
(where it falls on a basic scale of positivity and negativity) or even more complex
blends of emotional qualities such as anger, joy, or sarcasm. Computer science has
engaged sentiment analysis since the field’s inception, with its key research questions
latent in early artificial intelligence research such as that of [Turing 1950] or [Stone 1966]. Sentiment analysis was mainly a research curiosity for much of the twentieth century.
It was only with the emergence of the internet and concomitant increases in computing
power from the 1990s onwards that sentiment analysis became a practical undertaking.
In the 2010s, sentiment analysis became more attractive to businesses: for example,
through such tasks as analyzing consumers’ internet posts about a new product to determine
its reception. Research towards sentiment analysis furthermore undergirds, at least
in part, the large language model (LLM) boom of the early 2020s, which suggests larger
political and philosophical concerns around the automated recognition of emotion in
digital texts en masse.
Today, sentiment analysis undergirds consumer-facing technologies such as chatbots
and algorithmic recommender systems, as well as behind-the-scenes applications such
as data mining and marketing analytics. It has also found purchase in DH research.
A search of this very publication surfaces roughly three dozen applications of sentiment
analysis to such DH programs as the automatic recognition of rhetorical features in
texts; sentiment analysis also undergirds such tools as DH scholar Matthew Jockers’
syuzhet program, a code library written in the language R for “the extraction of sentiment and sentiment-based plot arcs from text” [Jockers 2014]. As such, a critical study of sentiment analysis as a technology has the potential
to speak to many aspects of contemporary algorithmic culture, alongside more theoretical
and methodological questions facing those scholars interested in the computational
address of texts. In this essay, I use the techniques of critical code studies to
offer a case study of one such sentiment analysis tool called TextBlob. TextBlob,
developed from 2013 to the present day by software engineer Steven Loria, is a Python
library that provides simple, off-the-shelf tools for NLP tasks, including sentiment
analysis. It has two chief virtues for this kind of critical study: first, it is sufficiently
small in scope, such that scholars (myself included) might be able to practice meaningfully
the kind of close-code reading that critical code studies demands; and second, TextBlob
is commonly used in classroom instruction and preliminary DH research, making it an
especially effective tool through which to articulate some of the implications for
sentiment analysis research to DH specifically.
I’ll restrict my focus to one sub-program within TextBlob more generally: en/sentiments.py,
a 97-line file that wraps and compiles functions for breaking down textual input into
composite parts, assigning those parts polarity scores, and computing overall scores.
[2]
en/sentiments.py also contains the core implementation for two distinct sentiment
analysis functions, named PatternAnalyzer and NaiveBayesAnalyzer, which affords the
opportunity in this essay for their comparative study. Through this case study, I
advance two intertwined claims about TextBlob. First, I map TextBlob’s reliance on
a web of programmatic and textual dependencies, and how in turn, TextBlob effaces
these dependencies’ formal specificities in the service of computational processing.
From a technical perspective, en/sentiments.py does little computational lifting of
its own. Instead, it draws together prior work from a range of other Python tools,
in particular the libraries Pattern and NLTK, alongside textual corpora pre-packaged
with these tools.
[3]
While such dependencies are commonplace in computer code, TextBlob is notable for
how it instrumentalizes a range of culturally and materially disparate sources in
the service of providing its putatively “objective” polarity calculations.
Mapping these dependencies supports this essay’s second claim: that TextBlob models
sentiment as programmatically latent in the smallest particles of language, which
it then seeks to make available for extraction and computation. TextBlob envisions
sentiment as discrete and encoded within individual words, in short, as data. These data, in turn, are building blocks upon which sentiment analysis researchers
develop more sophisticated models of affect, or the primary biological impulses which give rise to emotions and moods.
[4]
Drawing on work on affect and information by Silvan Tomkins, Eve Kosofsky Sedgwick,
and N. Katherine Hayles, among others, I argue that TextBlob’s model of the relationship
between sentiment, affect, and word undergirds a textual laundering inherent to much
contemporary sentiment analysis, whereby culturally situated judgments (scraped, say,
from corpora of book or movie reviews, both of which find their way into the TextBlob
code base) come to stand for objective fact. Emotion, in turn, becomes available for
computational extraction and instrumentalization. Whether or not TextBlob is empirically
successful at identifying affect — if indeed affect is computationally identifiable
at all — is secondary to the conceptual work of inventing affect through this textual laundering. Sentiment analysis does not measure sentiment:
it creates it, and the circumstances of sentiment’s creation have significant ramifications
both for the critical work of the digital humanities and our contemporary algorithmic
cultures more generally.
1
The program of linking affect to data is not unique to computer science. Early in
the first volume of his magnum opus Affect, Imagery, Consciousness, published in 1962, psychologist and foundational affect theorist Silvan Tomkins
articulates the study of affect in relation to a then-unusual parallel field: the
study of artificial intelligence. Unlike many of his contemporaries in the computer
sciences (whom he scorns as "temperamentally unsuited to create and nurture mechanisms" capable of true judgment [Tomkins 2008]), Tomkins argues that the capacity for computational intelligence as such depends
on the creation of an “affect system.” An affect system is a theoretically discrete
neurobiological system that mediates external stimuli into physical and emotional
experience. Tomkins’ classic example is that of asphyxiation: the stimulus of constricted
breath activates the specific affective channel of “fear,” which an individual then
mediates, based on their own lived experience, into a range of emotional reaction.
In this model, affect is a “system” organized around specific “programs,” which activate
“rewarding and punishing characteristics,” in essence, a feedback mechanism [Tomkins 2008].
In their 1995 essay introducing Tomkins’ affect theory to literary studies, “Shame in the Cybernetic Fold: Reading Silvan Tomkins”, Eve Kosofsky Sedgwick and Adam Frank position his work within the cybernetic milieu
of the post-‘45 United States. Affect theory emerges, they argue, within intellectual
frameworks of systems theory, “fold[ing]” across the technological and biological,
the digital and the analog [Sedgwick and Frank 1995]. For Sedgwick and Frank, Tomkins’ work reorients literary theory toward richer considerations
of interdisciplinarity and “the dynamics of consensus formulation” across fields [Sedgwick and Frank 1995]. This is a particularly rich vein of consideration for the study of sentiment analysis,
given that the technology is inherently polysemous, with charged terms such as “affect”
and “emotion” signifying quite differently across its related fields. But more to
the point, Sedgwick and Frank point to affect’s explicitly computational intellectual
history as a term. Addressing affect as information or data that circulates within
and fine-tunes cognitive systems, whether human or machine, is therefore a lineage
that begins not with sentiment analysis as a technology, but rather affect’s emergence
as a concept more generally.
[5]
Loria describes TextBlob as a “library for processing textual data” [Loria 2020]. Before it can compute relationships across data, it must first mediate texts into data. Many core NLP tasks concern the reduction of complex textual inputs into smaller
lexical units, which computers can more easily process and compare statistically.
One such technique deployed in en/sentiments.py is called “tokenization.” In NLP,
a token refers to an arbitrary unit of lexical information. A program may tokenize
a paragraph into sentences, a sentence into words, a word into syllables, or a syllable
into letters, depending on the researchers’ questions and interests [Manning et al 2008]. Programs can further reduce related words (for instance, different inflections
of the same verb) through computational processes such as stemming (crudely stripping
affixes from word forms) and lemmatization (resolving inflected forms to a common dictionary base form, or lemma).
TextBlob’s operative level of tokenization is that of individual words. Line 11 in
en/sentiments.py imports a tokenization function from elsewhere in the code base,
itself lightly adapted from and wrapping a related function in NLTK. Line 85 demonstrates
this function in action, applying word_tokenize to the input text and making the resulting
tokens available for further filtering and feature extraction.
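To make the operation concrete, a brief interactive sketch (assuming a standard TextBlob installation; output shown is indicative) demonstrates word-level tokenization through the library's documented words property, which draws on the same word_tokenize function:
>>> from textblob import TextBlob
>>> TextBlob("Texts, it turns out, are not quieted so easily.").words
WordList(['Texts', 'it', 'turns', 'out', 'are', 'not', 'quieted', 'so', 'easily'])
# The sentence survives only as a list of word tokens; punctuation is dropped.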
The decision to tokenize at the level of words has two consequences for TextBlob’s
sentiment analysis functions. First, it renders word order irrelevant. For example,
PatternAnalyzer, TextBlob’s primary sentiment analysis function (recall that TextBlob has
two built in), assigns the example sentence “I love those who hate me” a –0.15 polarity score, indicating slight negativity. Rearranging the sentence to
“I hate those who love me,” theoretically reversing the sentence’s meaning, returns the same score. From this
follows the second consequence: namely, that TextBlob inheres affective meaning within single words, stripped bare of syntactical context and morphological derivation.
A programming choice perhaps designed to lower the computational load thus produces
a cognitive and conceptual model.
[6]
Furthermore, TextBlob imagines sentiment within these words as mathematically fixed,
capable of adding to or subtracting from the sentiments of other words, but not changing
based on other linguistic characteristics.
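The point can be checked directly from the interpreter. A short sketch (assuming a standard TextBlob installation) reproduces the two sentences above; because the analysis averages word-level scores without regard to position, both orderings return the same polarity:
>>> from textblob import TextBlob
>>> TextBlob("I love those who hate me").sentiment.polarity
-0.15
>>> TextBlob("I hate those who love me").sentiment.polarity
-0.15
# Identical scores: only the bag of words matters, not their arrangement.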
Due to the nested dependencies and imported functions in en/sentiments.py, it takes
sleuthing to see where and how TextBlob defines these eternal constants. For PatternAnalyzer,
the process of assigning and computing polarity scores happens with the terse invocation
of pattern_sentiment(text), a function imported into, rather than defined within,
this file. We can follow this function along a chain of interoperable files to en-sentiment.xml,
a lexicon file containing 2,918 individual words with associated polarity scores.
Entries follow a standardized format:
<sentiment language="en" version="1.3" author="Tom De Smedt, Walter Daelemans" license="PDDL">
...
<word form="airheaded" cornetto_synset_id="n_a-507793" wordnet_id="a-02120828" pos="JJ" sense="lacking seriousness" polarity="0.5" subjectivity="1.0" intensity="1.0" confidence="0.8" />
<word form="alarming" cornetto_synset_id="n_a-527099" wordnet_id="a-00193015" pos="JJ" sense="frightening because of an awareness of danger" polarity="-0.1" subjectivity="0.6" intensity="1.0" confidence="0.8" />
<word form="alas" wordnet_id="" pos="UH" polarity="-0.4" subjectivity="1.0" intensity="1.0" confidence="0.8" />
...
</sentiment>
[7]
Just as Loria imports the Pattern library’s algorithmic functionality, so too does
he re-use De Smedt and Daelemans’ lexicon. In their 2012 paper introducing Pattern,
De Smedt and Daelemans describe using the web scraping tool to produce the lexicon
itself: “We mined online Dutch book reviews and extracted the 1,000 most frequent adjectives.
These were manually annotated with positivity, negativity, and subjectivity scores,” a task they repeated with a number of European languages, including English [De Smedt and Daelemans 2012]. “Manual annotation” means exactly what it sounds like: De Smedt and Daelemans hand-tagged words with polarity scores (or computationally inferred them, for words with related senses and meanings). These scores came from their own critical judgment rather than from any computational process.
While this practice may raise eyebrows among an audience of digital humanists — it seems
an excellent way to encode a whole range of unattested biases and subjectivities into
the program, for one — it’s a standard approach in NLP. While programmers often document
such development processes in code comments and research papers, these innate subjectivities
get effaced in the movement from data to code. Here I want to emphasize that I am
not accusing Loria or any NLP researchers of intentionally black-boxing the textual
laundering that necessarily occurs in the production of these corpora. De Smedt and
Daelemans have accounted for their methodologies in their published work, and these
methodologies follow best practices within their field. Rather, my point is that the
operation of encoding these corpora within TextBlob or any similar sentiment analysis tool is where effacement
occurs. In order to make language “work” as data, TextBlob strips it clean of context,
even as that context is what allowed the initial humans composing these corpora to
make judgments in the first place.
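A schematic sketch, using a hypothetical two-entry lexicon drawn from the XML excerpt above rather than Pattern's actual implementation, illustrates the reduction at issue: once loaded for computation, each richly annotated entry survives only as a bare number, and a sentence's polarity becomes an average over whichever of its words happen to appear in the lexicon:
# A toy illustration of lexicon-based scoring, not Pattern's or TextBlob's own code.
# Sense, part of speech, and annotator judgment all collapse into a single float.
toy_lexicon = {"alarming": -0.1, "alas": -0.4}   # polarity values from en-sentiment.xml above

def toy_polarity(text):
    words = text.lower().split()
    scores = [toy_lexicon[w] for w in words if w in toy_lexicon]
    # Words absent from the lexicon contribute nothing; no matches at all reads as "neutral".
    return sum(scores) / len(scores) if scores else 0.0

print(toy_polarity("the results were alarming alas"))   # prints -0.25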
While PatternAnalyzer uses simple averaging to compute its scores, the second sentiment
analysis program mapped in en/sentiments.py, named NaiveBayesAnalyzer, deploys machine
learning techniques. True to its name, it uses a naïve Bayes algorithm to assign probable
(rather than definitive) polarity scores. Named after Reverend Thomas Bayes, an eighteenth-century
English mathematician, a Bayesian algorithm is one based on Bayes’ Theorem, which
offers a simple yet effective statistical model for predicting the likelihood of a
given event.
[8]
A Bayesian algorithm is “naïve” when it presumes that all units of information are
discrete and have no other statistically meaningful relationships: for example, when
one computes a sentence’s polarity scores based on individual words alone, regardless
of syntactical context. Even given NaiveBayesAnalyzer’s relative sophistication when
compared to PatternAnalyzer, they share many of the same conceptual underpinnings.
Both decompose inputs into individual words, in doing so eliding finer points of
lexical and syntactical relation. I make this observation, once again, not to dismiss
the projects of either; sentiment analysis is an extraordinarily taxing operation,
both at the level of conceptual development and machine operations. The task demands
simplification, compression, and abstraction. What I am suggesting is that programs
such as TextBlob could do more, from a user perspective, to flag the necessarily subjective
models that underpin their analysis — particularly in the LLM moment, when the validity
of automated computer-generated textual analysis is increasingly taken as a given
by the public at large. Doing so would necessarily undercut sentiment analysis’s claims
to objective utility. But in exchange, it would more accurately articulate what the
technology actually does: namely, develop (imperfect, partial, but potentially useful)
models of critical judgment.
Because TextBlob provides two distinct sentiment analysis implementations, it offers
us the opportunity to see how such models diverge. The following code runs the first
sentence of this essay through both PatternAnalyzer and NaiveBayesAnalyzer:
# Runs the example sentence through `PatternAnalyzer`
>>> from textblob import TextBlob
>>> blob = TextBlob("Sentiment analysis encompasses a range of computational techniques for detecting and quantifying the presence of affect and emotion in written texts.")
>>> blob.sentiment.polarity
0.0
# `PatternAnalyzer` returns a score of zero, indicating no positivity or negativity.
# In reality, this means that none of the sentence's words were present in `en-sentiment.xml`.

# These next commands call `NaiveBayesAnalyzer` specifically.
>>> from textblob.sentiments import NaiveBayesAnalyzer
>>> blob = TextBlob("Sentiment analysis encompasses a range of computational techniques for detecting and quantifying the presence of affect and emotion in written texts.", analyzer=NaiveBayesAnalyzer())
>>> blob.sentiment
Sentiment(classification='pos', p_pos=0.9916363422665231, p_neg=0.008363657733478571)
# `NaiveBayesAnalyzer` returns a sharply positive score, with a hint of negative characteristics.
Here, the model’s limitations become clear. While PatternAnalyzer appears to compute
the sentence as perfectly neutral, in fact the score of 0.0 indicates that none of
its constituent words appeared in De Smedt and Daelemans’ lexicon. NaiveBayesAnalyzer
fares a tad better, given that it returns a score at all. However, as a human reader
(to say nothing of the sentence’s author), I will admit that its sharply positive
assessment gives me pause. Where do we locate such affirmative affect within an admittedly
and self-consciously dry academic sentence? The gulf between human and machine interpretation
becomes evident. N. Katherine Hayles argues that machine reading is principally distinct
from human reading in its focus on interior, statistical connections between textual
data rather than expansive, mediated contexts [Hayles 2018]. This is not to privilege one form of reading over another, although certainly from
a human perspective, we may find the machine’s ability to “correctly” interpret these
texts lacking. On the one hand, from my position as a critic trained in literary and
media studies, I am skeptical that TextBlob’s sentiment analysis functions can do
what they claim. I root this skepticism not in any particular valorization of human
judgment (although my ethical and political convictions in the LLM moment admittedly
encourage me to do so), but rather in doubts about the capacity of any statistical
system to model human judgment, whether one as simple as TextBlob’s or as potentially
complex as ChatGPT’s. On the other hand, I am intrigued by sentiment analysis’s capacities
to model different forms of critical judgment, and in doing so remove them from an
exclusively human context. When NaiveBayesAnalyzer returns a sharply positive score
for my own prose, I take that as a moment less to disagree with the machine’s interpretation
and more to ask after the conditions that gave rise to such an interpretation: what
the machine is doing with and to my text. This may not tell me anything new about the sentence itself,
or at least nothing that I could not have already said as a human reader, but it does
have the capacity to tell me more about the underlying models that power the program’s
interpretation. Texts, it turns out, are not quieted so easily.
PatternAnalyzer and NaiveBayesAnalyzer differ not only in how they compute sentiment
statistically, but also in the root sources of their judgments. PatternAnalyzer, we
have seen, draws its claims from the hand-tagged Pattern lexicon, in which each individual
word arrives imputed with a discrete sentiment score, manually tagged by human readers.
NaiveBayesAnalyzer, conversely, derives root scores programmatically by training a
rudimentary machine learning model on a corpus of movie reviews assembled by NLP researchers
Bo Pang and Lillian Lee [Pang and Lee 2005]. Pang and Lee’s work is influential in
NLP and sentiment analysis research for addressing various computational problems
related to the relationships between ratings systems and language: is there, for instance,
a textual pattern, discernable at the level of tokenization, that distinguishes a
two-star review from a four-star review? The various corpora they have assembled over
the years as part of this research program have subsequently become popular readymades
for programmers like Loria, who need well-formatted textual-numerical data upon which
to derive sentiment analysis models like NaiveBayesAnalyzer. In TextBlob, this training
happens in lines 72–81:
72 """Train the Naïve Bayes classifier on the movie review corpus."""73 super(NaiveBayesAnalyzer, self).train()74 neg_ids = nltk.corpus.movie_reviews.fileids('neg')75 pos_ids = nltk.corpus.movie_reviews.fileids('pos')76 neg_feats = [(self.feature_extractor(77 nltk.corpus.movie_reviews.words(fileids=[f])), 'neg') for f in neg_ids]78 pos_feats = [(self.feature_extractor(79 nltk.corpus.movie_reviews.words(fileids=[f])), 'pos') for f in pos_ids]80 train_data = neg_feats + pos_feats81 self._classifier = nltk.classify.NaiveBayesClassifier.train(train_data)
In plain English, these lines call various submodules from NLTK (another dependency
moment) to define both positive and negative “features,” or numerical representations
of sentiment scores, derived from the Pang and Lee movie review corpus. These features
are then assembled in line 80 into training data, which are used in line 81 to train
the classifier itself.
We are now looking the black box in its vacant eye. What we see is both opaque and
prosaic: opaque in that the fundamental features upon which NaiveBayesAnalyzer construes
its judgments are hidden from us; yet prosaic in that we see, as plainly as one and
one make two, the roots of the training data in line 80. That these data are derived
from movie reviews is not — cannot be — incidental to NaiveBayesAnalyzer’s judgment,
for the same reason that no human judgment is arbitrary (even if it can be capricious):
there are always priors, and those priors ramify for the simple reason that if they
were different — if the training data were otherwise — the judgments would be too.
Here, I want once again to stay cautious about generalizing familiar models of human
judgment onto machines. We can speculate on the specific formal consequences
of the Pang and Lee corpus, derived as it is from movie reviews, being so popular
as a readymade in NLP implementations. How one speaks of a movie is not the same as
how one speaks of a device, or a student, or a loved one, or the weather, or any of
the countless topics upon which we are called to issue judgments in the contemporary
world. But given the classifier’s opacity, even with access to the features themselves
it would be challenging to claim with any certainty the existence of causal relationships
between the logic of movie reviews and subsequent algorithms. (It might be, for instance,
just as likely that any determinative logic from the movie reviews gets thoroughly
decomposed in the classification process, a regression-to-the-mean familiar to any
user of an LLM, in which the particularities of underlying corpora evaporate into
a vague grey goo of text.)
What we can say, however, is that the Pang and Lee corpus of movie reviews, alongside similar
corpora of product reviews or social media posts, emerge from a shared rhetorical
situation: namely, one in which participants in the modern internet are invited, repeatedly
and at length, to offer their opinion about things and to encode those opinions within
numerical systems. The endless invitation to grade objects, experiences, and our fellow
humans on a five-star scale is one of the underappreciated burdens of modern life.
It is a computational-capitalist logic that mediates the actually existing world into
datasets that are subsequently available for precisely the kinds of implementations
we find in TextBlob: pieces of software that emulate human judgment. Or rather, that
emulate humans who have been prompted to behave as machines — not judging but rating, our messy, textured opinions reduced to a clean,
legible numerical scale. We have come full circle, it seems, from Tomkins finding
in the machine a model for human emotion and cognition. Now we have machines that
replicate humans behaving like machines.
2
en/sentiments.py provides a brief if telling glimpse into the conceptual assumptions
and encoded infrastructures underpinning sentiment analysis: that language is fundamentally
reducible to mathematical information; that statistical techniques can construe meaningful
relationships across this information; and that these mathematical relationships can
be rendered meaningful again in the register of language. In addressing affect specifically, en/sentiments.py
participates in what Patricia Ticineto Clough and her collaborators have called the
“datalogical turn,” or “how the algorithms that parse big data are an intensification of . . . [an] unconscious
drive to empiricism, positivism, and scienticism” [Matviyenko 2015]. To these we could also add work by scholars such as Safiya Noble and Ruha Benjamin
on algorithmic bias, how “automatic” judgments are never thus, but rather encode the
presumptions of those humans who create, use, and maintain judging technologies [Noble 2018], [Benjamin 2019].
I have argued in this essay that there is a fundamental slippage between what sentiment
analysis claims to do and what it actually does. Rather than identify and quantify
empirically existing affect within language (a contestable claim in the best of circumstances),
sentiment analysis as implemented in TextBlob creates affect as a schema from a heterogeneous array of pre-existing judgments, in turn
flattening these judgments into a computational voice of God, speaking at once from
everywhere and nowhere. While NLP research understandably focuses on improving the
accuracy of such underlying models — “accuracy” here serving synecdochally for concepts
such as neutrality, objectivity, and factuality — my interest as a digital humanist
in technologies such as TextBlob is not with their always deferred capacity to determine
a text’s affect, but rather with affect’s participation in a theoretical (and indeed,
quite material) project of equating the human spirit to that of the machine.
[9]
As digital humanists, we have an obligation to surface the cultural work that technologies
perform before adopting them into our enterprise. In the case of sentiment analysis,
this entails acknowledging how the technology’s epistemic operations rest on the elimination
(or at best instrumentalization) of context — perhaps the fundamental unit of humanistic
inquiry.
I am not dissuading scholars from using these tools; I myself have found them generative
in my work chiefly for the strangeness of their readings. (I am not alone, I suspect,
in finding the whole AI enterprise far more interesting when its outputs resembled
humans less and machines more.) I hope in this case study to have suggested terrain
for further experimentation with sentiment analysis, if for nothing else than to model
ways of working with these technologies that differ from the tech industry’s
naked denigration of human judgment. These are qualities that a critical code studies
reading can surface, and that I argue are integral to reshaping such technologies
along more creative, expressive, and ethical lines.
Annotations
- File: en/sentiments.py
- Programming language: Python
- Developed: 2013–Present day
- Principal author: Steven Loria
- Platform: Cross-platform (Windows, macOS, Linux)
- Libraries used: NLTK
- Source file: https://github.com/sloria/TextBlob/blob/dev/textblob/en/sentiments.py
- Interoperating files: base.py, sentiments.py, text.py, en/__init__.py, en/en-sentiment.xml
Annotation continued
- # -*- coding: utf-8 -*-
- """Sentiment analysis implementations.
- .. versionadded:: 0.5.0
- """
- from __future__ import absolute_import
- from collections import namedtuple
- import nltk
- from textblob.en import sentiment as pattern_sentiment
- from textblob.tokenizers import word_tokenize
- from textblob.decorators import requires_nltk_corpus
- from textblob.base import BaseSentimentAnalyzer, DISCRETE, CONTINUOUS
- class PatternAnalyzer(BaseSentimentAnalyzer):
- """Sentiment analyzer that uses the same implementation as the
- pattern library. Returns results as a named tuple of the form:
- ``Sentiment(polarity, subjectivity, [assessments])``
- where [assessments] is a list of the assessed tokens and their
- polarity and subjectivity scores
- """
- kind = CONTINUOUS
- # This is only here for backwards-compatibility
- # The return type is actually determined upon calling analyze()
- RETURN_TYPE = namedtuple('Sentiment', ['polarity', 'subjectivity'])
- def analyze(self, text, keep_assessments=False):
- """Return the sentiment as a named tuple of the form:
- ``Sentiment(polarity, subjectivity, [assessments])``.
- """
- #: Return type declaration
- if keep_assessments:
- Sentiment = namedtuple('Sentiment', ['polarity', 'subjectivity', 'assessments'])
- assessments = pattern_sentiment(text).assessments
- polarity, subjectivity = pattern_sentiment(text)
- return Sentiment(polarity, subjectivity, assessments)
- else:
- Sentiment = namedtuple('Sentiment', ['polarity', 'subjectivity'])
- return Sentiment(*pattern_sentiment(text))
- def _default_feature_extractor(words):
- """Default feature extractor for the NaiveBayesAnalyzer."""
- return dict(((word, True) for word in words))
- class NaiveBayesAnalyzer(BaseSentimentAnalyzer):
- """Naive Bayes analyzer that is trained on a dataset of movie reviews.
- Returns results as a named tuple of the form:
- ``Sentiment(classification, p_pos, p_neg)``
- :param callable feature_extractor: Function that returns a dictionary of
- features, given a list of words.
- """
- kind = DISCRETE
- #: Return type declaration
- RETURN_TYPE = namedtuple('Sentiment', ['classification', 'p_pos', 'p_neg'])
- def __init__(self, feature_extractor=_default_feature_extractor):
- super(NaiveBayesAnalyzer, self).__init__()
- self._classifier = None
- self.feature_extractor = feature_extractor
- @requires_nltk_corpus
- def train(self):
- """Train the Naïve Bayes classifier on the movie review corpus."""
- super(NaiveBayesAnalyzer, self).train()
- neg_ids = nltk.corpus.movie_reviews.fileids('neg')
- pos_ids = nltk.corpus.movie_reviews.fileids('pos')
- neg_feats = [(self.feature_extractor(
- nltk.corpus.movie_reviews.words(fileids=[f])), 'neg') for f in neg_ids]
- pos_feats = [(self.feature_extractor(
- nltk.corpus.movie_reviews.words(fileids=[f])), 'pos') for f in pos_ids]
- train_data = neg_feats + pos_feats
- self._classifier = nltk.classify.NaiveBayesClassifier.train(train_data)
- def analyze(self, text):
- """Return the sentiment as a named tuple of the form:
- ``Sentiment(classification, p_pos, p_neg)``.
- """
- # Lazily train the classifier
- super(NaiveBayesAnalyzer, self).analyze(text)
- tokens = word_tokenize(text, include_punc=False)
- filtered = (t.lower() for t in tokens if len(t) >= 3)
- feats = self.feature_extractor(filtered)
- prob_dist = self._classifier.prob_classify(feats)
- return self.RETURN_TYPE(
- classification=prob_dist.max(),
- p_pos=prob_dist.prob('pos'),
- p_neg=prob_dist.prob("neg")
- )
Notes
1–4: TextBlob’s release in 2013 entered into a fruitful space for sentiment analysis.
The rising wave of Web 2.0 and social media more generally encouraged internet users
to enter unprecedented amounts of self-authored text into their machines, in turn
finally providing data sets at the scale required for meaningful analysis. Development
on TextBlob was at first fast and dense, although its pace slowed over time. Loria
added sentiment analysis to TextBlob in version 0.5.0, published
on 10 August 2013. According to the changelog on GitHub, where TextBlob’s code is
hosted, en/sentiments.py was most recently updated on 2 December 2017.
5–13: Python programs customarily include dependencies, or external programs which
files require in order to operate, as “imports” at the beginning of code. Notably,
line 8 imports NLTK; line 10 imports a function called sentiment from the file en/__init__.py,
a lightly revised version of Pattern’s sentiment analysis implementation; and line
13 imports wrapper functions that define the basic form of both PatternAnalyzer and
NaiveBayesAnalyzer. As noted in the essay, this reliance on Pattern’s sentiment analysis
implementation, while labor-saving on the level of programming, means that TextBlob
simply copies wholesale Pattern’s approach, which relies on the linear calculation
of sentiment from a pre-scored corpus.
16: The remainder of en/sentiments.py defines two Python classes, one for each implementation.
Loria begins with PatternAnalyzer, most likely due to both its conceptual and computational
simplicity. NaiveBayesAnalyzer, by contrast, seems relegated to a secondary or experimental
role.
28: Loria includes an if/else branch in PatternAnalyzer to handle whether the user
optionally wants to return “assessments,” or the major lexical criteria upon which
the function determines a sentence’s polarity. Obscuring the assessments by default
produces a cleaner, more “objective” reading on the command line at the expense of
the more verbose output that would articulate the program’s reasoning. Again, a reasonable
choice from a user experience perspective, albeit one that encourages further interpretation
of the machine’s acts as somehow “neutral.”
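As a gesture toward that more verbose output, the assessments can be recovered by calling the analyzer directly with the flag defined here; a minimal sketch, assuming the import path given in TextBlob's documentation:
>>> from textblob.sentiments import PatternAnalyzer
>>> result = PatternAnalyzer().analyze("I love those who hate me", keep_assessments=True)
>>> result.polarity
-0.15
>>> result.assessments   # the scored tokens behind that figure ('love' and 'hate'), each with its polarity and subjectivity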
36, 41: In both the if and else parts of this function, the actual work of computing
polarity occurs in pattern_sentiment(text), which applies Pattern’s sentiment analysis
function, imported in line 10, to the given input text. While not included directly
in en/sentiments.py, this function averages together assigned polarity scores drawn
from a lexicon file. As noted in the essay, these polarity definitions were manually
generated by De Smedt and Daelemans’ teams in the process of constructing the dataset.
We can only speculate on the annotators’ exact identities, but this workflow more generally,
which was customary in natural language processing work at the time, reminds us how
ultimately judgment is a capacity of human brains, which must be in turn extracted,
modeled, and mediated by the program.
44: In preparation for NaiveBayesAnalyzer, Loria includes a brief definition of a
feature extractor. In NLP, a feature extractor parses input for major “features,”
or statistically significant lexical data. Again, simplification and compression permit
Loria to perform this computational work in such a relatively parsimonious package.
In particular, Loria’s approach here eliminates contextual relations between words,
information that later generations of sentiment analysis have attempted to consider
in more depth. (For more on this, see footnote no. 6).
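Given the definition reproduced in the code listing above, the extractor's output is easy to picture; a brief sketch of the bag-of-words features it produces (the sample words are my own):
>>> def _default_feature_extractor(words):
...     return dict(((word, True) for word in words))
...
>>> _default_feature_extractor(["dull", "lifeless", "film"])
{'dull': True, 'lifeless': True, 'film': True}
# Each word becomes a bare "present" flag; order, repetition, and context vanish.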
50: NaiveBayesAnalyzer uses machine learning functions to train its analysis on a
dataset of movie reviews included as an example corpus in the NLTK code base. This
corpus comprises two thousand movie reviews, half tagged positive and half tagged negative,
assembled in 2004 by computer scientists Bo Pang and Lillian Lee as a tool for sentiment
analysis projects. The corpus and its associated research papers are available at
https://www.cs.cornell.edu/people/pabo/movie-review-data/. These reviews are imported in line 12. Movie reviews, alongside product reviews,
are typical source corpora for sentiment analysis implementations, particularly in
the time of TextBlob’s most intensive development. They have the advantage of being
both strongly opinionated by definition and often accompanied by a numerical
value. As such, they offer a sort of “readymade” corpus for the work of sentiment
analysis. The catch, of course, is that, as I discuss in the body of the essay, sentiment
analysis not only mediates judgment, but also the form of those judgments. This makes tools derivative of the Pang and Lee corpus perhaps
effective at ascertaining the sentiment of textual inputs that bear more than a
passing similarity to movie reviews, but less effective at recognizing sentiment encoded in other
kinds of linguistic forms.
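For readers who wish to examine the corpus directly, it ships with NLTK's downloadable data and can be inspected in a few lines (assuming NLTK is installed; the download is a one-time operation):
>>> import nltk
>>> nltk.download('movie_reviews')   # fetches the Pang and Lee polarity corpus
>>> from nltk.corpus import movie_reviews
>>> len(movie_reviews.fileids('pos')), len(movie_reviews.fileids('neg'))
(1000, 1000)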
61: Defines a subsequent feature extractor using the form previously defined in line
44. Most users will not vary their usage beyond PatternAnalyzer, making NaiveBayesAnalyzer
more of an easter egg for those willing to read the documentation. This is a shame,
given that I view TextBlob’s most valuable contribution to the pedagogical space around
sentiment analysis as precisely the ease with which users can juxtapose the outputs of
these two distinct implementations, and thereby understand the contingency of the
machine’s judgments. One might imagine a version of TextBlob that puts notions of
machinic transparency first: one that both foregrounds the assessments (easily editable
in line 28) and offers users a choice between PatternAnalyzer and NaiveBayesAnalyzer.
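A few lines suggest what such a juxtaposition might look like in practice; this is a sketch only, using the import paths given in TextBlob's documentation, not a proposal for the library itself:
from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer

text = "I love those who hate me"

# PatternAnalyzer is the default; NaiveBayesAnalyzer must be requested explicitly.
pattern_blob = TextBlob(text)
bayes_blob = TextBlob(text, analyzer=NaiveBayesAnalyzer())

# Placing the two judgments side by side foregrounds their contingency.
print("Pattern:", pattern_blob.sentiment)
print("Naive Bayes:", bayes_blob.sentiment)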
66: Indicates that this function requires the user to download the NLTK corpus to
their computer in order to run. The corpus is a small file all told, but perhaps this
requirement is what made Loria shy away from making NaiveBayesAnalyzer a more readily
available option. The download requirement places a small but meaningful load on the
end user to have a working internet connection, introducing yet another dependency
to the program’s operations — in this case, external networking technologies. Given
that the internet’s servers and the electrical grid more generally are already required
to download and install TextBlob in the first place, this might seem an unnecessary
point of contention; but one can understand how Loria would shy away from design choices
that would take TextBlob further from an all-in-one approach.
67–77: NaiveBayesAnalyzer comprises two major functions: a training and an analysis
function. This first training function identifies which reviews in the Pang and Lee
corpus are tagged positive or negative; extracts key textual features from each; combines
them into a single variable named train_data; and then uses NLTK’s built-in algorithms
to train a classifier. One might reasonably ask whether such an emphasis on positive
and negative polarity is a useful heuristic for these reviews; what, for example,
are we to make of mixed reviews, or of ones that are neither strongly positive nor negative?
This is a moment of disjuncture between the initial goals of Pang and Lee and similar
natural language processing researchers and these programs’ implementations in, say,
consumer-facing technologies. Reading Pang and Lee’s original paper, one comes to
understand their project less as “attempting to design a movie-review-reader program”
and more “attempting to solve specific problems in the computational evaluation of
language, with movie reviews providing an effective starting point.” Indeed, there
are several issues with the original approach that they themselves identify; one,
for instance, is how to handle reviews that “turn,” so to speak, that pile up negative
language only to reveal at the end that the reviewer loved the film. (A not-unfamiliar
move to those of us who love “bad” films.) For Pang and Lee, these are computational
problems worthy of further investigation. However, derivative work — work for which
Pang and Lee are dependencies, to use the language we have developed in this essay
— strays from this initial pure-research vision. Even TextBlob, which still lingers
at the edges of academic research, contains, as we have shown, many design decisions
intended to emphasize the seeming neutrality and objectivity of its judgments. Tracing
these histories through an analysis of the program’s code reveals the ease with which
subjectivity slips into objectivity in the case of sentiment analysis.
79–93: These lines define NaiveBayesAnalyzer’s analysis function, where the sentiment
analysis work actually happens. Lines 85–87 break down the text into individual features,
which are then classified using a probability distribution function in line 88. Lines
89–92 return the results as a named tuple, using the schema defined in line 52. Line
90 is particularly interesting in that it omits words of fewer than three characters.
Loria is making a design choice here. Eliminating short words reduces the computational
overhead, which for Naïve Bayes-based methods can be substantial, while theoretically
leaving untouched all the more “meaningful” words. However, as work in distant reading
has demonstrated, it is precisely in the shortest words of the English language —
the articles, the conjunctions — that meaning is often expressed. Again, this returns
us to the problem of context in sentiment analysis: retaining these shorter words, which
are often the glue holding together syntax, could allow for a more sophisticated reading
of how meaning is constructed in relation. However, given TextBlob’s stated design
purpose of being a simple, off-the-shelf tool for rapid sentiment analysis work, it’s
understandable that Loria would eschew more complex approaches — some of which simply
were unavailable to him in 2013 when he originally designed this program — in favor
of more rapid response.
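A quick sketch shows what the filter discards in practice; the list comprehension below mirrors the generator expression in line 90:
>>> tokens = ["I", "do", "not", "like", "it", "at", "all"]
>>> [t.lower() for t in tokens if len(t) >= 3]
['not', 'like', 'all']
# The negation happens to survive here, but "I", "do", "it", and "at" are silently dropped.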
Acknowledgments
I would like to thank Mark Marino and Jeremy Douglass for their perceptive comments
that have helped shape and expand this essay over its writing, as well as the anonymous
reviewers for DHQ. Much of this essay was originally written during a fellowship with the Maryland
Institute for Technology in the Humanities in the fall of 2020; I am particularly
grateful to Ed Summers for his patience and generosity in teaching me the fundamentals
of Python so that I might engage this project. Thanks are due also to Kari Kraus for
suggesting TextBlob as a useful object for critical research on sentiment analysis,
and to Alice Bi for being a thoughtful and generative interlocutor on the question
of LLMs’ relationship to sentiment analysis more generally.
Notes
[1] I am here paraphrasing Kathleen Fitzpatrick’s famous definition of the digital
humanities from her 2010 blog post on the Chronicle of Higher Education’s now-defunct ProfHacker blog. [Fitzpatrick 2012].
[2] I include the leading en/ to distinguish from another file in the TextBlob code
base named sentiments.py.
[3] For the code base of Pattern, see [De Smedt and Daelemans 2012]. For NLTK, see [Bird et al 2009].
[4] “Sentiment,” “affect,” “emotion,” and “mood” are often used interchangeably in
sentiment analysis research, even as some scholars such as Liu seek to disambiguate
them. For the purposes of my study, I define “sentiment” as an imputed judgment by
sentiment analysis software about a given input — a polarity score, for instance.
“Affect,” by contrast, is a theoretical aspect of human biopsychology that gives rise
to emotions and moods. One might then use calculated sentiment to develop an affective
schematic model, for instance.
[5] Here I am also thinking with N. Katherine Hayles’ work on “cognitive assemblages,”
which she envisions as intelligences that exceed the human, animal, or machine. See
[Hayles 2017].
[6] Subsequent and more sophisticated work in sentiment analysis has attempted to
resolve the challenges of this single-word model with deeper considerations of the
contexts between and across words. One such technique is Bidirectional Encoder Representations
from Transformers (BERT), created in 2018 by researchers at Google. BERT-based sentiment
analysis techniques take into consideration the words both before and after individual
words when assigning values, and as such have more potential for both sentiment analysis
and language prediction. BERT and derivative technologies are also foundational to
later work on LLMs, which follow from but are conceptually and materially distinct
from sentiment analysis. The downside of BERT is that it is computationally much more
intensive than simpler tools such as TextBlob or even NLTK. Sheer processing power
is easier to come by with each passing year — although it remains an open question
as of this writing whether or not the “more power = better data” approach currently
being tested by major tech companies such as OpenAI and Microsoft will pay off in
anything other than ecosystem devastation — but still serves as a material limiting
factor. See Devlin et al. (2019) for further consideration of BERT.
[7] This .xml file has been imported wholesale, without any changes, from Pattern.
This leads to some curious inconsistencies in the TextBlob code base: a header in
the .xml file, for instance, claims that the reliability score, which specifies whether
a value was hand-tagged or computationally inferred, takes either the value of 1.0
(for the former) or 0.7 (for the latter). However, many entries carry the unexplained
value 0.9. Subjectivity scores, which putatively name where a word falls on a subjective/objective
axis, are calculated in TextBlob but not printed to the command line.
[8] Bayes’ Theorem takes the form P(A|B) = P(B|A)P(A) / P(B). In plain English, we
might state this as “the probability of event A occurring given the truth of event
B is equal to the probability of event B occurring given the truth of event A, multiplied
by the probability of event A occurring in itself, all divided by the probability
of event B occurring in itself.” Essentially, Bayes’ Theorem provides an approach
to defining the conditional probability of an event’s occurrence. Bayes’ Theorem serves as a basis for many NLP
tasks and classifier functions given its computational simplicity relative to the
accuracy of its results.
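A toy calculation, with probabilities invented purely for illustration, makes the mechanics concrete. Suppose the word "dull" appears in 40% of negative reviews and 5% of positive ones, and that the two classes are equally common:
# Hypothetical figures for illustration only.
p_neg, p_pos = 0.5, 0.5              # priors: negative and positive reviews equally likely
p_dull_given_neg = 0.40              # "dull" appears in 40% of negative reviews
p_dull_given_pos = 0.05              # ...and in 5% of positive reviews

p_dull = p_dull_given_neg * p_neg + p_dull_given_pos * p_pos
p_neg_given_dull = (p_dull_given_neg * p_neg) / p_dull
print(round(p_neg_given_dull, 3))    # 0.889: one word shifts the verdict sharply toward "negative"
A naïve Bayes classifier such as NaiveBayesAnalyzer multiplies such per-word likelihoods across all of a text's (sufficiently long) words to arrive at its final classification.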
[9] Here I am thinking also with Wendy Hui Kyong Chun’s work in her 2011 monograph
Programmed Visions: Software and Memory on the cybernetic project of establishing precisely this equivalence. For Chun, early
cybernetic research on machine intelligence that took for its putative model the human
brain ineluctably doubled back, such that the machine became the primary operative metaphor
through which cognitive science came to understand the brain. See [Chun 2011].
Works Cited
Benjamin 2019 Benjamin, R. (2019) Race after technology: Abolitionist tools for the New Jim Code. Polity Press.
Bird et al 2009 Bird, S., Loper, E., and Klein, E. (2009) Natural language processing with Python. Cambridge: O’Reilly Media.
Chun 2011 Chun, W. H. K. (2011) Programmed visions: Software and memory. Cambridge: MIT Press.
De Smedt and Daelemans 2012 De Smedt, T, and Daelemans, W. (2012) “Pattern for Python”, Journal of Machine Learning Research 13, pp. 2063–67.
Fitzpatrick 2012 Fitzpatrick, K. (2012) “The humanities, done digitally”, Debates in the Digital Humanities, ed. Matthew K. Gold. Minneapolis: University of Minnesota Press, pp. 12–15.
Hayles 2017 Hayles, N. K. (2017) Unthought: The power of the cognitive nonconscious. Chicago: University of Chicago Press.
Hayles 2018 Hayles, N. K. (2018) “Human and machine cultures of reading: A cognitive-assemblage approach”, PMLA 133.5, pp. 1225–42.
Jockers 2014 Jockers, M. (2014) “syuzhet”. Github.com. https://github.com/mjockers/syuzhet. (Accessed 22 Oct 2025).
Liu 2015 Liu, B. (2015) Sentiment analysis: Mining opinions, sentiments, and emotions. New York: Cambridge University Press.
Loria 2020 Loria, S. (2020) “TextBlob: Simplified text processing, TextBlob 0.16.0 documentation”. https://textblob.readthedocs.io/en/dev/. (Accessed 1 May 2020).
Manning et al 2008 Manning, C. D, Raghavan, P. and Schütze, H. (2008) Introduction to information retrieval. New York: Cambridge University Press.
Matviyenko 2015 Matviyenko, S. (2015) “On governance, blackboxing, measure, body, affect, and apps: A conversation with Patricia Ticineto Clough and Alexander R. Galloway”, The Fibreculture Journal, no. 25.
Noble 2018 Noble, S. U. (2018) Algorithms of oppression: How search engines reinforce racism. NYU Press.
Pang and Lee 2005 Pang, B. and Lee, L. (2005) “Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales”, Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL ’05). Ann Arbor: Association for Computational Linguistics, pp. 115–24.
Sedgwick and Frank 1995 Sedgwick, E. K. and Frank, A. (1995) “Shame in the cybernetic fold: Reading Silvan Tomkins”, Critical Inquiry 21.2, pp. 496–522.
Stone 1966 Stone, P. J. (1966) The general inquirer: A computer approach to content analysis. Cambridge: MIT Press.
Tomkins 2008 Tomkins, S. S. (2008) Affect imagery consciousness: The complete edition. New York: Springer Publishing.
Turing 1950 Turing, A. (1950) “Computing machinery and intelligence”, Mind 59.236, pp. 433–60.



