Having worked on two topic modeling projects this year (a final project for Clio I and a topic modeling of the THATCamp proceedings undertaken by the Digital History Fellows at CHNM), I walked into this grouping of readings with relative confidence.
That confidence quickly deflated when I was confronted with how little I truly understood when I completed those projects. This week, Ted Underwood, Lauren F. Klein, David M. Blei, Andrew Goldstone, Elijah Meeks, Lisa Rhody, and Ben Schmidt exposed me to the possibilities and pitfalls of using LDA topic modeling. Given what I learned, I have a real inclination to revisit those projects at another time.
There were a few areas in particular that drew my attention and raised interesting questions for me regarding the analysis and interpretation of the results that topic modeling produces.
A fundamental concern relates to the fact that topic modeling analyzes texts by counting word tokens and grouping them into topics. This word-based analysis deserves careful consideration. Rhody and the others emphasize the value of LDA topic modeling as “revealing patterns and relationships that might otherwise have remained hidden.” A great benefit indeed for historians interested in new approaches. However, a considerable barrier to the application of digital tools is, and will continue to be, a lack of understanding of the results.
How effective is this process if the conclusions you draw “will be limited to those who understand how topic modeling works” (Schmidt)? How do non-digital scholars make sense of topic modeling? And, as researchers, what other pitfalls lie hidden in the data, overlooked because we don’t know what we don’t know?
This week Amanda shared with us a link to Goldstone and Underwood’s PMLA research data (available here), and two thoughts struck me as I looked through it: one, this looks beautiful, and two, what does any of this mean? For me this highlights the difficulty of communicating the work of digital tools like topic modeling to non-digital historians. Distantly reading a large corpus of documents is an impressive feat, but I’m not convinced of its efficacy in the face of concerns raised across the readings about whether we even understand the “topics” we produce.
Topic modeling does not do away with the interpretation and close reading associated with traditional historical research, but machine reading still causes me some discomfort. The privileging of the text and the stripping of context make me nervous, and I echo the concerns we read about word meanings and changing terminologies.
In her explanation of the process of topic modeling, Rhody uses the analogy of a farmers’ market to describe the computational processes that occur within the program. The machine simply reads the contents of the baskets and identifies patterns. The process seems simple when you describe how “a pear is put in the basket with other pears,” but I found myself wondering about instances when the fruit looks like a pear but is decidedly different: a pear-apple, or a green apple, or a watermelon.
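To make the basket-sorting concrete for myself, here is a toy sketch of one common way this kind of model is actually fit: a bare-bones collapsed Gibbs sampler run over a handful of invented "documents." The documents, the topic count, and the hyperparameters are all my own illustrative choices, not anything from the readings, and real tools like MALLET do this at far greater scale and care:

```python
import random
from collections import defaultdict

random.seed(0)

# Invented toy "documents"; a real corpus would be vastly larger.
docs = [
    "pear apple fruit basket pear".split(),
    "church congregation clergy reverend".split(),
    "apple fruit market basket".split(),
    "deaf church community congregation".split(),
]

K = 2                    # number of topics (my choice)
alpha, beta = 0.1, 0.01  # smoothing hyperparameters (my choice)
vocab = sorted({w for d in docs for w in d})
V = len(vocab)

ndk = [[0] * K for _ in docs]               # document -> topic counts
nkw = [defaultdict(int) for _ in range(K)]  # topic -> word counts
nk = [0] * K                                # tokens assigned to each topic
z = []                                      # topic assignment per token

# Random initialization: drop every token into a random basket.
for d, doc in enumerate(docs):
    zs = []
    for w in doc:
        t = random.randrange(K)
        zs.append(t)
        ndk[d][t] += 1
        nkw[t][w] += 1
        nk[t] += 1
    z.append(zs)

# Collapsed Gibbs sampling: repeatedly resample each token's topic,
# favoring topics its document already uses and topics its word already fills.
for _ in range(200):
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]
            ndk[d][t] -= 1; nkw[t][w] -= 1; nk[t] -= 1
            weights = [
                (ndk[d][k] + alpha) * (nkw[k][w] + beta) / (nk[k] + V * beta)
                for k in range(K)
            ]
            t = random.choices(range(K), weights=weights)[0]
            z[d][i] = t
            ndk[d][t] += 1; nkw[t][w] += 1; nk[t] += 1

# The resulting "baskets": the most frequent words in each topic.
for k in range(K):
    top = sorted(nkw[k], key=nkw[k].get, reverse=True)[:3]
    print(f"topic {k}: {top}")
```

The sampler has no idea what a pear or a church is; it only sees which words keep turning up together, which is exactly why a pear-apple, a word that co-occurs with two different crowds, can end up in either basket from run to run.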
Any researcher knows that information can be categorized in different ways with different meanings. Information about a single deaf church (if you share my research interests) involves a relationship to a constellation of things: the founding reverend, members of the clergy, members of the congregation, religious beliefs, local churches, deaf organizations, and the wider city deaf community and its members. I can glean a great deal about these things through their relationships with one another. One would hope that a machine reading of historical documents would produce topics that highlight these relationships, or at least reproduce them in a meaningful way.
The problem I face, however, is that the term “deaf” is not static. It is not used only to refer to deaf people, nor is it the only term that has historically referred to deaf people.
Seeking this information involves wading through countless instances where the term is used metaphorically (“deaf to their pleas”). It also involves searching for terms like “deaf-and-dumb,” “deaf-mute,” “speechless,” and “mute” with an awareness of time period and context. In the graph above I’ve used Chronicle (a tool that examines language use in New York Times reporting since 1850) to demonstrate how these terms have changed in popularity over time and how the term “deaf” appears much more frequently.
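The kind of counting behind a Chronicle-style graph can be sketched in a few lines. The years and sentences below are invented stand-ins for newspaper text, not actual Chronicle data, and the term list is just the one from my own research problem:

```python
from collections import Counter

# Hypothetical snippets standing in for newspaper articles.
articles = [
    (1860, "the deaf-and-dumb asylum opened its doors"),
    (1860, "the jury was deaf to their pleas"),
    (1900, "a deaf-mute congregation gathered at the church"),
    (1950, "the deaf community organized a new social club"),
]

terms = ["deaf", "deaf-mute", "deaf-and-dumb", "mute", "speechless"]

# Count exact whitespace-delimited tokens so "deaf-mute" is not
# also tallied as a hit for "deaf".
counts = {t: Counter() for t in terms}
for year, text in articles:
    for token in text.lower().split():
        if token in counts:
            counts[token][year] += 1

for t in terms:
    print(t, dict(counts[t]))
```

Note what the tally cannot see: “deaf to their pleas” is counted under “deaf” exactly like a reference to a deaf person, which is the metaphor problem in miniature. Raw frequency counts carry no awareness of usage.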
These types of problems led Underwood to focus on topics rather than words; in this sense, words are given context. Schmidt remained critical of topics as well, demonstrating how terms clustered together in a single topic may in fact operate in separate directions.
Still, despite the skepticism I sometimes feel, there is something meaningful in the findings that topic modeling and text mining projects produce. And I can’t overlook the way in which topic modeling enables us to interrogate a corpus with informed questions.
In our meeting, Stephen emphasized that a potential strength of the digital is that we have to make transparent what we did to arrive at our conclusions. Our processes and practices are made much more obvious, and come under greater scrutiny, because they involve new and complex techniques. The need to “open the black box” is not just about explaining these processes; it also makes it possible to discuss methodology in the context of making an argument. Traditional historians don’t always describe their work in this manner, despite the fact that (as we read two weeks ago) the majority of historians have already changed their own processes to include computational techniques like “search.”
As historians we are meant to be questioning, and aware of, our own assumptions. Topic modeling is another way of highlighting what those assumptions may be.