In July 2015, the publication of Go Set a Watchman rekindled global interest in Harper Lee. As her only book-length publication apart from To Kill a Mockingbird, the novel made headlines all over the world and sparked controversy as Alabama police investigated charges of elder abuse toward Lee (eighty-eight years old at the time of publication). Leaving aside the artistic merits of the book, which many have questioned, it is fascinating to see how the story surrounding Lee’s writing of the novels and the validation of their authorship took on a life of its own. Two years ago, I was given the opportunity to be part of a research team conducting a literary investigation of the authorship of the two books.

Harper Lee’s To Kill a Mockingbird and Go Set a Watchman are an odd couple. The latter book is the first version of the former, but it reads like a sequel in terms of plot timeline. After Lee submitted her manuscript to J. B. Lippincott in 1957, Tay Hohoff, a cat-loving, chain-smoking editor, famously said that the novel looked more like a “series of anecdotes than a fully conceived novel.” After a painful process of revision, the manuscript became Go Set a Watchman, which subsequently metamorphosed into the To Kill a Mockingbird we all know today. Yet, when the book became a blockbuster, rumors surfaced that Lee had relied heavily on the guidance of her editor, as well as that of her childhood friend Truman Capote, by that time already an experienced writer and the model for a character in the final version of the novel. Even now, as argued by Charles Shields, “fifty years after To Kill a Mockingbird appeared, the rumor persists that Nelle Harper Lee didn’t write the novel herself. Truman Capote, so goes the whisper campaign, wrote large portions—or maybe all of it.” Capote never openly denied these speculations; interestingly, he and Lee parted ways when he failed to properly acknowledge her contributions to the manuscript of In Cold Blood. The recent publication of Go Set a Watchman has created the opportunity to put this speculation to rest with stylometric authorship attribution.

In our latest study (an early version of which was conducted by Eder and Rybicki for the Wall Street Journal), we visualized the patterns of similarity and difference between Harper Lee and the more or less plausible rivals for the authorship of her bestselling novel, with several other authors of the American South included for background. The diagram below, a bootstrap consensus network, shows the strength of similarity between texts by those writers. The closer two texts are, and the thicker the line that joins them, the more they resemble each other.

Fig. 1. Bootstrap consensus network with Harper Lee’s novels next to other authors of the American South. Reproduced by permission of the Mississippi Quarterly.

Given the strength of the connection between Lee’s Watchman and Mockingbird (the thick light blue line), we have absolutely no doubt that the two texts were authored by the same person. Neither Tay Hohoff, with her Cats and Other People and A Ministry to Man, nor Truman Capote, with his various works, comes close to Lee. This visualization also shows a very frequent result in this type of “reading by counting”—quite often, an author’s stated preference for another is reflected in their relative similarity. This is the case with Lee and one of her favorite writers, Eudora Welty.
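
For readers curious about what “strength of similarity” can mean in practice, here is a minimal sketch in Python. It is not the procedure used in the study: it assumes, purely for illustration, that each text is described by the relative frequencies of the most frequent words in the corpus and that texts are compared with a Burrows's Delta style distance (mean absolute difference of z-scored frequencies). The function names, the `n_words` parameter, and the toy snippets are all hypothetical.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def pairwise_delta(texts, n_words=50):
    """Return {(name_a, name_b): distance} for every pair of texts.

    Assumption for this sketch: distance is the mean absolute difference
    of z-scored relative frequencies of the corpus's most frequent words.
    """
    tokens = {name: tokenize(text) for name, text in texts.items()}
    corpus = Counter(w for toks in tokens.values() for w in toks)
    vocab = [w for w, _ in corpus.most_common(n_words)]

    # Relative frequency of each vocabulary word in each text.
    freqs = {}
    for name, toks in tokens.items():
        counts = Counter(toks)
        freqs[name] = [counts[w] / len(toks) for w in vocab]

    # Standardize each word's frequency across the texts (z-scores).
    names = list(texts)
    z = {name: [] for name in names}
    for i in range(len(vocab)):
        column = [freqs[name][i] for name in names]
        mu, sigma = mean(column), pstdev(column)
        for name in names:
            z[name].append((freqs[name][i] - mu) / sigma if sigma else 0.0)

    # Smaller distance = more similar = thicker line in a network diagram.
    distances = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            distances[(a, b)] = mean(abs(x - y) for x, y in zip(z[a], z[b]))
    return distances


# Toy placeholder snippets, not real excerpts from any of the novels.
toy_texts = {
    "text_one": "the town was old and the streets were tired",
    "text_two": "the town was tired and the streets were old",
    "text_three": "a cat sat by a fire while a kettle sang",
}
for pair, distance in sorted(pairwise_delta(toy_texts).items(), key=lambda kv: kv[1]):
    print(pair, round(distance, 3))
```

In a real analysis the inputs would be full novels and the word list would run to hundreds or thousands of items; the resulting distances would then be turned into the links of a network like the one in Fig. 1.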

Stylometry also makes it possible to look for authorial traces within a single work authored or edited by more than one hand. Maciej Eder’s “rolling classify” technique tests the authorial signal of consecutive samples of a novel against that of individually written texts by the candidate authors.
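
The general idea of classifying consecutive samples can be sketched in a few lines of Python. This is an illustration only, not Eder's implementation: a simple nearest-profile rule (smallest sum of absolute differences in word frequencies) stands in for whatever classifier the study used, and the function names, the `window` and `step` parameters, the word list, and the toy texts are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def profile(tokens, vocab):
    """Relative frequency of each vocabulary word in a list of tokens."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in vocab]


def rolling_classify(target, candidates, vocab, window=5000, step=1000):
    """Label consecutive, overlapping windows of `target` with the
    candidate author whose reference profile is nearest.

    Returns a list of (start_token_index, author_name) pairs.
    """
    refs = {name: profile(tokenize(text), vocab)
            for name, text in candidates.items()}
    tokens = tokenize(target)
    labels = []
    last_start = max(len(tokens) - window, 0)
    for start in range(0, last_start + 1, step):
        sample = profile(tokens[start:start + window], vocab)
        nearest = min(
            refs,
            key=lambda name: sum(abs(s - r) for s, r in zip(sample, refs[name])),
        )
        labels.append((start, nearest))
    return labels


# Toy placeholder data: two made-up "authors" with different word habits.
toy_candidates = {
    "Author A": "the the the the of",
    "Author B": "and and and and but",
}
toy_vocab = ["the", "of", "and", "but"]
toy_target = "the the of the " * 3 + "and but and and " * 3

for start, author in rolling_classify(toy_target, toy_candidates, toy_vocab,
                                      window=4, step=4):
    print(start, author)
```

Plotting the labels against their starting positions gives a strip along the text in which each stretch is colored by its most likely author, which is the kind of picture shown in Fig. 2.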

Fig. 2. Rolling Classify procedure on Harper Lee’s Mockingbird. Reproduced by permission of the Mississippi Quarterly.

In the diagram above, three authorial signals are set against one another along the course of the story in To Kill a Mockingbird. The horizontal axis represents the unfolding text and its consecutive chapters. While the overall authorship of both books leaves no doubt, as the previous analysis shows, the authorial fingerprints in particular sections of the book seem to belong either to Harper Lee (in green), who dominates most of the text, or, in some fragments, to Tay Hohoff (in blue). This mixture is a natural effect of the two women’s two years of collaboration, but the presence of Truman Capote’s signal (in red) in the opening chapters of To Kill a Mockingbird is somewhat surprising. Could this residue be the outcome of their shared narrative of their upbringing in Monroeville, Alabama?

Did Capote indeed help his younger colleague a bit with the very opening sections of the book? This stylometric study encourages us to give more thought to Capote’s role in the early stages of Lee’s writing endeavor. Of course, the three writers concerned are no longer among us, so we will never know for sure what transpired. With stylometric analysis, some speculations are put to rest, while others are born.


This text has been adapted from a blog entry by Michał Choiński written for Louisiana State University Press.

More about the stylometric study of Harper Lee:

Choiński, M., Eder, M. and Rybicki, J. (2019). Harper Lee and other people: a stylometric diagnosis. Mississippi Quarterly, 70/71(3): 355–374.

Eder, M. and Rybicki, J. (2016). Go set a watchman while we kill the mockingbird in cold blood, with cats and other people. Digital Humanities 2016: Conference Abstracts. Kraków: Jagiellonian University & Pedagogical University, pp. 184–186, http://dh2016.adho.org/abstracts/70.