In 1997, LSA Professor of English Eric Rabkin was invited to a Michigan seminar to discuss the emerging field of complex systems, a method of study that allows researchers from a variety of disciplines to use advanced mathematics and computer modeling to solve difficult, dynamic problems. The work covers everything from the growth of terrorist networks to the spread of disease. Complex system modeling was being applied to biology, epidemiology, computer network design, and economic decision modeling, but not yet, as far as Rabkin could tell, to any areas of cultural research. Art. Anthropology. And, yes, English.

Could the same models that follow and predict the spread of the influenza virus also track the evolution of literature, he wondered?

At that seminar, Rabkin met LSA Professor of Economics and Mathematics Carl Simon, one of the experts on complex adaptive systems at U-M. “After the meeting, he came up to me and said he had some idea of how this might work in studying literature,” Simon recalls.

Tentative though the ideas may have been, they were enough to launch what would eventually become the Genre Evolution Project (GEP), a collaboration to use advanced mathematics and computing to understand the cultural impact of science fiction.

Now, after 14 years, nine presentations, seven published papers, and more still in the pipeline, the largest portion of the GEP is complete. It’s now possible to say whether Hugo and Nebula award winner Ursula K. Le Guin was a pioneer for women in science fiction, or part of a general trend. Or to say whether one’s publication odds for a science fiction short story are better if the story is about aliens or medical technology.

But how did this happen? And how does one actually code stories?

The first meeting of the GEP was held in Angel Hall in the winter of 1998. “The first exciting piece was the understanding that maybe we could link literary genre and complexity,” says Simon. “The second piece was having the students build the structure.”

Linking literary genre and complexity was something Rabkin was all too ready to try. Through his own research, Rabkin had begun to suspect that just studying the masterworks was too narrow to understand literature’s cultural impact. Complex systems appeared to offer a way to view a genre holistically. “This always had been clear to me: the traditional approaches of literary criticism are incomplete,” says Rabkin. “Everything that functions in human culture functions in a larger context. To be able to answer even so simple a question as, ‘is this a good book?’ is improved if you can find a way of looking at that larger context.”

Together, Simon and Rabkin came up with the idea of treating every story in a given literary genre like a pseudo-organism and the publication process like biological evolution. Each story is treated as a very complex organism consisting of traits ranging from the simple, such as publication date, number of characters, and presence of space travel, to the more subjective, such as complexity of the main character and theme development. If the story is well suited to the environment of editors and readers, the story gets reprinted. “If a story is published in a science fiction magazine, then a year later it is in an anthology, then three years later it is in another...people keep thinking, ‘yes, this is a story we want to bring before the public.’ That’s a measure of evolutionary successes,” says Rabkin.

“It’s sort of like having a DNA string for these stories,” says Simon.
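For readers wondering what such a "DNA string" might look like in practice, here is a minimal sketch in Python. It is not the GEP's actual database schema: the `Story` class, its field names, the `fitness` property, and the four placeholder stories are all hypothetical, chosen only to mirror the traits named above (publication date, number of characters, presence of space travel) and the idea that reprints measure evolutionary success.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Story:
    """One story treated as an 'organism' with a fixed set of traits.

    All field names are illustrative assumptions, not the GEP's real schema.
    """
    title: str
    year: int                      # first publication date
    num_characters: int
    has_space_travel: bool
    reprint_years: tuple = ()      # years the story was reprinted or anthologized

    @property
    def fitness(self) -> int:
        # Assumption: evolutionary success is approximated by the reprint count.
        return len(self.reprint_years)


def mean_fitness_by_trait(stories, trait):
    """Group stories by the value of one trait and average their fitness."""
    groups = {}
    for story in stories:
        groups.setdefault(getattr(story, trait), []).append(story.fitness)
    return {value: mean(counts) for value, counts in groups.items()}


# Made-up placeholder records, not real GEP data.
stories = [
    Story("Story A", 1950, 3, True, (1951, 1954)),
    Story("Story B", 1950, 5, False, ()),
    Story("Story C", 1962, 2, True, (1965,)),
    Story("Story D", 1971, 4, False, (1972,)),
]

result = mean_fitness_by_trait(stories, "has_space_travel")
print(result)
```

With records like these, a question such as "do stories with space travel get reprinted more often?" becomes a comparison of group averages rather than a matter of impression.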

Covers: 1984 and Dune courtesy of The Penguin Group; The Hitchhiker's Guide to the Galaxy courtesy of Simon & Schuster.

Student volunteers read the stories in pairs each week, then broke each story down into key characteristics. The results were then compiled in a database for analysis. As of this year, student research assistants have read and analyzed almost 3,000 American science fiction short stories written between 1926 and 1999 and entered them in the database.

Unlike most other research projects, the students doing the legwork had a strong influence on the project’s definitions and goals. “To a large extent, the students set the ground rules,” says Simon. “They set the categories, set the definitions, chose the boundaries.”

The approach wasn’t without its drawbacks. Not every story fit neatly into the chosen criteria. H.G. Wells’ The Time Machine, for example, could be considered a story about both time travel and exploration.

“Inter-coder reliability is difficult,” explains Simon. “If two smart people would read the same story, would they come to the same conclusions? I think the answer is: not always. Hopefully enough stories make up for that. We kept refining the definitions to make them sharper.”
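One common way to put a number on the problem Simon describes is Cohen's kappa, which compares how often two coders agree with how often they would agree by chance. The article does not say which reliability measure, if any, the GEP used, so the sketch below is only an illustration: the function name, the category labels, and the four sample codings are assumptions.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labeling the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of stories")
    n = len(coder_a)

    # Fraction of stories on which the two coders chose the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Agreement expected by chance, from each coder's own label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codings of four stories by two readers.
coder_a = ["time travel", "exploration", "aliens", "aliens"]
coder_b = ["time travel", "time travel", "aliens", "aliens"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. Sharper category definitions of the kind Simon mentions would be expected to push the value upward.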

While they worked, the group had to deflect criticism for in-depth study of a genre many in the academy felt was too low-brow. Rabkin, however, knew the value, having published a book and several papers on science fiction before the start of the project. He knew how deeply ingrained the genre is in American culture. “Science fiction is something that not only produces stories, but also turns out to be the underlying genre for the vast majority of box office movies. It also influences city planning and popular music,” Rabkin says. He also argues that science fiction is the only genre explicitly engaged in understanding the consequences of new technology and the uneven distribution of knowledge—topics relevant in the current culture. “Of course science fiction is better suited to deal with these issues than the other genres. It’s the genre that’s supposed to deal with them.”

Armed with a unique database of quantifiable evidence, the GEP has produced several papers and presentations over the past decade. It is now possible to debate the genre of science fiction without being restricted to abstract concepts. For example, in a paper published in 2008, Simon and Rabkin were able to show that the science fiction stories women wrote had the same characteristics as the stories male authors wrote, including the likelihood of a hard-science background. The exceptions were that women authors wrote shorter stories with younger main characters, and half the stories by women authors had female lead characters, compared to only five percent of the stories by male authors.

The work is nearly completed, and the weekly meetings were reduced to once a month in 2011. In the summer of 2012, the meetings were suspended altogether. Side projects are continuing, however.

Simon and Rabkin hope other schools will benefit from their collaboration and that it might even be modeled at other research institutions. “Michigan really has the thinnest wall of any university I know,” says Simon. “This kind of thing, the fact that it rose from this interdisciplinary sort of meeting of minds, where people purposely got together to see how ideas in other fields might affect how they think and about what they do—that’s a real Michigan thing. That was the catalyst for this whole project.”


Sci-Fi Best-Sellers

The top five best-selling science fiction books of all time are, by their very nature, widely popular. But the list may not reflect some of the timeless classics that helped define the genre. 


The Lord of the Rings by J.R.R. Tolkien: 150 Million Copies

The third best-selling novel of all time, it was originally supposed to be a sequel to Tolkien's The Hobbit but turned into a longer, more complex work written over 12 years. Tolkien wrote it as a single novel, but the publisher turned it into a trilogy due to post-war paper shortages.

The Hobbit, or There and Back Again by J.R.R. Tolkien: 100 Million Copies

This children's classic was nominated for a Carnegie Medal when it was first published in 1937. The story follows the quest of hobbit Bilbo Baggins as he tries to win a share of the treasure guarded by Smaug, a dragon.

1984 by George Orwell: 25 Million Copies 

Some of the terms from this novel about people being tyrannized by a totalitarian government—Big Brother, thoughtcrime, Thought Police—have remained part of the popular lexicon. Even the term Orwellian is still used today to describe a policy of surveillance, propaganda, and deception. 

The Hitchhiker's Guide to the Galaxy by Douglas Adams: 14 Million Copies

This comedy chronicles the adventures of hapless Englishman Arthur Dent, who escapes the destruction of Earth and rides aboard a stolen spaceship to find the question to the ultimate answer.

Dune by Frank Herbert: 12 Million Copies

Dune addresses politics, religion, ecology, technology, and human emotion through the story of young Paul Atreides. He and his family relocate to a planet that is the only source of the most important and valuable substance in the universe.