Software Accurately Predicts Books’ Popularity By Analyzing Their Sentences

"I'll never agree! That's worse than being a slave in prison," Ella glared as she went out the door.

Greg Friedler/Getty Images

Maybe this is something we can apply to Popular Science posts? A team of computer scientists has developed software that’s able to predict whether a book will be popular based on its writing style, the U.K.’s Telegraph reports.

The software learned this trick from analyzing 800 books from Project Gutenberg, an online archive of public domain works, and comparing the books’ word use and grammar with how often they’ve been downloaded. For some books, the computer scientists also considered Amazon sales data and awards such as Pulitzer Prizes. The books were of all different genres and types, ranging from novels to poetry, and from love stories to sci-fi.

Some of the qualities the software identified in popular books sound just like what your writing teachers have been trying to tell you forever. Less successful books included more adverbs and “relied on words that explicitly describe actions and emotions such as ‘wanted,’ ‘took’ or ‘promised,’” The Telegraph reported. In other words, they didn’t adhere to “show, don’t tell.”

Other secrets to writing success were less intuitive. For example, successful books contained more conjunctions, such as “and” and “but.” Do successful books have a lot of long sentences?
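For the curious, here is a rough sketch of how you might count a few of these style signals in a chunk of text. It is not the researchers' software: the function name `style_features`, the short word lists, and the "ends in -ly" shortcut for spotting adverbs are all assumptions made for illustration, and a real system would use a proper part-of-speech tagger.

```python
import re

# Hypothetical word lists for illustration only; not the researchers' features.
CONJUNCTIONS = {"and", "but", "or"}
TELLING_VERBS = {"wanted", "took", "promised"}  # examples quoted by The Telegraph


def style_features(text):
    """Return per-word rates of a few crude style signals in `text`."""
    # Lowercase the text and pull out runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {
            "conjunction_rate": 0.0,
            "ly_adverb_rate": 0.0,
            "telling_verb_rate": 0.0,
        }

    conjunctions = sum(w in CONJUNCTIONS for w in words)
    # Crude heuristic (assumption): treat words ending in "ly" as adverbs.
    # This miscounts words such as "only" or "family".
    ly_adverbs = sum(len(w) > 3 and w.endswith("ly") for w in words)
    telling_verbs = sum(w in TELLING_VERBS for w in words)

    return {
        "conjunction_rate": conjunctions / total,
        "ly_adverb_rate": ly_adverbs / total,
        "telling_verb_rate": telling_verbs / total,
    }


if __name__ == "__main__":
    sample = "She quickly wanted it, and he slowly took it."
    print(style_features(sample))
```

Rates like these would only be raw inputs. Tying them to popularity, as the study did with download counts, would take a separate step of comparing them across many books.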

In a paper they wrote for an Association for Computational Linguistics conference, the software’s makers, a team from Stony Brook University in New York, included a table of “successful” and “unsuccessful” words in adventure books.

So don’t write your next great novel about breathless affairs in beach rooms by the bay. Stick to plain writing, and just let people say the things they mean.

The Telegraph