
How Computer Analysis Uncovered J. K. Rowling's Secret Novel

Status
Not open for further replies.

Quick

Banned
J.K. Rowling was recently revealed to have written The Cuckoo's Calling, under the pseudonym of Robert Galbraith (NeoGAF thread). Before The Times ran the story after receiving a tip about it, they investigated and got some help analyzing the book.

Not necessarily groundbreaking stuff, but certainly an interesting read, especially for anybody interested in language. The analysis wasn't unanimously saying it was J.K. Rowling, but she was the most consistent result.

From PopSci:

The Cuckoo's Calling, a detective novel by first-time author Robert Galbraith, just got solved in a big way. This weekend, the U.K.'s The Times reported The Cuckoo's Calling was actually written by J. K. Rowling, the creator of Harry Potter. Rowling even admitted to writing the novel after The Times asked her directly.

Among the evidence The Times presented to Rowling were analyses from two university professors who had written computer programs to uncover who authored disputed texts. After all, every writer has her habits. One obvious one is the use of regional words—a car "boot" versus a "trunk," for example—but others are much more subtle and unconscious. It's totally creepy, but cool, that a computer program is able to pick them out.

The Times originally asked the programmers to check The Cuckoo's Calling out after receiving an anonymous tip that Rowling might be the book's true author. The Times reporter, Alexi Mostrous, didn't initially let the professors know why he wanted them to compare The Cuckoo's Calling to several other novels.

So what habits give authors away? One of the analyzers, Patrick Juola of Duquesne University in Pittsburgh, has written a detailed blog post about how his program works. The full post is a great read, but here are the highlights.

Basically, Juola got a digital copy of The Cuckoo's Calling, plus digital copies of novels by Rowling and three well-known authors of mystery novels. He then ran a series of analyses that told him which of the authors the habits in The Cuckoo's Calling matched best. Each analysis looked at a different "habit" in the books:


  • Juola looked at the distribution of word lengths in each book. That is, he got a bunch of numbers like, "X percent of the words in this book are exactly Y letters long."
  • Juola looked at the 100 most common words in each book.
  • He looked at pairs of words that often appeared together.
  • He looked at groups of four characters that appear in a string. Any four characters in a string may do, including letters, spaces and grammatical marks. Now, I don't know of any writers that ever think about character strings in their writing, but, Juola said, other studies have proven four-character strings, called four-grams, are strong indicators of authorship.
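For anyone curious what those four "habits" look like in practice, here is a minimal Python sketch. It is not Juola's actual program: the function names, the simple regex tokenizer, and the L1 distance used to compare profiles are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical tokenizer: lowercase words, keeping straight apostrophes
# inside words ("don't"). Curly apostrophes are not handled.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")


def tokenize(text):
    return WORD_RE.findall(text.lower())


def normalize(counts):
    """Turn raw counts into relative frequencies."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def word_length_profile(text):
    """'X percent of the words are exactly Y letters long.'"""
    return normalize(Counter(len(w.replace("'", "")) for w in tokenize(text)))


def top_word_profile(text, n=100):
    """Relative frequency of the n most common words."""
    words = tokenize(text)
    total = len(words)
    return {w: c / total for w, c in Counter(words).most_common(n)}


def word_pair_profile(text):
    """Relative frequency of adjacent word pairs."""
    words = tokenize(text)
    return normalize(Counter(zip(words, words[1:])))


def four_gram_profile(text):
    """Relative frequency of every run of four characters,
    including spaces and punctuation."""
    return normalize(Counter(text[i:i + 4] for i in range(len(text) - 3)))


def profile_distance(p, q):
    """L1 distance between two profiles (an assumed, simple choice);
    smaller means the two texts share more of that habit."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))
```

To use it, you would build the same profile for the disputed text and for each candidate author's known work, then see which candidate has the smallest `profile_distance` for each habit.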

Juola's overall analysis isn't able to prove authorship, he said. Some of the individual tests found authors other than Rowling were the best match. Nevertheless, Rowling came up the most consistently. Juola called his work "suggestive" or "indicative" that Rowling wrote The Cuckoo's Calling. The smoking gun came from Rowling's confession, which Juola's analysis surely helped convince her to give.
The distinction matters because linguists use tools like Juola's and others' to determine who actually wrote everything from historical texts by long-dead authors to contested documents in modern court cases. In those cases, it can be a lot harder to get a ready, reliable confession.
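The "most consistent match" idea can be sketched as a simple vote across tests. This is only an illustration under assumptions: the article does not say how Juola combined his results, and the author labels and distance numbers below are placeholders, not real measurements.

```python
from collections import Counter


def best_match(distances):
    """Given {author: distance} for one test, return the closest author."""
    return min(distances, key=distances.get)


def tally(results_per_test):
    """Count how many tests picked each author as the best match."""
    votes = Counter(best_match(d) for d in results_per_test.values())
    return votes.most_common()


# Placeholder numbers for illustration only.
example = {
    "word_lengths": {"author_a": 0.10, "author_b": 0.20},
    "top_words": {"author_a": 0.30, "author_b": 0.25},
    "word_pairs": {"author_a": 0.40, "author_b": 0.50},
    "four_grams": {"author_a": 0.20, "author_b": 0.30},
}
print(tally(example))
```

In this made-up case one test prefers `author_b`, but `author_a` wins three of four, which is the kind of "suggestive" rather than conclusive result the article describes.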

Language Log (University of Pennsylvania)

And a follow-up:

Patrick Juola's guest post on identifying the authorship of The Cuckoo's Calling (now number 1 in the Amazon hardback bestseller list) was fascinating. But I seem to be the only person in the world who picked up the secret message that Joanne "J. K." Rowling sent when she picked the pseudonym under which she would publish her first crime novel. It is amazing that no one else picked up on it, but there we are: it was just me. I saw it as soon as… well, as soon as the Sunday Times revealed their discovery of the novel's pseudonymous nature, actually, which is not quite as good as seeing it before the story was all over the newspapers, but I still think I deserve a lot of credit for my penetrating intelligence. I can't imagine why I don't do crosswords; I'd probably win prizes.

The clue was in the collocations of the surname. The most famous Galbraith in the whole of Rowling's lifetime, without any reasonable doubt, was John Kenneth Galbraith, the Canadian liberal economist, US diplomat under Kennedy, and professor of economics at Harvard. Initials: J. K. Now that I've pointed it out, how could you have missed it? Kick yourself.

P.S. It has been pointed out to me that there has sort of been some sort of flicker of recognition in the Twittersphere, e.g. here for example; but pooh to that. People always try to steal truly great insights, if necessary by reversing the unidirectional flow of time, and this is just one more example of such anti-temporal party-pooping.
 

mcfrank

Member
Pretty fascinating the way this all played out, thanks for posting this. I am about half through The Cuckoo's Calling and really enjoying it so far.
 

Shambles

Member
So they were given 4 books and were told one of the 4 was written by the same author as the book they were comparing them to? Doesn't seem very impressive. Any adept reader could probably do the same. Now if this software pulled this fact out of a database of thousands of works of literature it would actually be impressive.
 

mcfrank

Member
Yeah I don't think the analysis really did much in solving the case, it was solved by:

A.) Anonymous twitter person (would love to know who this is) tells the journalist it was Rowling.
B.) Nom de plume has the same agent and editor as Rowling (which would likely not happen for a first-time author)

Analysis is still interesting, but I would like to know who the twitter leak is and if it was intentional from the publisher to finally start selling books.
 

Pagusas

Elden Member
I'm of the assumption she intentionally wanted it known after the book had been out a little while. Seems like an obvious way to boost its sales after the initial experiment of releasing it without the name recognition.

mcfrank said:
A.) Anonymous twitter person (would love to know who this is) tells the journalist it was Rowling.

Probably someone she told to reveal the information.
 

Quick

Banned
Shambles said:
So they were given 4 books and were told one of the 4 was written by the same author as the book they were comparing them to? Doesn't seem very impressive. Any adept reader could probably do the same. Now if this software pulled this fact out of a database of thousands of works of literature it would actually be impressive.

I'd definitely like to see a more comprehensive study, comparing more than four books in the future. It can certainly be a helpful tool, not just for casual studies like this.

Huh. This speaks to humans in general. We all have a "fingerprint" of how we write

I'm always conscious of how I write, and I keep a mental note of words I use the most, or patterns in writing in general.

Funny enough, I was watching G.I. Joe: Retaliation (lol), and it was fascinating to see them analyze the president's speech and gestures to compare to Zartan's. First thing I thought of after reading the article.
 

Verdre

Unconfirmed Member
Pagusas said:
I'm of the assumption she intentionally wanted it known after the book had been out a little while. Seems like an obvious way to boost its sales after the initial experiment of releasing it without the name recognition.

Probably someone she told to reveal the information.

I think it's more likely that her publisher included such a clause so that after she got her honest reviews, they could still get their Rowling money.
 

robochimp

Member
Shambles said:
So they were given 4 books and were told one of the 4 was written by the same author as the book they were comparing them to? Doesn't seem very impressive. Any adept reader could probably do the same. Now if this software pulled this fact out of a database of thousands of works of literature it would actually be impressive.

That doesn't sound like what their algorithms are meant to do. You're only going to be comparing contemporaries and for historical works you're only going to have a handful of suspects.
 

Verdre

Unconfirmed Member
Rowling answered the question of marketing ploy, etc on the website for Galbraith:

Was revealing the true identity of Robert Galbraith not simply an elaborate marketing campaign to help boost sales?
If anyone had seen the labyrinthine plans I laid to conceal my identity (or indeed my expression when I realised that the game was up!) they would realise how little I wanted to be discovered. I hoped to keep the secret as long as possible. I’m grateful for all the feedback from publishers and readers, and for some great reviews. Being Robert Galbraith has been all about the work, which is my favourite part of being a writer. This was not a leak or marketing ploy by me, my publisher or agent, both of whom have been completely supportive of my desire to fly under the radar. If sales were what mattered to me most, I would have written under my own name from the start, and with the greatest fanfare.

At the point I was ‘outed’, Robert had sold 8500 English language copies across all formats (hardback, eBook, library and audiobook) and received two offers from television production companies. The situation was becoming increasingly complicated, largely because Robert was doing rather better than we had expected him to, but we all still hoped to keep the secret a little longer. Yet Robert’s success during his first three months as a published writer (discounting sales made after I was found out) actually compares favourably with J.K. Rowling’s success over the equivalent period of her career!

Source: http://www.robert-galbraith.com/
 

mcfrank

Member
Source of Twitter leak revealed

When The Sunday Times first broke the news about J.K. Rowling being the real author of The Cuckoo's Calling, it was mentioned that a tip on Twitter had led to the eventual reveal. According to the Huffington Post, that anonymous Tweeter has now been traced. London legal firm Russells stated on Thursday that Chris Gossage, a partner at the firm, had informed his wife's close friend Judith Callegari about who Robert Galbraith really was. Callegari subsequently posted a Tweet on the subject, although her account has now been deleted. In a statement to Rowling, Russells apologised. Rowling and her agent were immediately informed once the firm learned what had happened, with Russells confirming that the leak was not part of a publicity plan by Rowling. For her part, Rowling has made her feelings clear. According to the Huffington Post:

Rowling said that "only a tiny number of people knew my pseudonym and it has not been pleasant to wonder for days how a woman whom I had never heard of prior to Sunday night could have found out something that many of my oldest friends did not know." "To say that I am disappointed is an understatement," she added. "I had assumed that I could expect total confidentiality from Russells, a reputable professional firm, and I feel very angry that my trust turned out to be misplaced."
 