How I Used Natural Language Processing to Prove Harry Potter Belongs in Slytherin
The quickly developing field of Natural Language Processing feels like a shiny new playground: models can generate anything from a story by Neil Gaiman to inedible pasta recipes, and those are just two examples from the past few months. When I first entered the field in 2019, I knew right away that I wanted to create a model that combines my love of NLP with my love of Harry Potter.
If you’re not familiar with the source material (in which case, wow, have I got some weekend plans for you), the Harry Potter series takes place at Hogwarts, a school of witchcraft and wizardry, where students are sorted into one of four houses based on their personality. While the concept of having your future dictated by a few characteristics at age 11 is frowned upon even among fans, it’s a great opportunity to implement a simple-yet-effective model.
That’s how I came up with The Hogwarts Sorting Configuration — a model that associates character behavior with one of the four Hogwarts houses. Using this model, we should be able to decide whether characters were sorted into the right houses, and sort additional characters based on their characteristics.
Methodology
The dataset for this experiment is the first book in the series, Harry Potter and the Philosopher’s Stone, as it covers the year our main characters were sorted into houses. The experiment is limited to a handful of main characters for whom there is enough data on their actions and personality to draw conclusions. It’s also important to remember the book is told from Harry’s perspective, and is thus affected by his opinions of other characters.
To mimic training on past data, each house was trained on two sources: the phrases used to present each house during the sorting ceremony, and a group of characteristics attributed to the house’s professors and founders. That way we can attribute, as the song goes, “cunning” and the use of “any means” to Slytherin, but also Parseltongue (speaking with snakes) and lust for power, often attributed to Salazar Slytherin himself.
The book was divided into 4,061 utterances, approximately 10% of which describe a Hogwarts student or their actions. Each of these utterances was tagged with the character mentioned, and the model, where relevant, associated a house with the character’s actions, based on the characteristics assigned to that house in the training stage. If a character did something described as brave, the utterance was sorted into Gryffindor; if the action was friendly, into Hufflepuff; “those of wit and learning” went to Ravenclaw; and, as mentioned before, the cunning and the power-seekers belong in Slytherin.
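To make the tagging step concrete, here is a minimal sketch of how an utterance could be associated with a house. It assumes a plain keyword lookup, which may well be simpler than what the configuration actually does. The names HOUSE_TRAITS and tag_utterance are hypothetical, and the trait words are only the ones quoted in this article, not the full training data.

```python
import re

# Hypothetical trait lexicon built from the words quoted in this article.
# This is an assumption for illustration, not the real training data.
HOUSE_TRAITS = {
    "Gryffindor": {"brave", "bravely", "daring", "nerve", "chivalry"},
    "Hufflepuff": {"friendly", "loyal"},
    "Ravenclaw": {"wit", "learning", "learned", "books"},
    "Slytherin": {"cunning", "power", "snake", "snakes"},
}


def tag_utterance(utterance):
    """Return the house whose trait words occur most often, or None."""
    # Lowercase and split into simple word tokens.
    words = re.findall(r"[a-z']+", utterance.lower())
    # Count how many tokens match each house's trait words.
    scores = {
        house: sum(word in traits for word in words)
        for house, traits in HOUSE_TRAITS.items()
    }
    best = max(scores, key=scores.get)
    # Utterances with no trait words at all stay untagged.
    return best if scores[best] > 0 else None


print(tag_utterance("Harry spoke more bravely than he felt"))
print(tag_utterance("The train pulled out of the station"))
```

In this sketch, an utterance that matches no trait word is left untagged, which matches the idea that a house is only associated "where relevant". Ties go to whichever house is listed first, a shortcut a real model would need to handle more carefully.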
At the end of the tagging process, each character’s tagged utterances were counted per house, and the house with the most utterances decided where the character belongs, based on their actions rather than a sorting ceremony.
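The final tally can be sketched in a few lines. This is an illustration under assumptions rather than the configuration's actual code: house_shares and sort_character are hypothetical names, and the toy list below is made up purely to show the mechanics, not taken from the book's results.

```python
from collections import Counter, defaultdict


def house_shares(tagged):
    """Turn (character, house) pairs into per-character house shares."""
    tallies = defaultdict(Counter)
    for character, house in tagged:
        # Skip utterances that were not associated with any house.
        if house is not None:
            tallies[character][house] += 1
    shares = {}
    for character, counts in tallies.items():
        total = sum(counts.values())
        shares[character] = {
            house: count / total for house, count in counts.most_common()
        }
    return shares


def sort_character(shares, character):
    """Pick the house with the largest share for one character."""
    return max(shares[character], key=shares[character].get)


# Made-up toy input, only to demonstrate the tally.
toy = [
    ("Hermione", "Ravenclaw"),
    ("Hermione", "Ravenclaw"),
    ("Hermione", "Gryffindor"),
    ("Neville", "Gryffindor"),
    ("Neville", None),
]
shares = house_shares(toy)
print(sort_character(shares, "Hermione"), shares["Hermione"])
```

Keeping the shares around, rather than only the winning house, is what makes it possible to talk about a close second place or about the percentage of a character's utterances that point to one house.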
Hypothesis
Our three main characters, Harry, Ron and Hermione, are all Gryffindor students. Gryffindor is arguably the vaguest of the houses, where students are set apart by their “daring, nerve, and chivalry”. Having known the characters for more than half of my life, I hypothesized that none of the three would be sorted into Gryffindor: Hermione would be a Ravenclaw, Ron would be a Hufflepuff and Harry would be, as the title of this article suggests, a Slytherin.
Results
The Hogwarts Configuration placed Hermione in Ravenclaw. This is no surprise to anyone who is familiar with the character. In fact, over 60% of Hermione’s actions and descriptions in the book have to do with Ravenclaw characteristics. Hermione even introduces herself to the other characters on the train to Hogwarts saying “I’ve learned all our course books by heart, of course, I just hope it will be enough — I’m Hermione Granger, by the way” — they know she’s book-smart before even learning her name.
The Hogwarts Configuration placed Ron in Hufflepuff. Hufflepuffs are just and loyal, and Ron Weasley, in the first book of the series at least, is described as just that. He is a good friend to Harry at every turn: guiding him as he enters the magical world for the first time, teaching him wizard’s chess and staying at school during the holidays so Harry’s not alone. Ron’s second-place house, not far behind, is still not Gryffindor but Slytherin. When Ron looks in the Mirror of Erised, which shows you what you most desire, he sees himself wielding all the power his older siblings have: Head Boy at school, captain of the Quidditch team. I believe that if I were to conduct this experiment on all seven books, Ron could be placed in Slytherin overall, as these qualities develop in him as the series progresses.
Harry’s case was a little more complicated. There’s a good case to be made for sorting him into Slytherin; he was almost sorted there in the book. Harry enjoys the power and fame that come with being “the chosen one”, not to mention his natural ability to talk to snakes. He’s also the most Gryffindor of the three; after all, he asked the Sorting Hat to put him there. He’s the only character of the trio whose actions are consistently described as brave, like when he stands up to Crabbe and Goyle, speaking “more bravely than he felt”. The Hogwarts Configuration placed Harry in Slytherin, but Gryffindor came a very close second.
So… Is anyone a Gryffindor?
The Hogwarts Configuration confirmed my hypothesis that Harry, Ron, and Hermione belong in different houses. Then I started wondering if the problem could be Gryffindor house itself. Could it be that no character is braver than they are smart, loyal or cunning? Or maybe our definition of the houses is wrong, and characters are sorted based on the qualities they value rather than the ones reflected in their personality? I ran other characters in the book through the configuration to try and figure this out. I had several guesses as to who would be a Gryffindor (Professor McGonagall, Lee Jordan and Oliver Wood, to name a few), but I knew most of those characters do not have sufficient data in the first book to qualify for this experiment.
Right before I was about to give up, the configuration came up with the quintessential Gryffindor: Neville Longbottom. At first glance, you may think Neville is the opposite of brave, always being described as hesitant, his voice quivering before his first broom ride, his face turning red when Malfoy talks down to him. But his actions always prevail: he does get up on the broom, despite his fear; his face turns red, but he turns to face Malfoy anyway, and what is bravery if not doing things despite fear? As Dumbledore says towards the end of the book, “It takes a great deal of bravery to stand up to our enemies, but just as much to stand up to our friends,” and Neville does both, making him the ultimate Gryffindor.