Books Vs Show

Do people prefer a book or the TV show it spawns? A lot of people have extremely intense feelings about this question on both sides, and it is impossible to answer in general because the quality of both books and adaptations varies so much.

A series I have recently been reading is The Expanse by James S.A. Corey, and it has been an absolutely incredible read so far. I can’t put the books down, even though until the one I’m currently reading, Nemesis Games, I have generally known what would happen, because the TV show on Amazon Prime is what first pulled me into this world. What is interesting is that neither the books nor the show is complete yet, so no one knows exactly what is going to happen; there’s room for things to change as long as the narrative isn’t finished. Based on this, I’m curious to look into the public reception of both versions and determine whether fans tend to consume them as separate products or engage with them simultaneously, and, if they engage with both, whether one version tends to dominate discussion in a way that indicates a preference. I will attempt to answer this question by examining the contexts in which characters tend to be discussed, especially those whose roles change between versions, and by examining how various narrative devices and places that exist in both are discussed.

Building a Model

I will attempt to achieve this by leveraging a natural language processing technique called word2vec, which builds a predictive model that finds words used in similar contexts. This works somewhat like finding synonyms, but it tends to group words that belong to the same category rather than words that share a definition. For example, if a model like this were built on data from cookbooks, it would likely link various kinds of leafy vegetables together because, even though they are entirely different plants, they tend to be used in similar ways in a meal. I will build my model from comments on the subreddit for The Expanse, which conveniently covers both the books and the show.
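
To make the cookbook intuition concrete, here is a minimal sketch in Python. It is not word2vec itself: it swaps in plain co-occurrence counts and cosine similarity, which is a much simpler technique built on the same idea that words used in similar surroundings end up looking alike. The sentences, the window size of 2, and the function names are all invented for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy "cookbook" sentences, invented purely for illustration.
SENTENCES = [
    "saute the kale with garlic and olive oil",
    "saute the spinach with garlic and olive oil",
    "toss the kale with lemon and salt",
    "toss the spinach with lemon and salt",
    "sift the flour with sugar and baking powder",
    "whisk the flour with sugar and butter",
]


def context_counts(sentences, window=2):
    """For each word, count the words seen within `window` positions of it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


counts = context_counts(SENTENCES)
leafy = cosine(counts["kale"], counts["spinach"])
baking = cosine(counts["kale"], counts["flour"])
print(f"kale vs spinach: {leafy:.2f}")
print(f"kale vs flour:   {baking:.2f}")
```

In this toy data kale and spinach share all of their surrounding words, so they score higher against each other than kale does against flour, even though nothing in the code knows what a leafy vegetable is.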

To gather the comments I first needed to gain access to the Reddit API, which turned out to be pretty trivial once I found the right place to do it: all I had to do was indicate that I wanted to create a script, and the keys I needed were provided immediately. After this I experimented with accessing the API directly from a Python script, but it eventually became clear that it would be much easier to use a pre-existing library called the Python Reddit API Wrapper (PRAW). Once I started using that, it was pretty easy to loop through all the recent posts, collect all the comments, and add them to one long text block that I could then save to a file as my corpus. I didn’t bother creating separate files because the likely next step would have been one file per post, which would have produced a corpus of hundreds of files that accomplished exactly the same thing as storing everything in one file. This file was then compressed and uploaded to Northeastern’s RStudio, where I used it to create a word2vec model.
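
The collection loop described above could look roughly like the sketch below. It is not the original script. The subreddit name `TheExpanse`, the post limit of 100, the output filename, and the credential placeholders are all assumptions made for illustration. The PRAW calls used (`praw.Reddit`, `subreddit.new`, `comments.replace_more`, `comments.list`) are part of the library, but `replace_more(limit=0)` simply drops the "load more comments" placeholders, so a script that really wanted every comment would need a different limit.

```python
def collect_comment_text(subreddit, post_limit=100):
    """Join the bodies of all comments on a subreddit's recent posts.

    `subreddit` is expected to behave like a praw Subreddit object.
    `post_limit` is an assumed value, not one taken from the original project.
    """
    bodies = []
    for submission in subreddit.new(limit=post_limit):
        # Discard "load more comments" placeholders instead of fetching them.
        submission.comments.replace_more(limit=0)
        for comment in submission.comments.list():
            bodies.append(comment.body)
    # One long text block, one comment per line.
    return "\n".join(bodies)


def main():
    """Fetch comments and save them as a single corpus file."""
    import praw  # third-party library: pip install praw

    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",          # placeholder, from the Reddit app page
        client_secret="YOUR_CLIENT_SECRET",  # placeholder
        user_agent="expanse-corpus-sketch",  # hypothetical user agent string
    )
    # "TheExpanse" is an assumed subreddit name.
    text = collect_comment_text(reddit.subreddit("TheExpanse"), post_limit=100)
    with open("expanse_comments.txt", "w", encoding="utf-8") as handle:
        handle.write(text)
```

Calling `main()` after filling in real credentials would write the corpus file. Keeping the loop in its own function means it can be tried out on stand-in objects without touching the network.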

In doing this I opted not to join common bigrams such as names or terms into single tokens. The only ones I planned to use were names, and in a discussion community the context in which a full name is used will likely differ from the context in which a character is referred to by just part of their name, which is far more common. I was shocked at how easily the whole model came together once I had collected my corpus. I experimented with different sets of parameters, but with my corpus being well over 1,000,000 words I didn’t feel it was necessary to add iterations or vectors to get more accurate results. I left the window as is because 6 is already pretty small, and I didn’t want to expand it given how many comments are relatively short, which could throw off the results. In the end, I had a strong model that really helped me reach effective conclusions.
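
One way to see why the window size interacts with short comments is to count the word pairs a window produces. The sketch below is a simplified illustration, not how the model was actually trained. It assumes that comments are concatenated with no boundary between them (which may not match how the training tool split the text), the comment lengths are invented, and real word2vec implementations also vary the effective window per word.

```python
def context_index_pairs(n_tokens, window=6):
    """Return (target_index, context_index) pairs within `window` positions."""
    pairs = []
    for i in range(n_tokens):
        lo = max(0, i - window)
        hi = min(n_tokens, i + window + 1)
        pairs.extend((i, j) for j in range(lo, hi) if j != i)
    return pairs


def count_cross_comment_pairs(first_len, second_len, window):
    """Count pairs linking a word in one comment to a word in the next.

    Assumes the two comments are simply concatenated with no boundary.
    """
    boundary = first_len
    pairs = context_index_pairs(first_len + second_len, window)
    return sum(1 for i, j in pairs if (i < boundary) != (j < boundary))


# Inside a single 7-word comment, widening the window past 6 adds nothing.
same_comment_w6 = len(context_index_pairs(7, window=6))
same_comment_w12 = len(context_index_pairs(7, window=12))

# Across a 7-word comment followed by a 5-word one, a wider window
# pairs up more words that belong to different comments.
cross_w3 = count_cross_comment_pairs(7, 5, window=3)
cross_w6 = count_cross_comment_pairs(7, 5, window=6)

print(same_comment_w6, same_comment_w12)
print(cross_w3, cross_w6)
```

Under these assumptions, a wider window gains no extra context inside a short comment and only adds pairs that cross into a neighboring, possibly unrelated, comment.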



I started by investigating the usage of character names to see if I could draw any interesting conclusions from that. The first I looked at was Drummer, a woman who is typically called by her last name. This yielded some interesting results right off the bat, as her top results were Michio and Pa, both of which refer to the character Michio Pa, whose book role from Abaddon’s Gate was mostly taken over by Drummer in season 3 of the TV show, while Drummer was just a minor character in the books who at that point had yet to even show up. If the top match for someone who is mostly a TV character is her book parallel, that suggests to me that most fans of the show are likely fans of the books first and tend to prefer them. After the first two matches, the list is more of a mix of terms that point specifically to either the books or the show. Looking further into characters whose roles changed, I next searched for Bull, who also had a significant portion of his role in Abaddon’s Gate shifted to Drummer or simply removed. He showed up in Drummer’s list, but only at 10th. Bull’s list led to equally pro-book results: Drummer was at the top, followed by Pa. The Pa match could be a result of discussion around Drummer absorbing a significant amount of both of their roles, but without Michio appearing as well it more likely refers to Bull and Pa’s interactions, which were a significant part of the politics of Abaddon’s Gate. Bull and Pa being referenced relative to each other in this way builds a connection nearly as strong as the discussion of them relative to their TV replacement, which likely indicates an even greater book following than the latter alone would.
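
For readers unfamiliar with how a "top match" list is produced, the sketch below shows the general mechanics: rank every other word by cosine similarity to the query word, then read off positions in that ranking. The three-dimensional vectors are invented and are NOT taken from the trained model, so the ordering they produce says nothing about the real results; the function names are also hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest(vectors, word, n=10):
    """Return the n words most similar to `word`, best match first."""
    target = vectors[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:n]


def rank_of(vectors, word, candidate):
    """1-based position of `candidate` in `word`'s neighbor list, or None."""
    names = [name for name, _ in closest(vectors, word, n=len(vectors))]
    return names.index(candidate) + 1 if candidate in names else None


# Invented vectors for illustration only; not values from the real model.
toy_vectors = {
    "drummer": [0.9, 0.1, 0.2],
    "pa": [0.8, 0.2, 0.1],
    "bull": [0.5, 0.5, 0.1],
    "ilus": [0.1, 0.2, 0.9],
}

for name, score in closest(toy_vectors, "drummer", n=3):
    print(f"{name}: {score:.3f}")
print("rank of bull in drummer's list:", rank_of(toy_vectors, "drummer", "bull"))
```

A position lookup like `rank_of` is the kind of check behind a statement such as a name appearing "only at 10th" in another character's list.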

Next up, I decided to look at Dr. Elvi Okoye, who was one of the point-of-view characters in the book Cibola Burn and also a main character of season 4. Additionally, she shows up in a future book I haven’t gotten to yet, which will be outside the scope of the show, as the show will only cover the first six books. Her top match after the possessive form of her own name is Teresa, a character who never appears in the show because she isn’t in the books until the final three, which extend beyond the bounds of the show. After that she has a few matches whose show-versus-book ties aren’t necessarily clear, like Miller, who held about equal significance to her in each, but then we get some matches involving the protomolecule, even though that element of the plot isn’t very relevant to her in the context in which the word protomolecule would be used. I could try to explain further, but I wouldn’t want to spoil anything. After examining these characters, it seems clear that the show and books are frequently discussed in parallel, based on the seemingly high number of character comparisons, but those discussing them do seem to fall back on the books more often.


After this I decided to look more closely at the word protomolecule. This is a kind of foreign substance that kick-starts the series and looms over it throughout, in both the books and the show. For this one I expected a more balanced result, as in general it is treated in a relatively similar manner in both; however, this did not hold true, with book-specific language once again dominating the discussion. The top match, after pm, which is just a shortened form of protomolecule, was Builders, which is primarily a book concept and likely indicates discussion informed by the books. After this comes hybrid, which is more of a show concept, as the books tend to use different language for the monsters described that way, quite frequently just the word monster. This is followed by more terms that are either book terms or neutral. This was less conclusive than investigating characters, but it still pointed to the books dominating discussion, though it did not necessarily indicate that the two versions were being discussed relative to each other.


The last thing I looked into was some places that show up in both the books and the show. The first was the Rocinante, the ship that the main characters spend most of their time in, usually just referred to as the Roci. I expected a more balanced group of words to show up once again, and this time I was right. Most of the matches, words like Razorback or Pella, were other ships that were large parts of both stories, and the few that weren’t were words like crew or ship that obviously don’t point to either the books or the show. Next up I looked at Ilus, the most commonly used of the names given to the first newly inhabited planet outside our solar system in the series. This yielded similar results to the Roci, with most of the words just being other planets that were significant in both stories. The only exceptions may be the words flora and fauna, as the entirely new ecosystem was focused on more in the books, though it was a large part of both. Overall, settings were definitely the most inconclusive thing I looked at: they didn’t clearly indicate a preference for either version, and because the language is so similar it’s hard to determine whether the two are being referenced in relation to each other.


Ultimately this went about as I would expect, with my analysis indicating a fanbase that consumes both the books and the show, but with the books constantly coloring the discussion even when it is likely about the show. This lines up with my personal experience of preferring the books, but it is interesting to see it demonstrated in this manner. What stands out is the number of matches that suggested the two were being compared, and I think that illustrates what is so great about adaptations and the changes in them: they inspire fans to delve deeper and search for more. This can be seen in this article, which talks about how film can be used to inspire teen readers, because if they enjoy an adaptation enough it could very well inspire them to try the original. I’m a great example of this in action, as it had been a while since I read a full series, but now I’ve already gotten most of the way through this one in a little over a month. I greatly enjoyed the show, and even though I ended up preferring the books, the show is what sparked my desire to read them. It doesn’t really matter what people end up enjoying more as long as they’re happy with how they spend their time.

I think there are a lot of interesting directions this research could take from here. It could focus more on individual stories and how people feel about each version, which could help facilitate targeted recommendations, or it could focus more broadly on the concept as a whole, which could also be interesting and hasn’t been investigated with much academic rigor. Finding out how people feel about books versus shows or movies could help explain why certain versions are preferred, and it could change the approach taken when producing new media in trying to reach the largest audience.