Public Perception of COVID-19 Vaccines & Associated Political Response on Twitter

As the pandemic wore on in the United States, COVID-19 was increasingly treated as a political issue and met with partisan responses. As vaccine studies progressed and vaccines were approved, this division along ideological lines carried over to vaccination. By February of 2021, 37% of Republicans believed widespread vaccination would help the economy, compared to 66% of Democrats [1]. Likewise, vaccine resistance and hesitancy are higher in conservative or right-leaning states, with about 50% of the state's population being resistant or hesitant [2]. Common concerns about taking the vaccine include possible side effects and the perceived rushed timeline of vaccine research and distribution.

With this information, I am turning to Twitter for my project to understand how public perception of the three major U.S. vaccines (Pfizer, Moderna, and Johnson & Johnson) compares with regard to side effects. I will also examine the political divide within my corpus for differences in attitudes toward vaccines.


Using RStudio and the Word2Vec package created by Professor Benjamin Schmidt, we created word embedding models of our corpora. When the model is queried with a word, it returns the closest related words based on their distance in a vector space, reported as a similarity score. The closest matches score between 0.0 and 1.0, where 1.0 indicates an exact match. Words end up close together when they appear in similar contexts across the corpus. Words can also be subtracted from a query, which pushes the results away from the subtracted words. With this, complex queries can be built to find words associated with one set of words but not another.
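To make the similarity score concrete, here is a minimal Python sketch of how a "closest words" lookup can work. It is not the R package's code: the function names and the three-dimensional toy vectors are made up for illustration, and I am assuming the score is cosine similarity, which is the usual choice for word embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_to(vectors, query, n=3):
    """Rank every word in `vectors` by cosine similarity to the `query` vector."""
    scored = [(word, cosine_similarity(vec, query)) for word, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


# Hypothetical toy vectors; a trained model has hundreds of dimensions.
toy_vectors = {
    "sideeffects": [0.9, 0.1, 0.0],
    "soreness": [0.8, 0.3, 0.1],
    "fatigue": [0.7, 0.2, 0.2],
    "stadium": [0.0, 0.1, 0.9],
}

results = closest_to(toy_vectors, toy_vectors["sideeffects"])
for word, score in results:
    print(f"{word}: {score:.2f}")
```

The query word matches itself with a score of 1.0, and words whose vectors point in a similar direction rank just below it. Unrelated words such as the toy "stadium" vector score near zero.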

Corpus Creation

My corpus is composed of tweet texts from two datasets found on Kaggle [3][4]. The first is a collection of tweets from Dec. 12, 2020 to April 3rd, 2021, scraped using hashtags for the relevant vaccines. The second is a set of tweets scraped from August 30th, 2020 to mid-March under the #CovidVaccine hashtag. Using Python, I cleaned up the tweet content: I removed mentions, links, hashtag symbols (keeping the words), and punctuation, and lowercased the text. I removed the tweets common to both datasets and saved them as a separate file. I also separated common joined words (CovidVaccine -> covidvaccine), collapsed unnecessary “variations” of words (PfizerBiontech -> pfizer or johnson+johnson -> johnson), and concatenated some phrases (Side Effects -> sideeffects, Boris Johnson -> borisjohnson). My final cleaned corpus contains 3,067,557 words.
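The cleaning steps above could be sketched roughly as follows. This is an illustration under my own assumptions, not the exact script used for the corpus: the regular expressions, the `REPLACEMENTS` table, and the `clean_tweet` name are hypothetical, and the table only covers the examples mentioned in the text.

```python
import re

# Hypothetical replacement table built from the examples in the text.
# Entries are applied after lowercasing and before punctuation is stripped,
# so that "johnson+johnson" can still be matched with its plus sign.
REPLACEMENTS = [
    ("pfizerbiontech", "pfizer"),
    ("johnson+johnson", "johnson"),
    ("side effects", "sideeffects"),
    ("boris johnson", "borisjohnson"),
]


def clean_tweet(text):
    """Return a lowercased tweet with links, mentions, and punctuation removed."""
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop mentions
    text = text.replace("#", " ")              # keep hashtag words, drop the symbol
    text = text.lower()
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    text = re.sub(r"[^\w\s]", " ", text)       # strip remaining punctuation
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace


example = clean_tweet(
    "Got my #PfizerBiontech shot! Side Effects so far: none @CDCgov https://t.co/abc"
)
print(example)
```

The order of the steps matters in this sketch: variant spellings are normalized before punctuation is stripped, otherwise "johnson+johnson" would already have been split into two separate tokens.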


Side Effects

From a preliminary search of “side-effects”, the common side effects reference mild types of discomfort: soreness, fatigue, and lethargy. There is a stronger connection to the first shot than to the second shot. However, this could stem from the limited number of people who had received both doses, as well as from one of the major vaccines being one-dose only. None of the side effects indicate anything serious.

Next, I decided to pair sideeffects with the search term for each of the respective vaccines.

For the Pfizer vaccine, there were no striking insights. There was about a .10 difference between fromfirst and fromsecond, suggesting that side effects were tied to the first dose more than the second. However, the difference is not strong enough to conclude anything.

covid vaccine some experience delayed rashes after moderna report says

covid vaccine day 2 i notice sporadic rashes on my temples and cheeks an allergic reaction jab site still sore

36 hours post moderna vax dose 2 my arm has a nonitch or pain rash its sore i hope day 2 is as good a few


The Moderna query was the only one to return an actual symptom: rashes. The difference between post2nd and fromfirst is smaller than for Pfizer, but not by enough to conclude a difference between the two vaccines. fromfirst returned a similarity of around .74, close to the .72 score from the Pfizer search.


Johnson & Johnson’s query also did not return any symptoms. Notably, fromfirst has a similarity score of .65, roughly .10 lower than for both Moderna and Pfizer. This could suggest fewer side effects overall, but it would need to be backed by further research.

These preliminary insights suggest that the Moderna vaccine is associated with more side effects and that the Johnson & Johnson vaccine may result in comparatively fewer side effects. Using the power of relational word vectors, the next set of searches focuses on the words closest to each respective vaccine and to side effects, but not to the other vaccines (Vaccine1 - Vaccine2 - Vaccine3 + sideeffects).
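The Vaccine1 - Vaccine2 - Vaccine3 + sideeffects query can be pictured as adding and subtracting word vectors and then ranking the remaining words against the result. The Python sketch below is only an illustration: the four-dimensional toy vectors and the helper names are hypothetical, and I am assuming cosine similarity as the score.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def combine(vectors, plus, minus):
    """Add the `plus` word vectors and subtract the `minus` word vectors."""
    dims = len(next(iter(vectors.values())))
    query = [0.0] * dims
    for word in plus:
        query = [q + v for q, v in zip(query, vectors[word])]
    for word in minus:
        query = [q - v for q, v in zip(query, vectors[word])]
    return query


def unique_to(vectors, plus, minus, n=2):
    """Rank words (other than the query words) against the combined vector."""
    query = combine(vectors, plus, minus)
    excluded = set(plus) | set(minus)
    scored = [
        (word, cosine_similarity(vec, query))
        for word, vec in vectors.items()
        if word not in excluded
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


# Hypothetical toy vectors, invented purely for illustration.
toy_vectors = {
    "moderna": [1.0, 0.0, 0.0, 0.2],
    "pfizer": [0.0, 1.0, 0.0, 0.2],
    "johnson": [0.0, 0.0, 1.0, 0.2],
    "sideeffects": [0.2, 0.2, 0.2, 1.0],
    "soreness": [0.6, 0.1, 0.0, 0.8],
    "hesitation": [0.1, 0.6, 0.0, 0.5],
    "stadium": [0.0, 0.0, 0.1, 0.0],
}

results = unique_to(
    toy_vectors, plus=["moderna", "sideeffects"], minus=["pfizer", "johnson"]
)
for word, score in results:
    print(f"{word}: {score:.2f}")
```

In this toy setup, "soreness" ranks first because its vector leans toward both "moderna" and "sideeffects", while "hesitation" is pulled down by its closeness to the subtracted "pfizer" vector. Scores from subtraction queries are typically lower than those from single-word searches.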

Right off the bat, words unique to Moderna are associated with many explicit side effects, including soreness, stiffness, exhaustion, pain, and headaches. All of these have a similarity score of at least .5, which is modest but stronger than any of the results from the other two searches. It is interesting that soreness has a .52 similarity score, because typically all shots result in some soreness as part of the immune system’s response. This score is particularly high for Moderna compared to the .39 for the Pfizer vaccine and the absence of soreness from the Johnson & Johnson results. The Pfizer results include pain, hesitation, and injection, but these all have weak similarity scores. They do indicate some pain, but not as extensive as with the Moderna vaccine. The Johnson & Johnson query produces little of insight and is mostly noise.

moderna sideeffects hitting hard today feels like day 2 or 3 of legit covid rn

i’m feeling moderna sideeffects this morning anthonyfauci was right the second dose does effect you more

how long have the sideeffects lasted for those of you who received both doses of moderna

According to a report by the FDA, summarized by Business Insider, fewer than 50% of Johnson & Johnson vaccine recipients reported pain at the injection site. This is considerably lower than the 84% of Pfizer recipients and 92% of Moderna recipients who reported pain [5]. These general patterns align with the findings of the computational text analysis, which also gives credence to the first dose producing more side effects. Again, it is important to note that this could be due to more people having received the first dose than the second dose at this point in time. Indeed, the second tweet above says the second dose affected the user more than the first.

Political Divide
By President






Vaccines started to become available and more accessible to the public in late 2020 and early 2021, around the time President Biden came into office. In a search for the closest words to vaccine and Biden, but not Trump, goodnews and stockmarketnews are the most interesting results. goodnews would indicate that Biden is more positively associated with vaccine news, while Trump is associated with his rhetoric, not necessarily his actions around the vaccine.

covid covid vaccine mass vaccine sites open nfl offers biden all 30 stadiums goodnews accessibility

covid vaccine feels like world is turning into a better place trumpout stockmarketnews

vaccine covid vaccine donald trump says he wont make covid mandatory goodnews

covid vaccine don’t get your hopes up on trump rhetoric

nowthisnews it’s not just a matter of trump lying he’s dangerous his rhetoric causes death and destruction

The only tweets containing goodnews and a political figure are tweets 1 and 3 above. Tweet 3 is interesting because it is actually complimenting Trump’s vow, so it is interesting to see goodnews being specifically associated with Biden. Twitter hosts a lot of different ideologies but is definitely left-leaning. rhetoric is an interesting result because Trump is usually associated with using his way of speaking to divide and attack different groups. In general, these results align with the idea that Biden is handling COVID-19 better than his counterpart. To note, vaccine distribution did speed up around December/January, so it is unknown how Trump’s handling of the vaccine would have played out if he had stayed in office, or whether these results are a product of the timing rather than the President. He drew a lot of criticism while in office, and once he left, it was reported that the Trump administration had no vaccine distribution rollout plan.

By Party




In a similar vein, I decided to search for unique words relating to each political party and the vaccine. For Democrats, unique hashtags were covidmillionaires and factsnotfear. For Republicans, a unique hashtag was trumpliedpeopledied.

covid covid vaccine covidmillionaires covidmillionaires see those you steal from be ashamed

covid vaccine jaywalk90075373 yes its all about climatechange covidmillionaires control if you all

covid vaccine just say no covid covidvax covidmillionaires

#CovidMillionaires is used as an attack against vaccines, and #FactsNotFear is used to push back against fear-mongering, on both sides of the political spectrum.

covid vaccine 2nd dose done i’m ready for concerts now covid vaccine factsnotfear

covid vaccine i’m not an antivaxxer i’m pro truth there’s a difference factsnotfear

covid covid vaccine avoiding safety protocols for your safety factsnotfear

covid vaccine day 71 post vaccine appointment done janssenvaccine unblinded igotthevaccine factsnotfear

These hashtags are not indicative of how each party handled the vaccine rollout, but of how each party feels about the other. People use Twitter to air complaints about the opposing party and do not reference their own when tweeting.

Conspiracy Theories

Lastly, looking purely at the semantic difference between vaccine and vax, vax is largely used in a conspiratorial context. Vax is associated with terms relating to genetic modification and operating systems. This is likely because of the popularized moniker AntiVax for AntiVaccine. There is a widespread conspiracy theory among anti-vaxxers that the vaccine will insert chips and alter your DNA, somehow spearheaded by Bill Gates. It is unclear whether this connection comes from anti-vaxxers referring to themselves as anti-vax while discussing the different theories, or from others referencing anti-vaxxers and their theories. It would be odd for someone to refer to themselves as anti-vax and then promote the theories. In actual tweets, vax is used in both a regular and a conspiratorial sense.

36 hours post moderna vax dose 2 my arm has a nonitch or pain rash its sore i hope day 2 is as good a few

covid vaccine who knew being fat would actually help me out one day got my 1st dose of pfizer vax

covid vaccine niagindependent dr wakefield warns this is not a vax it is irreversible genetic modification”

covid vaccine this is especially dangerous this mrnavaccine shot is not just a vax its geneticmodification

In these limited tweets, when vax appears alongside conspiracy theories, it is used by people who are promoting those theories. It is interesting how certain groups, even those created in the online environment, develop their own linguistic style and abbreviations for their subject matter.

Conclusion & Further Research

To answer the original question of how the different vaccines compare with regard to side effects, Twitter users indicate that Moderna is associated with more pain after the first shot. Additionally, the first shot is associated with more pain than the second shot, although this could be an effect of timing. The Johnson & Johnson vaccine seems to produce few side effects and little pain. Research by the FDA backs these findings up, giving credence to the Twittersphere commentary.

Next, tweets associated with Biden tend to trend more positive, but this could be the result of the general timing of the vaccine rollout. This would present an interesting base for future analysis into whether time is a primary factor in the sentiment around his actions. Trump was linked to his rhetoric, relating to his ability to take advantage of people’s frustration and promote political polarization about almost any topic.

Regarding political parties and vaccines, the terms were used to attack or question how the party in question handled the vaccine. More often than not, the opposing party was the one tweeting about the other party. This could be expanded into another project, looking at how followers of each party behave on Twitter in relation to the other party.

Lastly, “vax” is often used by anti-vaxxers when referring to vaccines, though it also appears in ordinary, non-conspiratorial tweets.


The dataset only contains a partial view of the tweets, not the full tweets. It would be difficult to scrape the full tweets at this scale with the free API without running into rate-limiting issues.


