
The Covid-19 pandemic has confounded medical experts and policy makers alike since it began in 2020. More than a year on, they are grappling with a new mystery involving “breakthrough cases”: people who caught Covid-19 despite being vaccinated.
No vaccine is 100% effective, so it’s no surprise that some vaccinated individuals still test positive. But it remains unclear how these breakthrough infections occur, whether demographic or environmental factors play a bigger role, and how soon a booster shot is needed in the face of new and more contagious Covid-19 variants.

The addition of the Wav2Vec2 model in Hugging Face’s transformers library has been one of the more exciting developments in NLP in recent months. Until then, it wasn’t easy to execute tasks like machine translation or sentiment analysis if you only had a long audio clip to work with.
But now you can link up an interesting combination of NLP tasks in one go: transcribe the audio clip with Wav2Vec2, and then use a variety of transformer models to summarize or translate the transcript. …
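
A minimal sketch of that two-step chain, assuming the `pipeline` helper from the transformers library. The checkpoint name `facebook/wav2vec2-base-960h`, the lower-casing step, and the function names `build_default_pipelines` and `transcribe_and_summarize` are illustrative assumptions, not details taken from the post:

```python
def build_default_pipelines():
    """Load Hugging Face pipelines (models are downloaded on first use)."""
    # Imported lazily so the helper below can be used with any compatible callables.
    from transformers import pipeline

    # Assumed checkpoint; the post does not say which Wav2Vec2 weights were used.
    asr = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")
    # No model given, so the library's default summarization model is used.
    summarizer = pipeline("summarization")
    return asr, summarizer


def transcribe_and_summarize(audio_path, asr, summarizer):
    """Transcribe an audio file, then summarize the transcript."""
    # The speech recognition pipeline returns a dict with a "text" key.
    transcript = asr(audio_path)["text"]
    # The assumed checkpoint emits upper-case text without punctuation;
    # lower-casing is a simple, assumed clean-up step before summarization.
    cleaned = transcript.lower().strip()
    # The summarization pipeline returns a list of dicts with a "summary_text" key.
    summary = summarizer(cleaned)[0]["summary_text"]
    return {"transcript": cleaned, "summary": summary}
```

Typical use would be `asr, summarizer = build_default_pipelines()` followed by `transcribe_and_summarize("clip.wav", asr, summarizer)`, where `clip.wav` is a hypothetical file. Long recordings will probably need to be split into shorter segments first, and a translation pipeline could be swapped in for the summarizer in the same way.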

AI can’t write great poetry on its own (yet?). But it can now transcribe a poetry recital really well, if results from the Wav2Vec2 transformer model are anything to go by.
My trials using audio clips ranging in length from 62s to 12.5 minutes, including the evocative Inaugural Poem by youth poet Amanda Gorman, turned up pretty impressive results.
Efficient audio-to-text transcription has been one of the “missing links” in the modern Natural Language Processing (NLP) toolkit. Not anymore, it seems, thanks to Hugging Face’s implementation of the Wav2Vec2 model by Facebook.

On the surface, the 2020 US Presidential election seems like a wild roller-coaster ride, with each surprising twist and turn of events inducing both panic and dread among voters and observers alike.
In comparison, the polls and forecasts in the run-up to the vote on November 3 have kept an almost Zen-like calm. Two long-running forecasts, by FiveThirtyEight and weekly magazine The Economist, point to an unambiguous win for challenger Joe Biden despite widespread fears of a contested election.
Meanwhile, White House incumbent Donald Trump has seen his chances of re-election decline steadily in the forecasts despite talk of…

With about two weeks to go before the 2020 US Presidential Election, the statistics on multiple fronts are looking rather grim for White House incumbent Donald Trump.
Daily forecasts from data analysis outfit FiveThirtyEight and weekly magazine The Economist point to a resounding defeat for him on November 3. Trump even appears to be underperforming on Twitter upon closer examination of the metrics.
Is it game over for Trump? As FiveThirtyEight’s editor-in-chief Nate Silver has pointed out on numerous occasions, having a low chance is not the same as having no chance of winning. …

Note to readers: The forecasts in this post were completed just as news broke that Donald Trump had tested positive for Covid-19. The impact of this major development won’t be clear for a while, and I’ll update the forecasts as things become clearer.
With about a month to go before the 2020 United States Presidential Election on November 3, all eyes are on the barrage of polls and forecasts for the highly volatile race for the White House. …

With the 2020 US election around the corner, concerns about electoral interference by state actors via social media and other online means are back in the spotlight in a big way.
Twitter was a major platform that Russia used to interfere with the 2016 US election, and few have doubts that Moscow, Beijing and others will turn to the platform yet again with new disinformation campaigns.
This post will give a broad overview of how you can build a state troll tweets detector by fine-tuning a transformer model (DistilBERT) with a custom dataset. …

*UPDATED Dec 30, 2020*:
Facebook recently released its machine translation models for English to Tamil (and vice versa), and I was eager to give them a try since Tamil is among the most under-served languages in machine learning, and related language pairs are pretty hard to come by.
The new notebooks and toy datasets are in the repo. Or, go here for the demo for English-to-Tamil translation of speeches and news articles, and here for Tamil-to-English translation of the same type of material.
There are obvious problems with the quality of the translation in some parts. But machine…

Auto-text generation is undoubtedly one of the most exciting fields in NLP in recent years. But it’s also an area that’s relatively difficult for newcomers to navigate, due to the high bar for technical knowledge and resource requirements.
While there’s no shortage of helpful notebooks and tutorials out there, pulling the various threads together can be time-consuming. To help speed up the learning process for fellow newcomers, I’ve put together a simple end-to-end project to create an AI conversational chatbot that you can run in an interactive app.
I chose to frame the text generation project around a…

Data Science | Media | Politics