The addition of the Wav2Vec2 model in Hugging Face’s transformers library has been one of the more exciting developments in NLP in recent months. Until then, it wasn’t easy to execute tasks like machine translation or sentiment analysis if you only had a long audio clip to work with.
But now you can link up an interesting combination of NLP tasks in one go: transcribe the audio clip with Wav2Vec2, and then use a variety of transformer models to summarize or translate the transcript. …
*UPDATED Dec 30, 2020*:
Facebook recently released its machine translation models for English to Tamil (and vice versa), and I was eager to give them a try since Tamil is among the most under-served languages in machine learning, and related language pairs are pretty hard to come by.
The new notebooks and toy datasets are in the repo. Or, go here for the demo for English-to-Tamil translation of speeches and news articles, and here for Tamil-to-English translation of the same type of material.
There are obvious problems with the quality of the translation in some parts. But machine…
On the surface, the 2020 US Presidential election seems like a wild roller-coaster ride, with each surprising twist and turn of events inducing both panic and dread among voters and observers alike.
In comparison, the polls and forecasts in the run-up to the vote on November 3 have kept an almost Zen-like calm. Two months-long forecasts, by FiveThirtyEight and weekly magazine The Economist, point to an unambiguous win for challenger Joe Biden despite widespread fears of a contested election.
Meanwhile, White House incumbent Donald Trump has seen his chances of re-election decline steadily in the forecasts despite talk of…
With about two weeks to go before the 2020 US Presidential Election, the statistics on multiple fronts are looking rather grim for White House incumbent Donald Trump.
Daily forecasts from data analysis outfit FiveThirtyEight and weekly magazine The Economist point to a resounding defeat for him on November 3. Trump even appears to be underperforming on Twitter upon closer examination of the metrics.
Is it game over for Trump? As FiveThirtyEight’s editor-in-chief Nate Silver has pointed out on numerous occasions, having a low chance is not the same as having no chance of winning. …
AI can’t write great poetry on its own (yet?). But it can now transcribe a poetry recital really well, if results from the Wav2Vec2 transformer model are anything to go by.
My trials using audio clips ranging in length from 62s to 12.5 minutes, including the evocative Inaugural Poem by youth poet Amanda Gorman, turned up pretty impressive results.
Efficient audio-to-text transcription has been one of the “missing links” in the modern Natural Language Processing (NLP) toolkit. Not anymore, it seems, thanks to Hugging Face’s implementation of the Wav2Vec2 model by Facebook.
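For a rough idea of what transcription with Wav2Vec2 can look like, here is a minimal sketch rather than the exact code from the post. It assumes the `facebook/wav2vec2-base-960h` checkpoint, `librosa` for loading audio, and a 30-second segment length; all three are my choices for illustration. The helper `segment_bounds` and the function `transcribe` are hypothetical names. Long clips are cut into fixed-length segments so that each pass through the model stays small.

```python
from typing import List, Tuple

SAMPLE_RATE = 16_000  # assumption: the chosen checkpoint expects 16 kHz audio


def segment_bounds(n_samples: int, seconds: float = 30.0,
                   sample_rate: int = SAMPLE_RATE) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices covering the clip in fixed-length pieces."""
    step = int(seconds * sample_rate)
    if step <= 0:
        raise ValueError("seconds * sample_rate must be positive")
    return [(start, min(start + step, n_samples))
            for start in range(0, n_samples, step)]


def transcribe(path: str, seconds: float = 30.0) -> str:
    """Transcribe an audio file segment by segment (downloads the model on first use)."""
    # Heavy imports stay local so segment_bounds works without them installed.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    name = "facebook/wav2vec2-base-960h"  # assumed checkpoint, not confirmed by the post
    processor = Wav2Vec2Processor.from_pretrained(name)
    model = Wav2Vec2ForCTC.from_pretrained(name)

    speech, _ = librosa.load(path, sr=SAMPLE_RATE)  # load as mono, resampled to 16 kHz
    pieces = []
    for start, end in segment_bounds(len(speech), seconds):
        inputs = processor(speech[start:end], sampling_rate=SAMPLE_RATE,
                           return_tensors="pt")
        with torch.no_grad():
            logits = model(inputs.input_values).logits
        ids = torch.argmax(logits, dim=-1)  # greedy CTC decoding
        pieces.append(processor.batch_decode(ids)[0])
    return " ".join(pieces)
```

Calling `transcribe("poem.wav")` (a hypothetical file name) would return one string of upper-case text without punctuation. Cutting at fixed points can split a word in two, and a very short final segment may need to be merged into the previous one, so treat the segment length as something to tune.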
Note to readers: The forecasts in this post were completed just as news broke that Donald Trump had tested positive for Covid-19. The impact of this major development won’t be clear for a while, and I’ll update the forecasts as things become clearer.
With about a month to go before the 2020 United States Presidential Election on November 3, all eyes are on the barrage of polls and forecasts for the highly volatile race for the White House. …
With the 2020 US election around the corner, concerns about electoral interference by state actors via social media and other online means are back in the spotlight in a big way.
Twitter was a major platform that Russia used to interfere with the 2016 US election, and few have doubts that Moscow, Beijing and others will turn to the platform yet again with new disinformation campaigns.
This post will give a broad overview of how you can build a state troll tweets detector by fine-tuning a transformer model (DistilBERT) with a custom dataset. …
Auto-text generation is undoubtedly one of the most exciting fields in NLP in recent years. But it’s also an area that’s relatively difficult for newcomers to navigate, due to the high bar for technical knowledge and resource requirements.
While there’s no shortage of helpful notebooks and tutorials out there, pulling the various threads together can be time-consuming. To help speed up the learning process for fellow newcomers, I’ve put together an end-to-end project to create a simple AI conversational chatbot that you can run in an interactive app.
I chose to frame the text generation project around a…
Summarising a speech is more art than science, some might argue. But recent advances in NLP could well test the validity of that argument.
In particular, Hugging Face’s (HF) transformers summarisation pipeline has made the task easier, faster and more efficient to execute. Admittedly, there’s still a hit-and-miss quality to current results. But there are also flashes of brilliance that hint at the possibilities to come as language models become more sophisticated.
This post will demonstrate how you can easily use HF’s pipeline to summarise both short and long speeches. A minor work-around is needed for long speeches due to…
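The excerpt above is cut off before it names the work-around, so the following is only one plausible approach, not necessarily the one used in the post: split a long speech into smaller chunks, summarise each chunk, and join the results. The sketch assumes the default model behind `pipeline("summarization")` and a 400-word chunk size; `chunk_text` and `summarise` are hypothetical names, and the sentence splitting is deliberately naive.

```python
from typing import List


def chunk_text(text: str, max_words: int = 400) -> List[str]:
    """Group sentences into chunks of at most max_words words (naive split on '. ')."""
    sentences = [s.strip().rstrip(".")
                 for s in text.replace("\n", " ").split(". ") if s.strip()]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(". ".join(current) + ".")
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(". ".join(current) + ".")
    return chunks


def summarise(text: str, max_words: int = 400) -> str:
    """Summarise each chunk separately and join the partial summaries."""
    # Imported here so chunk_text can be used without transformers installed.
    from transformers import pipeline

    summariser = pipeline("summarization")  # assumption: default model is acceptable
    parts = [summariser(chunk, truncation=True)[0]["summary_text"]
             for chunk in chunk_text(text, max_words)]
    return " ".join(parts)
```

A single sentence longer than `max_words` still ends up as its own oversized chunk, which is why `truncation=True` is passed to the pipeline as a safety net. Summarising chunks independently can also lose context that spans chunk boundaries.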
Data Science | Media | Politics