HuggingFace Transformer Pipeline: Sentiment Analysis

In this notebook, we will use HuggingFace's pre-trained Transformer sentiment-analysis pipeline to predict the sentiment of the tweets.

We will compare this pre-trained local model to the baseline model from 1_baseline.ipynb.

Load project modules and data

We will use standard Python packages, along with the HuggingFace transformers package, to predict text sentiment.
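For each input text, the HuggingFace sentiment-analysis pipeline returns a dict of the form `{'label': 'POSITIVE', 'score': 0.99}`. Below is a minimal sketch (pure Python, no model download) of mapping such outputs to the binary labels used by the baseline; the helper name `to_binary_labels` is our own, not part of transformers.

```python
# Sketch: convert HuggingFace sentiment-analysis pipeline outputs
# (dicts with 'label' and 'score' keys) into binary 0/1 predictions.
# The helper name `to_binary_labels` is ours, not part of transformers.

def to_binary_labels(pipeline_outputs, positive=1, negative=0):
    """Map pipeline output dicts to binary labels."""
    mapping = {"POSITIVE": positive, "NEGATIVE": negative}
    return [mapping[out["label"]] for out in pipeline_outputs]

# Example outputs in the shape the pipeline produces:
sample_outputs = [
    {"label": "NEGATIVE", "score": 0.98},
    {"label": "POSITIVE", "score": 0.87},
]
print(to_binary_labels(sample_outputs))  # → [0, 1]
```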

Classification Model

Now we can measure the performance of the model defined in custom_huggingface_sentiment_analysis_classifier.py, using the same metrics as the baseline model from 1_baseline.ipynb.
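Since the metric code from 1_baseline.ipynb is not reproduced here, the exact implementation is an assumption; below is a minimal pure-Python sketch of the standard binary-classification metrics (accuracy, precision, recall, F1) the comparison presumably relies on.

```python
# Sketch: standard binary classification metrics for sentiment labels
# (1 = POSITIVE, 0 = NEGATIVE). Assumes the baseline notebook used
# these conventional definitions.

def binary_metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

m = binary_metrics([1, 0, 1, 1], [1, 0, 0, 1])
print(m["accuracy"])  # → 0.75
print(m["f1"])        # → 0.8
```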

HuggingFace's Transformer Pipeline: Sentiment Analysis model

This model uses HuggingFace's pre-trained Transformer sentiment-analysis pipeline to predict the sentiment of the tweets.

The performance on the dataset is noticeably better than our baseline model's:

Like our baseline model, this model is biased towards the NEGATIVE class, and slightly more so: it predicted 43% more NEGATIVE messages (1176) than POSITIVE (824), versus 35% for the baseline (a relative increase of about 23%).
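The bias percentages can be checked from the raw prediction counts. The baseline's 35% figure is taken from the text as given; its raw counts are not reproduced here.

```python
# Check the class-bias figures quoted above from the raw counts.
negative, positive = 1176, 824  # this model's predicted class counts

# Excess of NEGATIVE over POSITIVE predictions, as a rounded percentage.
model_bias_pct = round((negative / positive - 1) * 100)
print(model_bias_pct)  # → 43

# Relative increase over the baseline's quoted 35% bias.
baseline_bias_pct = 35  # from 1_baseline.ipynb (quoted in the text)
relative_increase_pct = round(model_bias_pct / baseline_bias_pct * 100 - 100)
print(relative_increase_pct)  # → 23
```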

Let's observe some classification errors.
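A minimal sketch of pulling out misclassified tweets for inspection; the names `texts`, `y_true`, and `y_pred` stand in for the notebook's real variables, which are not shown here.

```python
# Sketch: collect false positives (predicted POSITIVE, truly NEGATIVE)
# and false negatives (predicted NEGATIVE, truly POSITIVE) for review.
# `texts`, `y_true`, `y_pred` are placeholders for the notebook's data.

def classification_errors(texts, y_true, y_pred):
    false_positives = [x for x, t, p in zip(texts, y_true, y_pred) if t == 0 and p == 1]
    false_negatives = [x for x, t, p in zip(texts, y_true, y_pred) if t == 1 and p == 0]
    return false_positives, false_negatives

texts = ["great day!", "so tired of this", "meh, work again"]
fp, fn = classification_errors(texts, [1, 0, 1], [1, 1, 0])
print(fp)  # → ['so tired of this']
print(fn)  # → ['meh, work again']
```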

On this false-positive example, the model fails to detect the NEGATIVE sentiment of the message: it predicts a strongly POSITIVE sentiment even though the text is labelled NEGATIVE. The true sentiment is actually not obvious even for a human (the person is not getting operated on, is releasing an album, and is going to have lunch with their dad for Father's Day)...

On this false-negative example, the model fails to detect the POSITIVE sentiment of the message. Again, the true sentiment is not obvious (the person is not happy to go to work).