Detecting Clickbaits (2/4) - Universal-Sentence-Encoder Transfer Learning
Problem.
Given a set of 32,000 headlines, each labeled as clickbait (label 1) or not (label 0),
build a model that detects clickbait headlines.
Solution.
Read data:
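A minimal sketch of this step, using only the standard library. The column names `headline` and `clickbait` are assumptions for illustration, as is any file name passed in; adjust them to match the actual dataset. `pandas.read_csv` would work equally well.

```python
import csv


def read_headlines(path, text_col="headline", label_col="clickbait"):
    """Return (headlines, labels) from a CSV file with a header row.

    The column names are hypothetical; change them to match the real file.
    """
    headlines, labels = [], []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            headlines.append(row[text_col].strip())
            labels.append(int(row[label_col]))  # 1 = clickbait, 0 = not
    return headlines, labels
```

Calling `read_headlines` on the dataset file returns two parallel lists: the raw headline strings and their integer labels.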
Split into train/validation/test sets:
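The split could look like the sketch below. The 80/10/10 proportions and the fixed seed are assumptions, since the post does not state the ratios. scikit-learn's `train_test_split` is a common alternative.

```python
import random


def train_val_test_split(texts, labels, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle once, then slice into train/validation/test partitions.

    The 80/10/10 default is an assumption, not the ratio used in the post.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    indices = list(range(len(texts)))
    random.Random(seed).shuffle(indices)  # deterministic shuffle

    n_test = int(len(indices) * test_frac)
    n_val = int(len(indices) * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]

    def pick(idx):
        # Keep each headline paired with its own label.
        return [texts[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(val_idx), pick(test_idx)
```

With 32,000 headlines, these default fractions would give 25,600 training, 3,200 validation, and 3,200 test examples.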
Load the pre-trained Universal Sentence Encoder network and its weights from
TensorFlow Hub, mark the weights as trainable (trainable=True),
and add a final output layer with a sigmoid activation, since this is a binary
classifier:
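A sketch of such a model, written against TensorFlow 2.x with `tf.keras` and `tensorflow_hub`. The module URL (the standard USE v4 variant), the optimizer, the loss, and the metric are assumptions; the post only specifies the trainable encoder and the sigmoid output. On TensorFlow releases that default to Keras 3, `hub.KerasLayer` may need the legacy `tf-keras` package instead.

```python
# Assumption: the standard USE v4 module; the post does not say which variant was used.
USE_URL = "https://tfhub.dev/google/universal-sentence-encoder/4"


def build_model(encoder_url=USE_URL):
    """Universal Sentence Encoder with trainable weights plus a sigmoid output."""
    # Imported here so the heavy dependencies load only when a model is built.
    import tensorflow as tf
    import tensorflow_hub as hub

    model = tf.keras.Sequential([
        # Raw strings in, one 512-dimensional embedding per headline out.
        hub.KerasLayer(encoder_url, input_shape=[], dtype=tf.string, trainable=True),
        # Single sigmoid unit: estimated probability that the headline is clickbait.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    # Optimizer, loss, and metric are assumptions; the post does not state them.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Because the encoder takes raw strings as input, no separate tokenization or padding step is needed before the headlines reach the model.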
Train for 2 epochs:
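Training can be sketched as a thin wrapper around Keras `fit`, assuming `model` is an already compiled Keras model and the inputs are arrays or tensors of raw headline strings with 0/1 labels. The batch size of 32 is an assumption; the post only specifies two epochs.

```python
def train(model, x_train, y_train, x_val, y_val, epochs=2, batch_size=32):
    """Fit a compiled Keras model and return whatever `fit` returns (its History).

    batch_size=32 is an assumption; the post only specifies two epochs.
    """
    return model.fit(
        x_train, y_train,
        validation_data=(x_val, y_val),  # evaluated at the end of each epoch
        epochs=epochs,
        batch_size=batch_size,
    )
```

Passing the validation set here makes it easy to see whether the second epoch of fine-tuning still improves on held-out data.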
Then we can measure the precision and recall on our test set:
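To make the metric concrete, here is a dependency-free sketch of macro-averaged precision and recall. It assumes `y_prob` is a flat sequence of sigmoid outputs (the 2-D output of `model.predict` would need flattening first) and uses a 0.5 decision threshold, which is also an assumption. scikit-learn's `precision_score` and `recall_score` with `average="macro"` compute the same quantities.

```python
def macro_precision_recall(y_true, y_prob, threshold=0.5):
    """Macro-averaged precision and recall for binary labels.

    y_prob holds sigmoid outputs; values >= threshold are predicted as class 1.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    precisions, recalls = [], []
    for cls in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        predicted = sum(1 for p in y_pred if p == cls)
        actual = sum(1 for t in y_true if t == cls)
        # A class that was never predicted (or never occurs) scores 0.0.
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    # Macro average: unweighted mean over the two classes.
    return sum(precisions) / 2, sum(recalls) / 2
```

Macro averaging weights both classes equally, so the score cannot be inflated by doing well on only the more frequent class.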
Important Points.
Training time: 45 min on Google Colab (TPUs)
Macro precision on test set: 0.9842
Inference time per record: ~2 ms on my laptop (MacBook Pro: 2.3 GHz 8-Core Intel Core i9, 32 GB 2667 MHz DDR4)
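A per-record inference time like the one above can be estimated with a small timing helper. This is only a sketch of one way to measure it, not necessarily how the figure in this post was obtained. `predict_fn` is a hypothetical callable, for example a trained model's `predict` method.

```python
import time


def ms_per_record(predict_fn, records, warmup=1):
    """Average wall-clock milliseconds per record for one call over the batch."""
    for _ in range(warmup):
        predict_fn(records)  # warm-up runs are excluded from the timing
    start = time.perf_counter()
    predict_fn(records)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(records)
```

The warm-up call matters for TensorFlow models, where the first prediction typically includes one-off graph-building overhead.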
Note.
The complete code for this post can be found on GitHub.
It’s recommended to run this notebook in Google Colab.
For other solutions to this problem, please refer to the next posts in this series.