TF-IDF baseline and fine-tuned BERT model trained on 50,000 customer reviews. The BERT model reaches 94.1% accuracy across 6 sentiment categories. Production-ready pipeline with model serialisation and batch inference.
Customer reviews from an e-commerce platform (50,000 reviews, 6 sentiment categories: Very Positive, Positive, Neutral, Negative, Very Negative, and Irrelevant). Goal: classify new reviews automatically to route them to the appropriate customer service team without manual reading.
TF-IDF vectorisation with bigrams, L2-regularised logistic regression. 87.3% accuracy. 0.86 macro F1-score. Training time: 8 seconds. Suitable for deployments with strict low-latency requirements.
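A minimal sketch of what such a baseline could look like, assuming scikit-learn. The toy reviews, the `build_baseline` helper, and the hyperparameter values (`C=1.0`, `max_iter=1000`) are illustrative assumptions, not the project's actual data or configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in data (hypothetical); the real project uses 50,000 labelled reviews.
train_texts = [
    "absolutely love it, works perfectly",
    "good value, does the job",
    "it is okay, nothing special",
    "not good, stopped working after a week",
    "terrible, worst purchase ever",
    "what time does the store open",
]
train_labels = [
    "Very Positive",
    "Positive",
    "Neutral",
    "Negative",
    "Very Negative",
    "Irrelevant",
]


def build_baseline():
    """TF-IDF over unigrams and bigrams feeding a logistic regression classifier."""
    return make_pipeline(
        # ngram_range=(1, 2) adds bigrams alongside single words.
        TfidfVectorizer(ngram_range=(1, 2)),
        # LogisticRegression uses L2 regularisation by default;
        # C is the inverse regularisation strength (assumed value).
        LogisticRegression(C=1.0, max_iter=1000),
    )


baseline = build_baseline()
baseline.fit(train_texts, train_labels)

# Batch inference: predict() accepts any list of raw review strings.
predictions = baseline.predict(["love it", "worst ever"])
print(list(predictions))
```

Keeping the vectoriser and classifier in one pipeline object means the fitted vocabulary and the model weights are serialised and loaded together, which avoids a mismatch between the two at inference time.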
bert-base-uncased fine-tuned for 3 epochs on the training set. 94.1% accuracy. 0.93 macro F1-score. Training time: 42 minutes on GPU. Selected for production due to superior performance on edge cases and context-dependent sentiment.
| Category | Precision | Recall | F1 |
|---|---|---|---|
| Very Positive | 0.96 | 0.97 | 0.97 |
| Positive | 0.94 | 0.93 | 0.94 |
| Neutral | 0.89 | 0.88 | 0.89 |
| Negative | 0.93 | 0.94 | 0.94 |
| Very Negative | 0.95 | 0.95 | 0.95 |
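For reference, per-category precision, recall, and F1, and the macro F1 quoted above, follow the standard definitions. The sketch below computes them in plain Python; the function names and the two-class toy labels are hypothetical and only illustrate the arithmetic.

```python
def per_class_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} from true and predicted labels."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 values; every class counts equally."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Toy example with two classes (hypothetical labels).
y_true = ["pos", "pos", "neg", "neg"]
y_pred = ["pos", "neg", "neg", "neg"]
scores = per_class_scores(y_true, y_pred, labels=["pos", "neg"])
print(scores)
print(round(macro_f1(scores), 4))
```

Because macro F1 weights every category equally, a weaker category such as Neutral pulls the average down regardless of how many reviews fall into it.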