Natural Language Processing | 2024

Multi-Class NLP Text Classifier: Sentiment and Topic Analysis

A TF-IDF baseline and a fine-tuned BERT model, both trained on 50,000 customer reviews. The BERT model reaches 94.1% accuracy across 6 sentiment categories. Production-ready pipeline with model serialisation and batch inference.

Project Summary

The dataset consists of 50,000 customer reviews from an e-commerce platform, each labelled with one of 6 sentiment categories: Very Positive, Positive, Neutral, Negative, Very Negative, and Irrelevant. The goal is to classify new reviews automatically so that they can be routed to the appropriate customer service team without manual reading.
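The routing step can be sketched as a small batching helper that groups classified reviews by destination team. This is an illustration only: the team names in TEAM_FOR_LABEL, the batch size, and the predict callable (any function that maps a list of texts to a list of category names) are assumptions, not details from the project.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterator, List, Sequence

# Hypothetical label-to-team mapping; the project does not name its teams.
TEAM_FOR_LABEL: Dict[str, str] = {
    "Very Positive": "customer-success",
    "Positive": "customer-success",
    "Neutral": "general-support",
    "Negative": "complaints",
    "Very Negative": "escalations",
    "Irrelevant": "triage",
}


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def route_reviews(
    reviews: Sequence[str],
    predict: Callable[[List[str]], List[str]],
    batch_size: int = 64,
) -> Dict[str, List[str]]:
    """Classify reviews batch by batch and group them by destination team."""
    queues: Dict[str, List[str]] = defaultdict(list)
    for batch in batched(reviews, batch_size):
        for text, label in zip(batch, predict(batch)):
            queues[TEAM_FOR_LABEL[label]].append(text)
    return dict(queues)
```

Passing the classifier in as a callable keeps the routing logic independent of which model produces the labels.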

Two-Model Approach

Model 1: TF-IDF + Logistic Regression (Baseline)

TF-IDF vectorisation with bigrams, followed by L2-regularised logistic regression. 87.3% accuracy. 0.86 macro F1-score. Training time: 8 seconds. Suitable for production deployments with low-latency requirements.
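A baseline of this kind could be assembled roughly as follows with scikit-learn, including a serialisation round trip via joblib. This is a minimal sketch under stated assumptions: the n-gram range of (1, 2), max_iter, the file name, and the toy reviews are illustrative and not taken from the project.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_baseline() -> Pipeline:
    """TF-IDF over unigrams and bigrams feeding a logistic regression."""
    return Pipeline([
        # ngram_range=(1, 2) is an assumption: unigrams plus bigrams.
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2), lowercase=True)),
        # LogisticRegression applies L2 regularisation by default.
        ("clf", LogisticRegression(max_iter=1000)),
    ])


# Toy data for illustration only; not from the project's dataset.
texts = [
    "great product, works perfectly",
    "absolutely love it",
    "it is okay, nothing special",
    "average quality for the price",
    "terrible, broke after one day",
    "very disappointed with this purchase",
]
labels = ["Positive", "Positive", "Neutral", "Neutral", "Negative", "Negative"]

model = build_baseline()
model.fit(texts, labels)

# Serialise the fitted pipeline, then reload it for inference.
path = os.path.join(tempfile.mkdtemp(), "baseline.joblib")
joblib.dump(model, path)
restored = joblib.load(path)

# Batch inference: predict() accepts a list of raw strings.
predictions = restored.predict(["love the quality", "broke after a week"])
```

Saving the whole pipeline, rather than the classifier alone, keeps the fitted vocabulary and the model weights in one file so that inference sees the same features as training.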

Model 2: Fine-Tuned BERT (Selected)

bert-base-uncased fine-tuned for 3 epochs on the training set. 94.1% accuracy. 0.93 macro F1-score. Training time: 42 minutes on GPU. Selected for production due to superior performance on edge cases and context-dependent sentiment.
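A fine-tuning run along these lines might look like the sketch below, using the Hugging Face Trainer API. Only the checkpoint name (bert-base-uncased), the 3 epochs, and the six category names come from the project description. The batch size, learning rate, maximum sequence length, output directory, and the EncodedReviews helper are assumptions made for illustration.

```python
from typing import Dict, List

LABELS: List[str] = [
    "Very Positive", "Positive", "Neutral",
    "Negative", "Very Negative", "Irrelevant",
]
LABEL2ID: Dict[str, int] = {name: i for i, name in enumerate(LABELS)}
ID2LABEL: Dict[int, str] = {i: name for name, i in LABEL2ID.items()}


class EncodedReviews:
    """Map-style dataset (hypothetical helper) of tokenised reviews and labels."""

    def __init__(self, encodings: Dict[str, List[List[int]]], labels: List[int]):
        self.encodings = encodings
        self.labels = labels

    def __len__(self) -> int:
        return len(self.labels)

    def __getitem__(self, idx: int) -> Dict[str, object]:
        item = {key: values[idx] for key, values in self.encodings.items()}
        item["label"] = self.labels[idx]
        return item


def fine_tune(train_texts: List[str], train_labels: List[str],
              output_dir: str = "bert-reviews"):
    """Fine-tune bert-base-uncased for 3 epochs and save the result."""
    # Imported here so the helpers above work without the heavy dependencies.
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    # Pad to a fixed length so the default collator can stack plain lists.
    encodings = tokenizer(train_texts, truncation=True,
                          padding="max_length", max_length=128)
    dataset = EncodedReviews(dict(encodings),
                             [LABEL2ID[name] for name in train_labels])

    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased",
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,              # stated in the project description
        per_device_train_batch_size=16,  # assumed
        learning_rate=2e-5,              # assumed
    )
    trainer = Trainer(model=model, args=args, train_dataset=dataset)
    trainer.train()

    # Serialise model and tokenizer together for later inference.
    trainer.save_model(output_dir)
    tokenizer.save_pretrained(output_dir)
    return trainer
```

Storing id2label and label2id in the model config means the saved checkpoint returns category names rather than bare indices at inference time.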

Results by Category

Category        Precision   Recall   F1
Very Positive   0.96        0.97     0.97
Positive        0.94        0.93     0.94
Neutral         0.89        0.88     0.89
Negative        0.93        0.94     0.94
Very Negative   0.95        0.95     0.95
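For reference, per-category precision, recall, and F1, along with the macro F1-score quoted for each model, can be computed from true and predicted labels as in this standard-library sketch. The function names and the toy labels are illustrative, and treating an undefined ratio as 0.0 is a convention chosen here, not something the project specifies.

```python
from typing import Dict, Sequence, Tuple

Metrics = Dict[str, Tuple[float, float, float]]


def per_class_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Metrics:
    """Return {label: (precision, recall, f1)} for every label seen."""
    results: Metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Undefined ratios (zero denominator) are reported as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results[label] = (precision, recall, f1)
    return results


def macro_f1(metrics: Metrics) -> float:
    """Unweighted mean of per-class F1, so every category counts equally."""
    return sum(f1 for _, _, f1 in metrics.values()) / len(metrics)


# Toy labels for illustration only.
y_true = ["Positive", "Positive", "Negative", "Negative", "Neutral"]
y_pred = ["Positive", "Negative", "Negative", "Negative", "Neutral"]
metrics = per_class_metrics(y_true, y_pred)
score = macro_f1(metrics)
```

Because macro F1 weights each category equally, a weaker category such as Neutral pulls the score down even when it accounts for a small share of the reviews.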
BERT, HuggingFace Transformers, TF-IDF, spaCy, PyTorch, scikit-learn