Text Analytics - Sentiment Classification on Amazon Product Reviews

NLP | Supervised Learning | Customer Feedback Analysis


Tools & Techniques: Python, Pandas, scikit-learn, NLTK, TF-IDF Vectorizer, Logistic Regression, Confusion Matrix, Classification Report

Overview:
In e-commerce, customer reviews carry more weight than ads and are one of the strongest influences on purchase decisions. I wanted to understand how natural language processing could help businesses extract insight from that unstructured feedback, so I built a sentiment classification model on real Amazon product reviews, applying end-to-end text analytics and machine learning in Python to turn raw text into structured, decision-ready insight.

Approach:
- Explored a dataset of Amazon reviews labelled as positive or negative
- Preprocessed text (lowercasing, removing stop words and punctuation, stemming)
- Converted text to vectors using TF-IDF to capture important terms
- Trained a Logistic Regression model for sentiment classification
- Split data into train/test sets and evaluated on the held-out test set using accuracy, precision, recall, and F1 to check that performance was balanced across both classes
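
A minimal sketch of the steps above, not the project's actual code. The toy reviews and labels are invented for illustration, the preprocess helper is a hypothetical name, and scikit-learn's built-in English stop word list stands in for whichever stop word list the project used. Stemming uses NLTK's PorterStemmer, which is an assumption about the stemmer.

```python
import string

from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text):
    """Lowercase, strip punctuation, drop stop words, and stem each token."""
    tokens = text.lower().translate(PUNCT_TABLE).split()
    return " ".join(stemmer.stem(t) for t in tokens if t not in ENGLISH_STOP_WORDS)


# Toy data, invented for illustration (1 = positive, 0 = negative).
reviews = [
    "Great sound quality, works perfectly!",
    "Absolutely love this blender, highly recommend.",
    "Fast shipping and the charger works great.",
    "Excellent value, very happy with the purchase.",
    "Stopped working after two days, terrible.",
    "Cheap plastic, broke immediately. Waste of money.",
    "Very disappointed, the battery died quickly.",
    "Awful product, returned it the next day.",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

cleaned = [preprocess(r) for r in reviews]

# Hold out a stratified test set so both classes appear in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    cleaned, labels, test_size=0.25, stratify=labels, random_state=42
)

# TF-IDF vectorisation followed by logistic regression, fitted on train only.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred, zero_division=0))
```

Wrapping the vectorizer and classifier in one pipeline means the TF-IDF vocabulary is learned from the training split only, which avoids leaking test-set terms into the model.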

Outcome:

  • Achieved 88% accuracy in predicting sentiment on test data
  • Strong F1 scores across both classes
  • Demonstrated consistent performance in handling noisy, user-generated text
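
For readers less familiar with these metrics, here is a small worked example of how accuracy, precision, recall, and F1 are derived from a confusion matrix. The labels and predictions are invented for illustration and are not the project's results.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Invented labels and predictions (1 = positive, 0 = negative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0]

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all predictions that are correct
precision = tp / (tp + fp)                  # share of predicted positives that are right
recall = tp / (tp + fn)                     # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)

# The hand-computed values agree with scikit-learn's helpers.
assert abs(accuracy - accuracy_score(y_true, y_pred)) < 1e-12
assert abs(precision - precision_score(y_true, y_pred)) < 1e-12
assert abs(recall - recall_score(y_true, y_pred)) < 1e-12
assert abs(f1 - f1_score(y_true, y_pred)) < 1e-12

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Accuracy alone can hide a weak class, which is why per-class precision, recall, and F1 are reported alongside it.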

Business Application:

E-commerce platforms like Amazon can use this type of model to:

  • Automatically flag negative reviews for product teams
  • Power dashboards for category managers (e.g. % positive reviews per SKU)
  • Drive personalisation by mapping sentiment trends to user preferences
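
As a rough illustration of the first two uses, the sketch below aggregates model predictions into a per-SKU positive rate and pulls out negative reviews for follow-up. The DataFrame, its column names (sku, review, predicted_label), and the data are all hypothetical.

```python
import pandas as pd

# Hypothetical scored reviews: predicted_label is the model output
# (1 = positive, 0 = negative).
scored = pd.DataFrame(
    {
        "sku": ["A1", "A1", "A1", "B2", "B2", "C3"],
        "review": [
            "Works great",
            "Love it",
            "Broke in a week",
            "Too noisy",
            "Arrived damaged",
            "Excellent quality",
        ],
        "predicted_label": [1, 1, 0, 0, 0, 1],
    }
)

# Dashboard metric: percentage of positive reviews per SKU.
pct_positive = (
    scored.groupby("sku")["predicted_label"]
    .mean()
    .mul(100)
    .round(1)
    .rename("pct_positive")
)
print(pct_positive)

# Negative reviews to route to product teams.
flagged = scored.loc[scored["predicted_label"] == 0, ["sku", "review"]]
print(flagged)
```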