Movie Review Classification

In this project, we’ll learn to classify text using machine learning. We’ll classify reviews from the IMDB dataset into two classes, positive or negative, using techniques such as TF-IDF and bag of words.

Text classification with machine learning has advanced considerably in recent years and is now used in a variety of fields, including movie review analysis.

Classifying movie reviews by hand is tedious and time-consuming. Machine learning makes it possible to classify them automatically with high accuracy.

This project automatically classifies movie reviews as positive or negative. The dataset used is the IMDB movie review dataset, which contains 50,000 reviews.

The model in this project achieves an accuracy of 87%, which shows how well machine learning suits text classification tasks. A variety of algorithms can be used for this task, including support vector machines, decision trees, and naive Bayes.
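As a rough illustration of the algorithms named above, the sketch below trains a naive Bayes classifier, a linear support vector machine, and a decision tree on TF-IDF features with scikit-learn. The eight reviews and their labels are made up for the example and are not taken from the IMDB dataset, so the scores say nothing about the 87% figure. The choice of `LinearSVC` and `MultinomialNB` as the specific variants is an assumption.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Made-up toy reviews (assumption): 1 = positive, 0 = negative.
texts = [
    "a wonderful film with a brilliant cast",
    "great story and great acting",
    "I loved every minute of it",
    "an enjoyable and moving picture",
    "a boring film with a terrible script",
    "awful acting and a dull story",
    "I hated every minute of it",
    "a tedious and forgettable picture",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out a quarter of the data, keeping both classes in each split.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

models = {
    "naive Bayes": MultinomialNB(),
    "linear SVM": LinearSVC(),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, classifier in models.items():
    # Each pipeline vectorizes the raw text, then fits the classifier.
    pipeline = make_pipeline(TfidfVectorizer(), classifier)
    pipeline.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, pipeline.predict(X_test))
    print(f"{name}: accuracy = {scores[name]:.2f}")
```

With so little data the scores are not meaningful. The point is only that the three classifiers are interchangeable behind the same vectorizer.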

What will you learn in the Text Classification project using machine learning?

  • Cleaning and processing the text data
  • Performing feature extraction
  • Calculating TF-IDF and bag of words for vectorizing data
  • Building machine learning text classifier models
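To make the vectorization step above more concrete, here is a minimal sketch of bag of words and TF-IDF using scikit-learn's `CountVectorizer` and `TfidfVectorizer`. The two reviews are made up for illustration, and the default tokenizer settings are assumed, which drop one-letter words such as "a".

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Toy reviews, made up for illustration (not from the IMDB dataset).
reviews = [
    "a great movie with a great cast",
    "a dull movie with a weak plot",
]

# Bag of words: one column per vocabulary term, each cell a raw count.
bow = CountVectorizer()
bow_matrix = bow.fit_transform(reviews)

# TF-IDF: counts are reweighted so that terms appearing in every
# review (such as "movie") carry less weight than distinctive terms.
tfidf = TfidfVectorizer()
tfidf_matrix = tfidf.fit_transform(reviews)

vocabulary = sorted(bow.vocabulary_)
great_count = bow_matrix[0, bow.vocabulary_["great"]]
great_weight = tfidf_matrix[0, tfidf.vocabulary_["great"]]
movie_weight = tfidf_matrix[0, tfidf.vocabulary_["movie"]]

print(vocabulary)
print("count of 'great' in review 0:", great_count)
print("TF-IDF of 'great' vs 'movie' in review 0:", great_weight, movie_weight)
```

Both vectorizers produce a sparse matrix with one row per review, so either can feed the same classifier.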


Prerequisites

  1. Working knowledge of the scikit-learn library
  2. Theoretical understanding of different machine learning classification algorithms
  3. Understanding of the TF-IDF and bag of words approaches

Tools Used

  1. Seaborn and Matplotlib for plotting the graphs
  2. NLTK for text pre-processing
  3. scikit-learn (sklearn) for building machine learning models
  4. TextBlob for word-cloud

Tasks to be Performed

We will be performing the following tasks as part of this project:

Task 1: Load the data and analyze it

Task 2: Clean the data by removing HTML tags, noisy text, and special characters

Task 3: Normalize the text extracted from the data (for example, by stemming)

Task 4: Clean the data further by removing stopwords from the text

Task 5: Compute statistical features using the TF-IDF and bag of words techniques

Task 6: Build machine learning models for classifying the reviews

Task 7: Evaluate the models on the test set and identify the best one
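The cleaning tasks above (removing HTML, special characters, and stopwords) can be sketched with the Python standard library alone. This is one possible approach under stated assumptions, not the project's actual code: the function names are hypothetical, and the short hard-coded `STOPWORDS` set stands in for NLTK's English stopword list so that the example runs without downloading any corpus.

```python
import html
import re

# Tiny stand-in stopword list (assumption); NLTK's English list is far larger.
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "of", "and", "to", "in"}


def strip_html(text):
    """Replace HTML tags such as <br /> with spaces and decode entities such as &amp;."""
    without_tags = re.sub(r"<[^>]+>", " ", text)
    return html.unescape(without_tags)


def remove_special_characters(text):
    """Replace everything except ASCII letters, digits, and whitespace with spaces."""
    return re.sub(r"[^A-Za-z0-9\s]", " ", text)


def remove_stopwords(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    tokens = text.lower().split()
    return " ".join(token for token in tokens if token not in STOPWORDS)


def clean_review(text):
    """Apply the three cleaning steps in order."""
    return remove_stopwords(remove_special_characters(strip_html(text)))


raw = "This movie was <br /><b>great</b> &amp; the cast is superb!!"
cleaned = clean_review(raw)
print(cleaned)  # movie great cast superb
```

The cleaned strings can then be passed to a vectorizer for the feature extraction task.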


Skills you will develop

Processing the text data

Feature extraction through TF-IDF and bag of words

Building machine learning models for text classification

Plotting a word cloud to represent the text data
