This project classifies movie reviews as positive or negative and can be applied to any movie review dataset. It provides hands-on practice with deep learning concepts such as RNNs and LSTMs by building a sentiment classifier for review text with TensorFlow Keras.
We’ll use a Long Short-Term Memory (LSTM) network, which is a type of recurrent neural network (RNN). LSTM networks are well suited to sequential data, and a review is exactly that: an ordered sequence of words. The dataset is the Rotten Tomatoes movie review dataset, which contains 5,331 positive and 5,331 negative reviews; we’ll use an 80-20 split for training and testing. We’ll build the network with Keras, a high-level API for building deep learning models. It is simple to use and runs on top of TensorFlow; older multi-backend releases of Keras could also run on Theano or Microsoft Cognitive Toolkit (CNTK).
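To make the 80-20 split concrete, here is a minimal sketch in plain Python. The function name `split_80_20`, the fixed seed, and the shuffle-then-slice approach are illustrative assumptions, not necessarily how the project notebook performs the split.

```python
import random

def split_80_20(texts, labels, test_fraction=0.2, seed=42):
    """Shuffle the examples and hold out `test_fraction` of them for testing."""
    indices = list(range(len(texts)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = int(round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # Index texts and labels with the same positions so pairs stay aligned.
    x_train = [texts[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_test = [texts[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, y_train, x_test, y_test
```

With the 10,662 reviews in the dataset (5,331 per class), this sketch would hold out 2,132 reviews for testing and keep 8,530 for training.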
Our model will take in a review (a sequence of words) and output a sentiment score. A sentiment score of 0.5 or greater will be classified as a positive review, and a sentiment score of less than 0.5 will be classified as a negative review.
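The decision rule above can be sketched in a few lines. The sigmoid step is an assumption about how the score is produced (a sigmoid output unit is a common choice for binary classifiers), and the function names are hypothetical.

```python
import math

def sigmoid(logit):
    """Squash a raw model output into a score between 0 and 1 (assumed output activation)."""
    return 1.0 / (1.0 + math.exp(-logit))

def label_from_score(score, threshold=0.5):
    """Scores at or above the threshold are positive; everything below is negative."""
    return "positive" if score >= threshold else "negative"
```

For example, `label_from_score(0.5)` falls on the positive side of the boundary, while `label_from_score(0.49)` is classified as negative.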
What Will You Learn in the Sentiment Review Classification Project?
- Read data from different text files and load it into a dataframe
- Convert text data into vector format for training deep learning models
- Build and train a sequence-based model
- Improve on the simple RNN-based model
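As an illustration of converting text into vector format, the sketch below maps each word to an integer index and pads every review to a fixed length. It is a simplified, plain-Python stand-in for the tokenization utilities a Keras workflow would typically use; the function names, the whitespace tokenization, and the reserved indices 0 (padding) and 1 (unknown word) are assumptions made for this example.

```python
from collections import Counter

PAD, OOV = 0, 1  # reserved indices: padding and out-of-vocabulary words

def build_vocab(texts, max_words=10000):
    """Assign an integer index to each word, most frequent words first."""
    counts = Counter(word for text in texts for word in text.lower().split())
    most_common = counts.most_common(max_words - 2)
    return {word: i + 2 for i, (word, _) in enumerate(most_common)}

def texts_to_padded_sequences(texts, vocab, max_len=20):
    """Turn each text into a list of word indices of length exactly `max_len`."""
    sequences = []
    for text in texts:
        ids = [vocab.get(word, OOV) for word in text.lower().split()]
        ids = ids[:max_len]                       # truncate long reviews
        sequences.append(ids + [PAD] * (max_len - len(ids)))  # pad short ones
    return sequences
```

The vocabulary should be built from the training reviews only, so that words seen only in the test set map to the unknown-word index rather than leaking into the model's inputs.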
Prerequisites
- Working knowledge of the Keras library
- Theoretical understanding of sequence-based models such as RNN, LSTM, and GRU
Tools Used
- Google Colab (Jupyter notebook) for model building
- NLTK library
- Keras library for implementing sequence models
Tasks Performed
We will be performing the following tasks as part of this project:
Task-1: Import the required libraries and load the dataset into a dataframe.
Task-2: Convert text into numerical form for model building.
Task-3: Build the base RNN model for training.
Task-4: Train your model using the train and test splits of the data.
Task-5: Build an LSTM model and evaluate it on the test set.
Task-6: Build a Bi-directional LSTM model and evaluate on the test set.
Task-7: Build a GRU model and evaluate it on the test set.
Task-8: Compare the performance of the above models on the test set and identify the best one.
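The four model variants in Tasks 3 to 7 differ mainly in their recurrent layer, which the following sketch tries to make visible. It assumes TensorFlow's bundled Keras is installed; the vocabulary size, sequence length, embedding size, unit count, and the `build_model` helper are all illustrative choices, not values taken from the project.

```python
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 10000  # assumed vocabulary size
MAX_LEN = 40        # assumed padded review length, in tokens
EMBED_DIM = 32      # assumed embedding size
UNITS = 32          # assumed number of recurrent units

def build_model(kind):
    """Build a binary sentiment classifier around the chosen recurrent layer."""
    recurrent_layers = {
        "rnn": lambda: layers.SimpleRNN(UNITS),
        "lstm": lambda: layers.LSTM(UNITS),
        "bilstm": lambda: layers.Bidirectional(layers.LSTM(UNITS)),
        "gru": lambda: layers.GRU(UNITS),
    }
    model = keras.Sequential([
        keras.Input(shape=(MAX_LEN,)),
        layers.Embedding(VOCAB_SIZE, EMBED_DIM),  # word index -> dense vector
        recurrent_layers[kind](),                 # the only part that varies
        layers.Dense(1, activation="sigmoid"),    # sentiment score in [0, 1]
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

models = {kind: build_model(kind) for kind in ("rnn", "lstm", "bilstm", "gru")}
```

Each model could then be trained with `model.fit` on the training split and scored with `model.evaluate` on the test split, which gives one accuracy figure per architecture for the comparison in Task-8.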