Identify Topics from News Headlines

In this project, you'll learn how to perform topic modeling using the scikit-learn library. We will use a dataset that contains more than one million news headlines from a news agency. Your task is to study the data, use topic modeling techniques to find the relevant topics in the headlines, and compare the different techniques.

What will you Learn in the Project?

  1. Exploratory Data Analysis and Preprocessing for Topic Modelling
  2. Latent Semantic Analysis (LSA) for Topic Modelling
  3. t-SNE (t-distributed Stochastic Neighbor Embedding)
  4. Latent Dirichlet Allocation (LDA) for Topic Modelling
  5. Visualizing the topics as clusters

Tools & Technologies Used

  1. Pandas
  2. Scikit-learn
  3. Bokeh 
  4. Numpy 
  5. Matplotlib 

Tasks Performed

This project is divided into multiple tasks.

  1. Import the necessary libraries and load the dataset
  2. Perform Exploratory Data Analysis for data understanding
  3. Display the top words in the dataset by frequency
  4. Preprocess the dataset before topic modeling
  5. Get the top topics using LSA (topic modeling)
  6. Display the embedding of the LSA topics using t-SNE
  7. Get the top topics using LDA (topic modeling)
  8. Display the embedding of the LDA topics using t-SNE
  9. Compare the results and state which technique works best
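
The embedding steps in the task list can be sketched roughly as follows. This is only an illustration under stated assumptions: the document-topic matrix here is synthetic (drawn from a Dirichlet distribution) and stands in for the output a fitted LSA or LDA model would produce, and the sizes and `perplexity` value are arbitrary choices for the example.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
n_docs, n_topics = 60, 4  # assumed sizes for the illustration

# Synthetic document-topic weights standing in for the matrix returned by
# an LSA or LDA model's transform step (one row per headline).
doc_topic = rng.dirichlet(alpha=np.full(n_topics, 0.3), size=n_docs)

# Dominant topic of each document, useful for colouring points in a plot.
dominant_topic = doc_topic.argmax(axis=1)

# Project the topic space down to two dimensions.
# perplexity must be smaller than the number of documents.
tsne = TSNE(n_components=2, perplexity=15, init="random", random_state=0)
embedding = tsne.fit_transform(doc_topic)

print(embedding.shape)  # (60, 2)
```

The two columns of `embedding` can then be passed to a Matplotlib or Bokeh scatter plot, coloured by `dominant_topic`, to see whether documents with the same topic fall into visible clusters.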

Skills you will develop

Topic Modelling

Latent Semantic Analysis for Topic Modelling

t-SNE (t-distributed Stochastic Neighbor Embedding)

Latent Dirichlet Allocation for Topic Modelling

Visualizing the Topics as Clusters
