Artificial Intelligence Project Based Learning

Handwriting Recognition and Translation

This project helps you solidify your deep learning skills in computer vision and machine translation. You will develop deep learning models for an OCR (optical character recognition) application that detects text in images and then translates that text into Hindi, working with libraries such as TensorFlow (Keras) and Hugging Face. At the end, you will develop a Flask application to showcase the best model and make predictions on new data.

On completing this project, you will have a working OCR application that you can show to potential employers.

13.5 Hours

Intermediate

Tools Covered

Project Structure

The complete capstone project is divided into 4 projects, each with multiple tasks. For each task, there are theory videos that explain the concepts and a solution video, along with the dataset and code.

01

Data Exploration and Building Basic Models for Text Recognition

Understand the Structure of Data and Build Basic OCR Models Using CRNN Architecture

02

Improving Text Recognition Capabilities Using Transformer

Build OCR Models to Improve Text Detection and Recognition Using State-of-the-art Transformer Models

03

Building Machine Translation Models for English-Hindi

Translate Extracted Text (which we got from the OCR) from English to Hindi

04

Integrate the Deep Learning Models in the Web Application

Deploy Handwriting Recognition & Translation with the Best Models and Make Predictions on an Unseen Image


FAQs

Project-based learning enables you to learn job-ready tech skills by building real software projects. These projects cover multiple concepts end to end, helping you gain expertise from a hands-on perspective as well as a theoretical one.

In this project you will learn:

1. How to build an offline OCR system.

2. How to build a single-line text detection OCR for English handwritten text.

3. How to use the IAM Handwriting dataset to build OCR models.

4. How to build a machine translation system for translating English text to Hindi.

5. How to use algorithms such as LSTM and bi-directional LSTM, and state-of-the-art models such as IndicTrans.

6. How to use the attention mechanism for machine translation in encoder-decoder models.

7. The use and architecture of models such as CRNN and Encoder-Decoder, and Transformer-based models such as TrOCR for OCR.

8. How to handle deployment with Node.js and a Flask API to host the model.

9. How to use tools such as TensorFlow/Keras and the Hugging Face (Transformers) API for building models.

10. How to use techniques such as transfer learning to fine-tune the deep learning models.
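As a rough illustration of the attention mechanism mentioned in the list above, here is a minimal sketch in plain Python. It is not the course's code: the function names and the toy vectors are made up for illustration, and it uses dot-product scoring, which is only one of the scoring variants associated with Luong-style attention. In the project itself this would be done with TensorFlow/Keras tensors rather than Python lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def luong_dot_attention(decoder_state, encoder_states):
    """Attention for one decoder step (hypothetical helper, dot-product scoring).

    Returns the attention weights over the encoder states and the
    context vector, which is the weighted sum of the encoder states.
    """
    # Score each encoder state by its dot product with the decoder state.
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three encoder states (one per source token), one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = luong_dot_attention(decoder_state, encoder_states)
print(weights, context)
```

The first and third encoder states align equally well with the decoder state, so they receive equal weight, and the second receives the least. The decoder would then combine the context vector with its own state to predict the next Hindi token.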

Traditional learning focuses more on theory, whereas project-based learning focuses more on hands-on work. Project-based learning provides an experience closer to working on a real project.

You will get all the supporting theory videos for each task, so understanding the theory should not be an issue.

It helps you build your portfolio while giving you the hands-on exposure needed to work on a project in a real environment. You can add the projects to your portfolio. Interview questions are often drawn from the projects you have done, so if you have completed the project in full, clearing the interview should be comparatively easier.

Explore and Understand the IAM (words) Dataset

Explore and Understand the IAM (lines) Dataset

Explore and Understand the English-Hindi Parallel Corpus for Translation

Create a Base Offline OCR Model Using CRNN for IAM (words)

Create a Base Offline OCR Model Using CRNN for IAM (lines)

Create an HTR (Handwriting Text Recognition) Model Using Beam Search (SimpleHTR)

Make Inference with the TrOCR (Transformer) Model on the Test Images Using the Hugging Face Transformers Library

Fine-Tune the TrOCR Model on the IAM Dataset for the Lines Set

Make Inference with the Fine-Tuned TrOCR IAM Model on the Test Images

Create a Base Encoder-Decoder Machine Translation Model with LSTM

Create a Machine Translation Model with Luong-Style Attention

Create a Bi-directional Encoder-Decoder Model with an Attention Mechanism

Develop a Transformer-based Machine Translation Model

Set Up the IndicTrans Model and Dependencies

Integrate the Flask API (model) in the Node.js Application

Test the Application with an Unseen Image
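
When a CRNN recognizer like the ones in the tasks above runs on an unseen image, its per-timestep predictions still have to be turned into text. The sketch below assumes the model was trained with a CTC loss, which is common for CRNN-based OCR but is not stated on this page. It shows greedy (best-path) decoding, the simplest alternative to beam search. The function name, the `helo` character set, and the toy probabilities are all hypothetical.

```python
BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_greedy_decode(probs, charset):
    """Best-path CTC decoding (hypothetical helper).

    probs   -- list of per-timestep probability rows, one value per class
    charset -- string where charset[i] is the character for class i + 1
    """
    decoded = []
    previous = BLANK
    for step in probs:
        # Pick the most likely class at this timestep.
        best = max(range(len(step)), key=step.__getitem__)
        # Emit a character only if it is not blank and not a repeat.
        if best != BLANK and best != previous:
            decoded.append(charset[best - 1])
        previous = best
    return "".join(decoded)


def peaked_row(index, size=5):
    """Build a toy probability row that peaks at the given class index."""
    row = [0.05] * size
    row[index] = 0.8
    return row


# Class path: h h _ e l l _ l o  (the blank separates the two l's)
path = [1, 1, 0, 2, 3, 3, 0, 3, 4]
probs = [peaked_row(i) for i in path]
text = ctc_greedy_decode(probs, "helo")
print(text)  # prints "hello"
```

Repeated classes are merged and blanks are dropped, so the double "l" survives only because a blank sits between the two runs. Beam search, as in the SimpleHTR task, keeps several candidate paths instead of only the single best class per timestep.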