Artificial Intelligence Project Based Learning

Handwriting Recognition and Translation

This project helps you solidify your deep learning skills in computer vision and machine translation. You will develop deep learning models for an OCR (optical character recognition) application that detects text in images and then translates that text into Hindi, working with libraries such as TensorFlow (Keras) and Hugging Face. At the end, you will develop a Flask application to showcase the best model and make predictions on new data.

On completing this project, you will have a working OCR application to present as a showpiece to potential employers.

13.5 Hours


Tools Covered

Project Structure

The complete capstone project is divided into 4 projects. Each of these projects has multiple tasks. For each
task, there are theory videos to understand the concepts and a solution video, along with the dataset and code.


Data Exploration and Building Basic Models for Text Recognition

Understand the Structure of Data and Build Basic OCR Models Using CRNN Architecture


Improving Text Recognition Capabilities Using Transformer

Build OCR Models to Improve Text Detection and Recognition Using State-of-the-art Transformer Models


Building Machine Translation Models for English-Hindi

Translate Extracted Text (which we got from the OCR) from English to Hindi


Integrate the Deep Learning Models in the Web Application

Deploy Handwriting Recognition & Translation with the Best Models and Make Predictions on Unseen Images



Explore and Understand the

Explore and Understand the IAM(lines)

Explore and Understand the English-Hindi
Parallel Corpus for Translation

Create a Base Offline OCR Model Using
CRNN for IAM(words)

Create a Base Offline OCR Model Using
CRNN for IAM(lines)

Create an HTR (Handwriting Text
Recognition) Model Using Beam Search

Make Inference on TrOCR (transformer)
Model on the Test Images Using
the Hugging Face Transformers Library

Fine-Tune the TrOCR Model on the IAM
Dataset (Lines Set)

Make Inference on TrOCR IAM
(fine-tuned) Model on the Test Images

Create a Base Encoder-Decoder Machine
Translation Model with LSTM

Create a Machine Translation Model with
Luong-Style Attention

Create a Bidirectional Encoder-Decoder Model
with an Attention Mechanism

Develop a Transformer-based Machine
Translation Model

Setup IndicTrans Model and

Integrate Flask API (model) in the
Node.js Application

Test Application with an Unseen Image
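
One of the tasks above builds an HTR model using beam search. As a rough illustration of the idea only, the sketch below runs a plain beam search over per-timestep character probabilities in pure Python. The function name `beam_search_decode`, the toy alphabet, and the probabilities are assumptions made for this example and are not taken from the course material. It is also not the CTC-aware variant (which merges repeated characters and removes blanks) that handwriting models often use; which variant the course uses is not stated here.

```python
import math


def beam_search_decode(step_probs, alphabet, beam_width=3):
    """Return up to beam_width (text, log_prob) pairs, best first.

    step_probs: one list of probabilities per timestep, aligned with alphabet.
    This is a hypothetical helper for illustration, not a library function.
    """
    beams = [("", 0.0)]  # (text so far, cumulative log-probability)
    for probs in step_probs:
        candidates = []
        for text, score in beams:
            for symbol, p in zip(alphabet, probs):
                if p > 0.0:  # skip impossible symbols to avoid log(0)
                    candidates.append((text + symbol, score + math.log(p)))
        # Keep only the highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Toy example: two timesteps over a two-letter alphabet (made-up numbers).
alphabet = ["a", "b"]
step_probs = [[0.6, 0.4], [0.1, 0.9]]
best_text, best_score = beam_search_decode(step_probs, alphabet, beam_width=2)[0]
print(best_text)  # prints "ab"
```

With `beam_width=1` this reduces to greedy decoding; a wider beam keeps more alternatives alive at each step at the cost of more computation.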