What Will You Learn in the Project?
- Running inference with a CRNN model using beam search decoding
- Understanding the HuggingFace model and processor modules
- Running inference with the TrOCR model using HuggingFace libraries
- Fine-tuning the OCR model on the IAM dataset
Tools & Technologies Used
- Google Colab
- HuggingFace (transformers, datasets)
- PyTorch (Dataset)
Prerequisites
- Working knowledge of libraries such as TensorFlow and HuggingFace (transformers, datasets)
- Understanding of the Dataset module of the PyTorch library
- Good theoretical understanding of the Transformer (encoder-decoder) architecture and of text generation concepts such as beam search
- Understanding of the TrOCR model architecture (TrOCR is a Transformer-based OCR model)
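
As a refresher on beam search, here is a minimal sketch in plain Python. It is not the SimpleHTR or TrOCR decoder: the toy "model" (`TOY`, `toy_next_log_probs`) and the function name `beam_search` are hypothetical, chosen only to show how keeping several candidate sequences can beat greedy decoding. A CRNN trained with CTC would use a CTC-aware variant that also merges blanks and repeated characters, which this sketch does not do.

```python
import math
from typing import Callable, Dict, List, Tuple

Sequence = Tuple[str, ...]


def beam_search(
    next_log_probs: Callable[[Sequence], Dict[str, float]],
    beam_width: int,
    max_len: int,
    eos: str = "<eos>",
) -> List[Tuple[Sequence, float]]:
    """Return up to `beam_width` (sequence, log-probability) pairs, best first."""
    beams: List[Tuple[Sequence, float]] = [((), 0.0)]
    for _ in range(max_len):
        candidates: List[Tuple[Sequence, float]] = []
        for seq, score in beams:
            # Finished hypotheses are carried over unchanged.
            if seq and seq[-1] == eos:
                candidates.append((seq, score))
                continue
            # Extend every unfinished hypothesis by each possible next token.
            for token, log_p in next_log_probs(seq).items():
                candidates.append((seq + (token,), score + log_p))
        # Keep only the `beam_width` highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq and seq[-1] == eos for seq, _ in beams):
            break
    return beams


# Hypothetical toy distribution: "a" looks best at step 1,
# but the most probable full sequence starts with "b".
TOY = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}


def toy_next_log_probs(prefix: Sequence) -> Dict[str, float]:
    # Any prefix not listed in TOY ends the sequence with probability 1.
    probs = TOY.get(prefix, {"<eos>": 1.0})
    return {token: math.log(p) for token, p in probs.items()}


greedy_best = beam_search(toy_next_log_probs, beam_width=1, max_len=3)[0]
beam_best = beam_search(toy_next_log_probs, beam_width=2, max_len=3)[0]
```

With a beam width of 1 the search is greedy and commits to "a" at the first step, ending with probability 0.3. With a beam width of 2 it keeps "b" alive and finds the sequence "b", "x" with probability 0.36.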
Task-1: Build an HTR (Handwritten Text Recognition) model with beam search decoding (SimpleHTR)
Task-2: Run inference with the TrOCR (Transformer) model on the test images using the HuggingFace transformers library
Task-3: Fine-tune the TrOCR model on the lines set of the IAM dataset
Task-4: Run inference with the fine-tuned TrOCR (IAM) model on the test images
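
For orientation, TrOCR inference with the HuggingFace transformers library might look roughly like the sketch below. It is an assumption-laden outline, not the project's reference solution: the function name `recognize_line`, the checkpoint `microsoft/trocr-base-handwritten`, and the generation settings are illustrative choices that the project description does not specify. A fine-tuned checkpoint directory from Task-3 could be passed as `checkpoint` instead.

```python
def recognize_line(
    image_path: str,
    checkpoint: str = "microsoft/trocr-base-handwritten",  # assumed checkpoint
    num_beams: int = 4,  # assumed beam width
) -> str:
    """Transcribe a single text-line image with a TrOCR checkpoint."""
    # Heavy dependencies are imported lazily so that defining this
    # function does not require them to be installed.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # The processor bundles the image preprocessor and the tokenizer.
    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    model.eval()

    # TrOCR expects RGB input; handwriting scans are often grayscale.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Beam search decoding over the decoder's token predictions.
    generated_ids = model.generate(
        pixel_values, num_beams=num_beams, max_new_tokens=64
    )
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Calling `recognize_line("line.png")`, where `line.png` is a hypothetical image of one handwritten line, downloads the checkpoint on first use and returns the predicted text. In practice the processor and model would be loaded once and reused across all test images rather than reloaded on every call.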