Automate Identification and Recognition of Handwritten Text from an Image

This repository contains the implementation of a CRNN (Convolutional Recurrent Neural Network) model designed to detect and recognize handwritten text from images. The CRNN combines convolutional layers for feature extraction with recurrent layers for sequence modeling, making it well-suited for tasks involving sequential data like text.

Introduction

Handwritten text recognition is challenging because handwriting varies widely in style and orientation, and because input images are often noisy. This project uses a CRNN architecture to detect and recognize handwritten text in images, with applications such as digitizing handwritten documents and automated form processing.

Features

End-to-end text detection and recognition: Automatically detect and recognize text from input images.
CRNN architecture: Combines CNNs for feature extraction with RNNs for sequence modeling.
CTC loss function: Utilizes Connectionist Temporal Classification (CTC) for sequence prediction without pre-segmented labels.
Preprocessing utilities: Includes image preprocessing and augmentation utilities.
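
To illustrate why CTC needs no pre-segmented labels, here is a minimal sketch of greedy (best-path) CTC decoding: take the most likely label at each time step, merge consecutive repeats, then drop the blank label. The alphabet, the blank index, and the function name `ctc_greedy_decode` are hypothetical and chosen only for illustration; this is not necessarily how the repository decodes its predictions.

```python
from itertools import groupby


def ctc_greedy_decode(frame_labels, blank):
    """Collapse consecutive repeated labels, then remove the blank label."""
    collapsed = [label for label, _ in groupby(frame_labels)]
    return [label for label in collapsed if label != blank]


# Hypothetical three-letter alphabet; the index after the last letter is the blank.
ALPHABET = "cat"
BLANK = len(ALPHABET)

# One label per time step, e.g. the argmax over each frame's class scores.
frames = [0, 0, BLANK, 1, 1, 1, BLANK, BLANK, 2, 2]
decoded = "".join(ALPHABET[i] for i in ctc_greedy_decode(frames, BLANK))
print(decoded)  # cat
```

A blank between two identical labels keeps them separate, so `[1, BLANK, 1]` decodes to two characters while `[1, 1]` decodes to one. This is how doubled letters survive the merge step.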

Installation

Clone the repository:

git clone https://github.com/VMD7/Automate-identification-and-recognition-of-handwritten-text-from-an-image

Install the required dependencies:
```
pip install -r requirements.txt
```

Model Architecture

The CRNN model architecture consists of the following components:

Convolutional Layers: Extract spatial features from input images.
Recurrent Layers (Bidirectional LSTM): Model the sequential nature of the text.
CTC Loss Function: Handles the alignment between predicted and actual sequences.
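
The hand-off from the convolutional layers to the recurrent layers can be traced with simple shape arithmetic. The sketch below follows a 32 x 128 input through the pooling steps and the final unpadded convolution to show where the 31 time steps seen by the LSTM layers come from. The pool sizes and the 2 x 2 kernel are assumptions inferred from the output shapes in the layer summary, not taken from the repository's code. The helper names are hypothetical.

```python
def pool(shape, size):
    """Non-overlapping pooling: integer-divide height and width by the pool size."""
    (height, width), (pool_h, pool_w) = shape, size
    return (height // pool_h, width // pool_w)


def conv_valid(shape, kernel):
    """Stride-1 convolution without padding shrinks each side by kernel - 1."""
    (height, width), (kernel_h, kernel_w) = shape, kernel
    return (height - kernel_h + 1, width - kernel_w + 1)


shape = (32, 128)  # input height x width, as listed in the layer summary

# Assumed pool sizes: the last two halve only the height, which preserves
# horizontal resolution for the sequence model.
for size in [(2, 2), (2, 2), (2, 1), (2, 1)]:
    shape = pool(shape, size)

shape = conv_valid(shape, (2, 2))  # assumed 2x2 kernel for the last convolution
time_steps = shape[1]
print(shape, time_steps)  # (1, 31) 31
```

Once the height has collapsed to 1, that axis can be dropped, leaving a sequence of 31 feature vectors per image. Because CTC emits at most one character per time step, this also caps the length of the text that can be recognized in a single image at 31 characters.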

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 32, 128, 1)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 128, 64)       640       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 16, 64, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 16, 64, 128)       73856     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 32, 128)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 8, 32, 256)        295168    
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 8, 32, 256)        590080    
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 4, 32, 256)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 4, 32, 512)        1180160   
_________________________________________________________________
batch_normalization_1 (Batch (None, 4, 32, 512)        2048      
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 4, 32, 512)        2359808   
_________________________________________________________________
batch_normalization_2 (Batch (None, 4, 32, 512)        2048      
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 2, 32, 512)        0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 1, 31, 512)        1049088   
_________________________________________________________________
lambda_1 (Lambda)            (None, 31, 512)           0         
_________________________________________________________________
bidirectional_1 (Bidirection (None, 31, 512)           1574912   
_________________________________________________________________
bidirectional_2 (Bidirection (None, 31, 512)           1574912   
_________________________________________________________________
dense_1 (Dense)              (None, 31, 79)            40527     
=================================================================
Total params: 8,743,247
Trainable params: 8,741,199
Non-trainable params: 2,048
_________________________________________________________________
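
The Param # column can be reproduced by hand, which is a useful sanity check when re-implementing the model. The sketch below uses the standard parameter-count formulas for convolution, LSTM, and dense layers. The kernel sizes (3 x 3 for the first six convolutions, 2 x 2 for the last) and the 256 LSTM units per direction are assumptions inferred from the counts in the summary, not read from the repository's code.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights for every kernel position and channel pair, plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """One weight per input-output pair, plus one bias per output unit."""
    return input_dim * units + units


conv = [
    conv2d_params(3, 3, 1, 64),     # conv2d_1
    conv2d_params(3, 3, 64, 128),   # conv2d_2
    conv2d_params(3, 3, 128, 256),  # conv2d_3
    conv2d_params(3, 3, 256, 256),  # conv2d_4
    conv2d_params(3, 3, 256, 512),  # conv2d_5
    conv2d_params(3, 3, 512, 512),  # conv2d_6
    conv2d_params(2, 2, 512, 512),  # conv2d_7
]

# Each batch-norm layer stores gamma, beta, moving mean, and moving variance
# per channel; only gamma and beta are trainable.
batch_norm = 2 * (4 * 512)
non_trainable = 2 * (2 * 512)

# Two bidirectional layers, each holding a forward and a backward LSTM.
bilstm = 2 * (2 * lstm_params(512, 256))
dense = dense_params(512, 79)

total = sum(conv) + batch_norm + bilstm + dense
print(total, total - non_trainable, non_trainable)  # 8743247 8741199 2048
```

The three printed numbers match the Total, Trainable, and Non-trainable rows of the summary.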

Results

After training the CRNN model on the IAM Handwriting Database, the following results were achieved:

Jaro Distance: 0.91
Ratio: 0.88
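
For readers unfamiliar with the first metric, here is a minimal sketch of the Jaro similarity between two strings, where 1.0 means identical and 0.0 means no matching characters. It assumes the reported figure is the similarity form of the measure, since a value near 1 would otherwise indicate poor recognition. The README does not say which library or exact definition was used, so this function is illustrative and may not reproduce the numbers above. The second metric is not covered here because its definition is not given.

```python
def jaro_similarity(s1, s2):
    """Jaro similarity in [0, 1] based on matching characters and transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if they are equal and at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, char in enumerate(s1):
        start = max(0, i - window)
        stop = min(len2, i + window + 1)
        for j in range(start, stop):
            if not matched2[j] and s2[j] == char:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order count as half a
    # transposition each.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (
        matches / len1 + matches / len2 + (matches - transpositions) / matches
    ) / 3


print(round(jaro_similarity("MARTHA", "MARHTA"), 4))  # 0.9444
```

In an evaluation loop, a score like this would typically be computed between each predicted string and its ground-truth transcription and then averaged over the test set.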

Demo

You can try out the model live on Hugging Face Spaces: https://huggingface.co/spaces/vjdevane/htr

Contributing

Contributions are welcome! If you'd like to contribute, please fork the repository and create a pull request with your changes. Ensure your code adheres to the existing style and includes appropriate tests.

License

This project is licensed under the MIT License - see the LICENSE file for details.
