HWGAN is a GAN network combined with an OCR component to generate handwritten text. An image processing component has been attached to make it possible to use all types of data.

HWGAN: Handwritten Text Generator

This project was carried out as a school assignment at the University of Twente. HWGAN is our own implementation of a handwriting generator built with a GAN and an OCR neural network. The neural networks in this project are built with TensorFlow.

Project Overview

  • School: University of Twente
  • Course: Machine Learning II
  • Assignment Type: Open Project
  • Group Size: 4

Setup

  1. Use Python 3.6-3.8
  2. Execute the following command to install required packages:
pip install -r ./helper/requirements.txt
  3. For GPU support, we recommend also installing CUDA Toolkit 11.0, cuDNN 8.0.4, and NVIDIA GPU Driver 450 or higher (see the NVIDIA website and the TensorFlow guide)

Usage

  • By default, the OCR and GAN models are trained on the EMNIST ByMerge dataset; follow our DATA_GUIDE to set up custom datasets. To train the neural networks, use the following command:
python train.py -data emnist
  • To create a word, use the following command (where example_word is the word you want to generate):
python run.py -data emnist -text example_word

Available characters: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_
For more specific options, see the Arguments section below.
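
Since only the characters listed above can be generated, it may help to check a word before passing it to run.py. The following is a minimal sketch and not part of the repository: the helper name unsupported_chars is hypothetical, and the character set is copied from the "Available characters" line.

```python
import string

# Character set copied from the "Available characters" line of this README:
# digits, uppercase letters, lowercase letters, and the underscore.
AVAILABLE_CHARS = string.digits + string.ascii_uppercase + string.ascii_lowercase + "_"


def unsupported_chars(text):
    """Return the characters of `text` that are outside the available set.

    Hypothetical helper for illustration; each offending character is
    reported once, in order of first appearance.
    """
    seen = []
    for ch in text:
        if ch not in AVAILABLE_CHARS and ch not in seen:
            seen.append(ch)
    return seen


if __name__ == "__main__":
    word = "example_word"
    bad = unsupported_chars(word)
    if bad:
        print("Unsupported characters:", bad)
    else:
        print(word, "can be passed to run.py via -text")
```

Note that the space character is not in the set, so a word separator would have to be written with the underscore or another listed character.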

Arguments

Training (train.py)

  • -data: Specify which dataset to use (default = emnist)
  • -sample: Outputs example images showing how the data is split. The number given is how many example images to retrieve (default = 0)
  • -text: Specify how the data should be split, options: chars, words, lines (default = chars)
    • chars is single character images
    • words is images of multiple words on a single line
    • lines is images with multiple words across multiple lines
    • Combining these options is currently not possible, so your dataset must adhere to exactly one of them
  • -ocr: Specify if you want to train the OCR (recognizer) model, options: True, False (default = False)
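
The training flags above could be declared with Python's standard argparse module roughly as follows. This is a sketch under stated assumptions and is not the contents of helper/userinput.py, which this README does not show: the function names build_train_parser and parse_bool are hypothetical, and only the flag names, options, and defaults come from the list above.

```python
import argparse


def parse_bool(value):
    """Convert the literal strings 'True' / 'False' to a bool (hypothetical helper)."""
    if value == "True":
        return True
    if value == "False":
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_train_parser():
    """Build a parser mirroring the documented train.py flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("-data", default="emnist",
                        help="dataset to use")
    parser.add_argument("-sample", type=int, default=0,
                        help="number of example images showing how data is split")
    parser.add_argument("-text", choices=["chars", "words", "lines"], default="chars",
                        help="how the data is split")
    parser.add_argument("-ocr", type=parse_bool, default=False,
                        help="train the OCR (recognizer) model: True or False")
    return parser


if __name__ == "__main__":
    args = build_train_parser().parse_args()
    print(args)
```

With this sketch, `python train.py -data emnist -ocr True` would yield `ocr=True` and leave `-sample` and `-text` at their documented defaults.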

Create Handwriting (run.py)

  • -data: Specify which dataset to use (default = emnist)
  • -text: Specify the word you want to create (e.g. example_word)

File Structure

 dataset
     ├── DATA_GUIDE.md                      # How to add custom datasets
     ├── ...                                # Location to add custom datasets
 helper
     ├── coversion.py                       # Custom images are split into characters
     ├── requirements.txt                   # Configuration file with all dependencies to install
     ├── split_data.py                      # Split the dataset into letter specific data
     ├── userinput.py                       # Handle user arguments
 models
     ├── gan_model/                         # Holds all files related to the GAN model
         ├── gifs/
             ├── ...                        # .gif files of the training process
         ├── graphs/
             ├── ...                        # Loss graphs of the training process
         ├── saved_models/
             ├── ...                        # Trained discriminator and generator models
         ├── GAN.py                         # GAN model
     ├── ocr_model/                         # Holds all files related to the OCR model
         ├── ocr_model.h5                   # Trained OCR model
         ├── OCR.py                         # OCR model
         ├── ...                            # Stats on the performance
 out
     ├── ...                                # Output of the run.py executable
 run.py                                     # Main executable - Generate given word
 train.py                                   # Main executable - Train all models
 ...                                        # Extra project management files

Acknowledgments

The neural network setup was previously implemented by ScrabbleGAN, which is a more elaborate implementation of this principle in PyTorch.
