docongminh/google-vision-api

Use Google Vision API to extract Text Annotations

Description

This script uses the Google Cloud Vision API to extract text annotations from images.

Requirements

  • Python 3.x
  • Google Cloud service account credentials (JSON key file)

Setup

To install the required library, use pip:

pip install google-cloud-vision

Alternatively, install from requirements.txt:

pip install -r requirements.txt

Next, set up authentication with the Cloud Vision API using your project's service account credentials. See the Vision API Client Libraries documentation for more information. Then set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to your downloaded service account key file:

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/credentials-key.json
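
Before running the scripts, it can help to confirm that the variable is set and points to a readable JSON file. The following is a minimal sketch using only the standard library; the helper name check_credentials is hypothetical and is not part of this repository.

```python
import json
import os
from pathlib import Path


def check_credentials(env=os.environ):
    """Return the key file path if the variable is set and the file parses as JSON.

    Hypothetical helper for illustration; it does not contact Google Cloud.
    """
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(value)
    if not path.is_file():
        raise FileNotFoundError(f"Credentials file not found: {path}")
    with path.open(encoding="utf-8") as handle:
        json.load(handle)  # raises ValueError if the file is not valid JSON
    return path
```

Calling check_credentials() with no arguments reads the current process environment. It only verifies that the file exists and is valid JSON, not that the key is accepted by the API.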

Quick Start: Running the scripts

Text Detection

~ python vision.py --images ./images 
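
For orientation, the sketch below shows roughly what a text detection request looks like with the google-cloud-vision client library (version 2 or later). It is not the contents of vision.py; the function names list_images and detect_text and the set of accepted file suffixes are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image suffixes; the real script may accept others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(folder):
    """Return the image files directly inside a folder, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )


def detect_text(image_path, client=None):
    """Send one image to TEXT_DETECTION and return (text, vertices) pairs."""
    # Imported lazily so list_images works without the library installed.
    from google.cloud import vision

    client = client or vision.ImageAnnotatorClient()
    image = vision.Image(content=Path(image_path).read_bytes())
    response = client.text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    results = []
    for annotation in response.text_annotations:
        vertices = [(v.x, v.y) for v in annotation.bounding_poly.vertices]
        results.append((annotation.description, vertices))
    return results
```

In the API response, the first entry of text_annotations normally covers the entire detected text block and the remaining entries are the individual words. Running detect_text requires valid credentials and network access.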

Test Result

~ python test.py --gt ./ --output ./image_test --number_test 2

Convert to Pascal VOC data format

~ python convert_pascal_format.py --output output --input images --gt_path gt.pkl
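
As a rough illustration of the target format, the sketch below builds a Pascal VOC style XML annotation from labeled polygons using only the standard library. The structure of gt.pkl is not documented here, so the input shape (a label plus a list of (x, y) vertices) and the names polygon_to_box and to_voc_xml are assumptions, not the behavior of convert_pascal_format.py.

```python
import xml.etree.ElementTree as ET


def polygon_to_box(vertices):
    """Reduce a list of (x, y) vertices to an axis-aligned (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def to_voc_xml(filename, width, height, objects, depth=3):
    """Build a Pascal VOC style annotation string.

    objects is an iterable of (label, (xmin, ymin, xmax, ymax)) pairs.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)
    for label, box in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
            ET.SubElement(bndbox, tag).text = str(int(value))
    return ET.tostring(root, encoding="unicode")
```

A word polygon returned by the Vision API can be passed through polygon_to_box first, since Pascal VOC stores axis-aligned rectangles rather than arbitrary quadrilaterals.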

Result

Word Annotations

[Image: word annotations]

Character Annotations

[Image: character annotations]

Suggested Image Sizes

For accurate detection, images sent to the Google Cloud Vision API should generally be at least 640 x 480 pixels (about 300k pixels). Recommended sizes for each Vision API feature type are shown below:

Vision API Feature      Recommended Size   Notes
FACE_DETECTION          1600 x 1200        Distance between eyes is most important
LANDMARK_DETECTION      640 x 480
LOGO_DETECTION          640 x 480
LABEL_DETECTION         640 x 480
TEXT_DETECTION          1024 x 768         OCR requires more resolution to detect characters
SAFE_SEARCH_DETECTION   640 x 480
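
The recommendations above can be checked programmatically before sending an image. The sketch below is a minimal example; the function name meets_recommendation is hypothetical, and comparing long side to long side (so that portrait images are treated the same as landscape ones) is an assumption rather than a documented API rule.

```python
# Recommended sizes (width, height) per feature, taken from the table above.
RECOMMENDED_SIZES = {
    "FACE_DETECTION": (1600, 1200),
    "LANDMARK_DETECTION": (640, 480),
    "LOGO_DETECTION": (640, 480),
    "LABEL_DETECTION": (640, 480),
    "TEXT_DETECTION": (1024, 768),
    "SAFE_SEARCH_DETECTION": (640, 480),
}


def meets_recommendation(width, height, feature):
    """Return True if the image is at least the recommended size for a feature.

    Orientation is ignored: the longer image side is compared with the longer
    recommended side, and the shorter with the shorter (an assumption).
    """
    rec_long, rec_short = sorted(RECOMMENDED_SIZES[feature], reverse=True)
    img_long, img_short = sorted((width, height), reverse=True)
    return img_long >= rec_long and img_short >= rec_short
```

For example, a 640 x 480 image satisfies the LABEL_DETECTION recommendation but falls short of the one for TEXT_DETECTION.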
