Skip to content

JohnSnowLabs/spark-ocr-workshop

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Spark OCR Workshop

Public examples of using John Snow Labs' OCR for Apache Spark.

Installing and Quickstart Instructions

You need Secret key and License key. Please contact us at info@johnsnowlabs.com for get it.

Secret key is a key for download or install python package and jar from https://pypi.johnsnowlabs.com/. Secret key is specific for each release.

License key is a key for run Spark OCR.

Setting up the license key

Setting up in notebook

Need set value for license variable:

license = "license key"

Using Environment variable

Linux, Mac OS:

export JSL_OCR_LICENSE=license_key

Windows:

set JSL_OCR_LICENSE=license_key

Run notebooks on Google Colab

  • Go to the https://colab.research.google.com/
  • Open notebook:
  • Set secret and license variables to valid values in first cell.
  • Run all cells: Runtime -> Run all.
  • Restart runtime: Runtime -> Restart runtime (Need restart first time after installing new packages).
  • Run all cellls again.

Run notebooks locally using jupyter

python3 -m pip install --user jupyter
  • Clone Spark OCR workshop repo:
git clone https://github.com/JohnSnowLabs/spark-ocr-workshop
cd spark-ocr-workshop
  • Run jupyter:
jupyter-notebook
  • Open jupyter in browser
  • Open jupyter/SparkOcrSimpleExample.ipynb notebook.
  • Set secret and license variables to valid values in first cell.
  • Run all cells: Cell -> Run all.
  • Restart runtime: Kernel -> Restart (Need restart first time after installing new packages).
  • Run all cellls again.

About

Public runnable examples of using John Snow Labs' OCR for Apache Spark.

Resources

Stars

Watchers

Forks

Packages

No packages published