Business Card Scanner API

A Flask-based REST API that extracts and structures information from business card images using OCR and AI.

Features

Supports multiple OCR engines:
- Google Cloud Vision API
- Tesseract OCR
Text structuring using OpenAI GPT
JSON output format
File upload validation
Secure file handling
Configurable via environment variables

Tech Stack

Python 3.10+
Flask 2.0.1
Google Cloud Vision API
OpenAI GPT API
Tesseract OCR
Additional dependencies in requirements.txt

Prerequisites

Python 3.10 or higher
Tesseract OCR installed on your system
Google Cloud account
OpenAI account
Git (optional)

Installation

Clone the repository (or download the source code):

git clone https://github.com/The-Lone-Druid/cardscannerpoc.git
cd cardscannerpoc

Create and activate a virtual environment:

# Windows
python -m venv venv
source venv/Scripts/activate  # Git Bash
# or
.\venv\Scripts\activate.ps1   # PowerShell
# or
venv\Scripts\activate.bat     # Command Prompt

# macOS/Linux
python3 -m venv venv
source venv/bin/activate

Install required packages:

pip install -r requirements.txt

API Key Setup

1. Google Cloud Vision API

Create a Google Cloud account: https://cloud.google.com/
Create a new project in Google Cloud Console
Enable the Cloud Vision API:
- Go to "APIs & Services" > "Library"
- Search for "Cloud Vision API"
- Click "Enable"
- Add a billing account to the project; the API will not work without one (see Google Cloud's billing setup instructions)
Create credentials:
- Go to "APIs & Services" > "Credentials"
- Click "Create Credentials" > "Service Account"
- Fill in service account details
- Select role: "Project" > "Owner"
- Click "Create and Continue"
Download JSON credentials:
- Click on your service account
- Go to "Keys" tab
- Click "Add Key" > "Create New Key"
- Choose JSON format
- Save the file in your project's credentials folder

2. OpenAI API

Create an OpenAI account: https://platform.openai.com/signup
Get your API key:
- Go to https://platform.openai.com/api-keys
- Click "Create new secret key"
- Copy the generated key
- Set up billing on your OpenAI account; the API key will not work without it (see OpenAI's billing instructions)

3. Tesseract OCR

Windows:
- Download installer from: https://github.com/UB-Mannheim/tesseract/wiki
- Install and note the installation path
- Update tesseract_cmd path in utils/tesseract_helper.py
macOS:
```
brew install tesseract
```
Linux:
```
sudo apt-get install tesseract-ocr
```
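
The path lookup can also be done in code instead of hard-coding it. The following is a minimal sketch, not the project's actual helper: `find_tesseract` and `DEFAULT_WINDOWS_PATH` are hypothetical names, and the fallback path is only an assumption about where the Windows installer usually places the executable.

```python
import shutil
from pathlib import Path
from typing import Optional

# Assumed default location of the Windows install; replace it with the
# installation path you noted during setup.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback: str = DEFAULT_WINDOWS_PATH) -> Optional[str]:
    """Return a path to the tesseract executable, or None if none is found."""
    # Prefer whatever is on PATH (typical for brew and apt-get installs).
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise try the explicit fallback location.
    if Path(fallback).is_file():
        return fallback
    return None
```

If utils/tesseract_helper.py uses the pytesseract package (an assumption based on the `tesseract_cmd` name), the returned path could be assigned to `pytesseract.pytesseract.tesseract_cmd`.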

Configuration

Create a .env file in the project root:

FLASK_SECRET_KEY=your-secret-key-here
GOOGLE_APPLICATION_CREDENTIALS=./credentials/your-credentials-file.json
OPENAI_API_KEY=your-openai-api-key

Create required directories:

mkdir -p credentials
mkdir -p scans/generated

Move your Google Cloud credentials JSON file to the credentials folder
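
To catch configuration mistakes early, the three settings above can be checked before the server starts. This is a rough sketch under assumptions: `check_settings` is a hypothetical helper, not part of the repository, and it expects the .env values to have already been loaded into the process environment (for example by a dotenv loader, if the project uses one).

```python
import os
from pathlib import Path
from typing import List, Mapping

# The variable names come from the .env example above.
REQUIRED_KEYS = (
    "FLASK_SECRET_KEY",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "OPENAI_API_KEY",
)


def check_settings(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the settings look usable."""
    problems = []
    for key in REQUIRED_KEYS:
        if not env.get(key, "").strip():
            problems.append(f"{key} is missing or empty")
    # The credentials variable must also point at an existing file.
    creds = env.get("GOOGLE_APPLICATION_CREDENTIALS", "").strip()
    if creds and not Path(creds).is_file():
        problems.append(f"credentials file not found: {creds}")
    return problems


if __name__ == "__main__":
    for problem in check_settings(os.environ):
        print(problem)
```

The function only checks that values are present and that the credentials file exists; it does not verify that the keys are accepted by Google or OpenAI.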

Test your setup

python test_setup.py

Once you have verified that the API keys are working, you can start the application.

Running the Application

Ensure your virtual environment is activated
Start the Flask server:
```
python app.py
```
Access the application at: http://127.0.0.1:5000

Usage

Open the web interface in your browser
Select the OCR engine (Google Vision or Tesseract)
Upload a business card image
Click "Scan Card"
View the extracted information in both raw and structured JSON format
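
The structured output in the last step depends on turning the model's text reply into JSON. The sketch below shows one way that step might be handled; `parse_structured_reply` and the `raw_text` and `error` keys are hypothetical, and the project's gpt_helper.py may work differently.

```python
import json
from typing import Any, Dict

FENCE = "`" * 3  # Markdown code-fence marker


def parse_structured_reply(reply: str) -> Dict[str, Any]:
    """Parse a model reply that is expected to contain a single JSON object."""
    text = reply.strip()
    # Models sometimes wrap JSON in a Markdown code fence; drop the fence lines.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
        text = "\n".join(lines)
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Keep the raw reply so the caller can still show something useful.
        return {"raw_text": reply, "error": "reply was not valid JSON"}
    if not isinstance(data, dict):
        return {"raw_text": reply, "error": "reply was not a JSON object"}
    return data
```

Returning the raw reply alongside an error keeps the raw view available even when the structured view cannot be built.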

Project Structure

cardscannerpoc/
├── __pycache__/
├── credentials/
├── scans/
│   └── generated/
├── static/
│   ├── css/
│   │   └── style.css
│   └── js/
│       └── main.js
├── templates/
│   └── index.html
├── utils/
│   ├── __pycache__/
│   ├── __init__.py
│   ├── vision_helper.py
│   ├── tesseract_helper.py
│   └── gpt_helper.py
├── .env
├── .gitignore
├── app.py
├── config.py
├── README.md
├── requirements.txt
├── test_setup.py
└── TODOS.md

Testing

Test your API connections:

python test_setup.py

Error Handling

The application includes error handling for:

Invalid file types
Failed text extraction
API connection issues
Processing errors
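
As an illustration of the first item, a file-type check can be as simple as comparing the extension against an allow-list. This is only a sketch: `is_allowed_image` and the `ALLOWED_EXTENSIONS` set are hypothetical, and the extensions the application actually accepts may differ.

```python
from pathlib import Path

# Hypothetical allow-list; adjust it to the formats the OCR engines should receive.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_allowed_image(filename: str) -> bool:
    """Return True if the final extension is on the allow-list (case-insensitive)."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS
```

An extension check only filters obvious mistakes. In Flask applications it is commonly paired with Werkzeug's `secure_filename` so that the uploaded name is sanitized before the file is saved.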

Security Notes

Never commit sensitive files:
- .env
- API credentials
- Uploaded images
- Generated JSON files
The .gitignore file is configured to exclude:
- Sensitive files
- Virtual environment
- Python cache files
- Uploaded and generated files

TODOs

Integration with Deepseek R1 Model
Cost estimation for various APIs
Project documentation enhancement
Microservice implementation with Node.js integration

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request

Commitizen

This project uses Commitizen for commit messages. To commit, install the Node dependencies (which include Commitizen) and set up Husky:

Install dependencies:
```
npm install
```
Initialize husky:
```
npx husky install
```
Commit using Commitizen:
```
git commit
# or
npm run commit
```

License

This project is licensed under the MIT License. See the LICENSE file for details.
