Amazon Captcha Solver

A Flask API based solution for tackling captchas when collecting data from Amazon.

The API accepts a captcha image in a POST request, passes it to the captcha solving script, and returns the solved captcha text in the response.

The goal is to solve the captcha images served by Amazon.

How to use :

Run the Flask API script -

python solve_api.py

To check the status of the API, call the default route by opening its address -

http://your_api_ip_address:5000    # '5000' is the default port of the Flask server that hosts the API.
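The same status check can also be done from a script. The sketch below uses only the Python standard library; the helper names build_api_url and api_is_up are made up for illustration, and it assumes the default route answers with HTTP 200 when the server is running.

```python
from urllib.error import URLError
from urllib.request import urlopen


def build_api_url(host, port=5000, path=""):
    # Build a full URL including the 'http://' scheme.
    return f"http://{host}:{port}/{path.lstrip('/')}"


def api_is_up(host, port=5000, timeout=5):
    # Returns True only if the default route answers with HTTP 200.
    try:
        with urlopen(build_api_url(host, port), timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure or an HTTP error status.
        return False


if __name__ == "__main__":
    print("API reachable:", api_is_up("127.0.0.1"))
```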

To call the captcha solver function with the 'requests' module, passing the captcha image as a file -

import requests

def captcha_uploader():
    # API URL of the endpoint that solves the captcha ('requests' needs the 'http://' scheme)
    captcha_solver_api_url = 'http://your_api_ip_address:5000/solve'

    # Open the captcha image in binary mode and send it under the key 'captcha'
    with open('your_captcha_image_filepath', 'rb') as captcha_image:
        # Call the API with a 'POST' request, passing the image via the 'files' parameter
        response = requests.post(captcha_solver_api_url, files={'captcha': captcha_image})
    print("Captcha file uploaded.")

    # Fetch the captcha text from the API response
    try:
        captcha_text = response.json()['output']
    except (ValueError, KeyError):
        print("Response is not JSON or has no 'output' key. Please check your API code.")
        captcha_text = "NA"

    return captcha_text

How to contribute :

Start by installing all the required packages from the requirements file -

pip install -r requirements.txt

Then, to run the model on the images in test_captchas, use the following command -

python solve_captchas_with_model.py

P.S. Notes :

Regarding model file

The current model file was built by training on about 4,000 captcha images. Training on a much larger dataset should give better results, but the current results aren't bad either 😉.

Regarding API code

The API works fine but can be enhanced further according to your use case. The quality of the API's responses depends heavily on how well the model file was trained.
