fraud-detection-python

Document AI Fraud Detection Demo with Enterprise Knowledge Graph

Objective

Learn how to use Google Cloud Platform to process and enrich invoices so that we can enable fraud detection.

Architecture

Architecture Diagram

Google Cloud Products Used

Steps to re-create this demo in your own GCP environment

  1. Create a Google Cloud Platform Project

  2. Install and set up the gcloud SDK & CLI

  3. Enable the APIs in the project you created in step #1 above

    • Cloud Document AI API
    • Cloud Functions API
    • Geocoding API
    • Cloud Build API
# Replace with Your Project ID
gcloud config set project YOUR_PROJECT_ID

gcloud services enable documentai.googleapis.com

gcloud services enable cloudfunctions.googleapis.com

gcloud services enable geocoding-backend.googleapis.com

gcloud services enable cloudbuild.googleapis.com
  1. Initialize the repository

    • Activate Cloud Shell and clone this GitHub repository using the command:

      git clone https://github.com/GoogleCloudPlatform/documentai-fraud-detection-demo.git
    • Change directory to the repository folder

      cd documentai-fraud-detection-demo
  2. Manage API Key

  3. Create your Doc AI processor

    • Go to Console > Document AI > Create Processor > Invoice Parser (under Specialized)
      • Name the processor fraud-detection-invoice-parser (or something else you'll remember)
      • Note the region and ID of the processor. You will need to plug these values into your Cloud Function's environment variables
    • Paste the processor location and ID in the process-invoices/.env.yaml file
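For reference, Document AI addresses a processor by a full resource name built from the project ID, the processor's region, and its ID, and regional processors are served from a region-prefixed endpoint. The sketch below shows how the two values noted above fit together. The function names and placeholder values are illustrative assumptions, not code from this repository.

```python
def processor_resource_name(project_id: str, location: str, processor_id: str) -> str:
    """Build the full Document AI processor resource name."""
    return f"projects/{project_id}/locations/{location}/processors/{processor_id}"


def api_endpoint(location: str) -> str:
    """Return the regional Document AI endpoint for a processor location."""
    return f"{location}-documentai.googleapis.com"


if __name__ == "__main__":
    # Placeholder values; substitute your own project, region, and processor ID.
    print(processor_resource_name("YOUR_PROJECT_ID", "us", "YOUR_PROCESSOR_ID"))
    print(api_endpoint("us"))
```

If the region or ID in the environment file does not match the processor you created, requests built this way will target a processor that does not exist, so it is worth double-checking both values.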
  4. Execute Bash shell scripts in your Cloud Shell terminal to create cloud resources (e.g., Google Cloud Storage buckets, Pub/Sub topics, Cloud Functions, BigQuery tables)

    1. Update the value of PROJECT_ID in .env.local to match your current project ID

    2. Execute your .sh files to create cloud resources

      bash create-archive-bucket.sh
      bash create-input-bucket.sh
      bash create-output-bucket.sh
      bash create-pub-sub-topic.sh
      bash create-bq-tables.sh
      bash deploy-cloud-function-process-invoices.sh
      bash deploy-cloud-function-geocode-addresses.sh
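If you would rather run the scripts from one place, a small Python wrapper can execute them in the listed order and stop at the first failure. This is a minimal sketch, not part of the repository: the `run_scripts` helper and its `dry_run` flag are hypothetical, and it assumes the scripts are in the current directory and should run in the order shown above.

```python
import subprocess
from typing import List, Sequence

# Script names copied from the step above, in the same order.
SCRIPTS = [
    "create-archive-bucket.sh",
    "create-input-bucket.sh",
    "create-output-bucket.sh",
    "create-pub-sub-topic.sh",
    "create-bq-tables.sh",
    "deploy-cloud-function-process-invoices.sh",
    "deploy-cloud-function-geocode-addresses.sh",
]


def run_scripts(scripts: Sequence[str], dry_run: bool = True) -> List[List[str]]:
    """Run each script with bash, in order, stopping at the first failure."""
    commands = [["bash", script] for script in scripts]
    for cmd in commands:
        if dry_run:
            # Only show what would be executed; nothing is created.
            print("would run:", " ".join(cmd))
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)
    return commands


# Dry run by default; pass dry_run=False to actually create the resources.
run_scripts(SCRIPTS)
```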
  5. Testing/Validating the demo

    • Upload a sample invoice in the input bucket
    • When processing finishes, your BigQuery tables should be populated with the extracted entities as well as the enriched data (i.e., placesID, lat, long, formatted address, name, url, description)
    • From these results, we can build custom business intelligence rules on the enriched fields to enable fraud detection. For example, if the Geocoding API cannot find the address, that indicates either an incorrect value or a fraudulent invoice
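The geocoding rule described above could be sketched in Python as follows. This is an illustration under stated assumptions: the column names `place_id`, `lat`, and `lng`, the `is_suspicious` helper, and the sample rows are all made up for the example, so adjust them to match the actual schema of your BigQuery tables.

```python
from typing import Any, Mapping


def is_suspicious(row: Mapping[str, Any]) -> bool:
    """Flag a row whose address the Geocoding API could not resolve.

    The column names 'place_id', 'lat', and 'lng' are hypothetical.
    """
    return (
        not row.get("place_id")
        or row.get("lat") is None
        or row.get("lng") is None
    )


# Made-up sample rows standing in for query results.
rows = [
    {"invoice_id": "A-1", "place_id": "example-place-id", "lat": 37.42, "lng": -122.08},
    {"invoice_id": "B-2", "place_id": None, "lat": None, "lng": None},
]

flagged = [row["invoice_id"] for row in rows if is_suspicious(row)]
print(flagged)  # ['B-2']
```

A flagged row is only a signal for review, since a missing geocoding result can also come from a typo or an extraction error rather than fraud.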