
This project parses and interprets large-scale real-world data. Its purpose is to match the tax code for each item or service offered by a company (Avalara) to a UNSPSC tax code, based on the descriptor strings of each item and the descriptor strings of each UNSPSC tax code.

tyrannorrec/CS5800-Cordiance-Experiential-Project


CS5800-Cordiance-Experiential-Project

Quick Start


Iteration 1: Tree Traversal

  • Run matchmaker_tree.py

Iteration 2: Rabin-Karp

  • To run the program with full descriptions and titles in the input word lists:

    1. Make sure lines 32 - 35 are NOT commented out in UNSPSC_structure_dict_unsorted.py
    2. Make sure lines 36 - 39 are commented out in UNSPSC_structure_dict_unsorted.py
    3. Import Avalara_structure as Avalara
    4. Comment out the import statement for Avalara_Structure_titlesonly as Avalara
    5. Run matchmaker.py
  • To run the program with only titles in the input word lists:

    1. Make sure lines 32 - 35 are commented out in UNSPSC_structure_dict_unsorted.py
    2. Make sure lines 36 - 39 are NOT commented out in UNSPSC_structure_dict_unsorted.py
    3. Comment out the import statement for Avalara_structure as Avalara
    4. Import Avalara_Structure_titlesonly as Avalara
    5. Run matchmaker.py
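
For readers unfamiliar with the algorithm this iteration is named after, here is a minimal sketch of Rabin-Karp substring search. It is a generic illustration and not the code in matchmaker.py: the function name rabin_karp, the base and mod parameters, and the sample strings are all assumptions made for this example.

```python
def rabin_karp(text: str, pattern: str,
               base: int = 256, mod: int = 1_000_000_007) -> list[int]:
    """Return every index at which `pattern` occurs in `text`.

    Hypothetical helper for illustration; not taken from the repository.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading character inside a window of length m.
    high = pow(base, m - 1, mod)

    # Hash the pattern and the first window of the text.
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod

    matches = []
    for i in range(n - m + 1):
        # Compare the strings only when the hashes agree, to rule out collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            matches.append(i)
        # Roll the hash: drop text[i], append text[i + m].
        if i < n - m:
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return matches


if __name__ == "__main__":
    print(rabin_karp("office paper and paper clips", "paper"))
```

The rolling hash lets each window be checked in constant time on average, which is why the approach suits scanning many descriptor strings for shared words.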

Iteration 3: Sorted Prototype

  • To run the program with descriptions included in the UNSPSC word lists:

    1. Make sure lines 32 - 35 are NOT commented out in UNSPSC_structure_dict.py
    2. Make sure lines 36 - 39 are commented out in UNSPSC_structure_dict.py
    3. Run matchmaker_sorted_proto.py
  • To run the program with only titles included in the UNSPSC word lists:

    1. Make sure lines 32 - 35 are commented out in UNSPSC_structure_dict.py
    2. Make sure lines 36 - 39 are NOT commented out in UNSPSC_structure_dict.py
    3. Run matchmaker_sorted_proto.py

Iteration 4: Sorted Final

  • Run matchmaker.py

Team Members

Norrec Nieh : tyrannorrec

Jason Zhang : HaozheZhang0818
