ericgordo/Python_Enron_ML_Classifier
This project is my first attempt to use machine learning to build an algorithm that identifies Enron employees who may have committed fraud, based on the public Enron financial and email dataset. This repository contains the files used to create my algorithm, as well as some additional files that document my work.

The files in this repository are listed below:

- poi_id.py: the final standalone Python code that creates the machine learning classifier, feature list, and dataset to run in the tester code.
- feature_format.py: helper code to format data in poi_id.py and in the notebooks.
- tester.py: this Python file can be run after poi_id.py to test the machine learning algorithm.
- Just_Code_Notebook: all investigation and experimentation was done in a Jupyter (IPython) Notebook. These files contain just the code that was run to find the optimized machine learning algorithm.
- Final_Notebook: includes all code found in Just_Code_Notebook, along with my thoughts, written notes, and opinions throughout. My final conclusions are found in this notebook.

Resources and Works Cited:

- Enron Email Dataset: https://www.cs.cmu.edu/~./enron/
- Article used to identify POIs (persons of interest) from Enron: http://usatoday30.usatoday.com/money/industries/energy/2005-12-28-enron-participants_x.htm
- General machine learning support and help: https://www.udacity.com/course/intro-to-machine-learning--ud120 (and all forums included)
- Python coding support and help: http://stackoverflow.com/
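
To give a feel for the workflow described in the file list (format the dataset, then test a classifier on it), here is a minimal sketch in plain Python. It is not the code from this repository: the function names `to_rows` and `precision_recall`, the toy records, the "NaN" placeholder handling, and the bonus threshold are all hypothetical and chosen only for illustration. A simple threshold rule stands in for the actual classifier.

```python
# Hypothetical sketch; none of these names or values come from the repository.

def to_rows(data_dict, feature_list):
    """Turn {name: {feature: value}} into (labels, features).

    The first entry of feature_list is treated as the label (assumed here
    to be "poi"). Missing values, or the string "NaN", become 0.0.
    """
    labels, features = [], []
    for name in sorted(data_dict):  # sorted for a reproducible order
        record = data_dict[name]
        row = []
        for key in feature_list:
            value = record.get(key, "NaN")
            row.append(0.0 if value == "NaN" else float(value))
        labels.append(int(row[0]))
        features.append(row[1:])
    return labels, features


def precision_recall(true_labels, predicted_labels):
    """Compute precision and recall for the positive class (label 1)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up records; the real dataset is not reproduced here.
toy_data = {
    "PERSON A": {"poi": True, "bonus": 900},
    "PERSON B": {"poi": False, "bonus": "NaN"},
    "PERSON C": {"poi": True, "bonus": 50},
    "PERSON D": {"poi": False, "bonus": 700},
}
feature_list = ["poi", "bonus"]  # label first, then the features

labels, features = to_rows(toy_data, feature_list)

# Stand-in "classifier": flag anyone whose bonus exceeds an arbitrary threshold.
predictions = [1 if row[0] > 500 else 0 for row in features]

precision, recall = precision_recall(labels, predictions)
print("precision:", precision, "recall:", recall)
```

Precision and recall are used here because a dataset like this has few positive cases, so plain accuracy can look good even when no one is flagged correctly.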
About
A Project Utilizing Machine Learning Techniques