Machine Learning Project to build an algorithm which identifies Enron Employees who may have committed fraud based on the public Enron financial and email dataset.

Ashish25/ML_Spam_Detection

Identify Fraud from Enron Email

Enron Scandal: The Fall of a Wall Street Darling

Project Overview

I played detective :shipit: and put my machine learning skills to use by building an algorithm that identifies Enron employees who may have committed fraud, based on the public Enron financial and email dataset.

In 2000, Enron was one of the largest companies in the United States. By 2002, it had collapsed into bankruptcy due to widespread corporate fraud. In the resulting Federal investigation, a significant amount of typically confidential information entered into the public record, including tens of thousands of emails and detailed financial data for top executives. In this project, you will play detective, and put your new skills to use by building a person of interest identifier based on financial and email data made public as a result of the Enron scandal. To assist you in your detective work, we've combined this data with a hand-generated list of persons of interest in the fraud case, which means individuals who were indicted, reached a settlement or plea deal with the government, or testified in exchange for prosecution immunity.

Highlights of the project:

  • Deal with an imperfect, real-world dataset (Class Imbalance problem)
  • Validate a machine learning result using test data (K-fold cross validation, SelectKBest)
  • Evaluate a machine learning result using quantitative metrics (accuracy, precision, recall)
  • Create, select and transform features (sklearn.preprocessing)
  • Compare the performance of a few machine learning algorithms (Naive Bayes, SVM, DecisionTree)
  • Tune machine learning algorithms for maximum performance
  • Communicate your machine learning algorithm results clearly
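The class imbalance and metric points above are easier to see with numbers. The sketch below is only an illustration, not the project's actual code: the labels are made up (100 people, 12 of them persons of interest), and the two "classifiers" are hypothetical prediction lists. It uses plain Python so it runs without any dependencies; `sklearn.metrics` offers `accuracy_score`, `precision_score` and `recall_score` for the same calculations.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = person of interest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def precision(y_true, y_pred):
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    # Define precision as 0.0 when nothing was flagged, to avoid dividing by zero.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up, imbalanced toy labels: 12 persons of interest out of 100 people.
y_true = [1] * 12 + [0] * 88

# Hypothetical classifier A: never flags anyone.
always_negative = [0] * 100

# Hypothetical classifier B: finds 6 of the 12 POIs and raises 4 false alarms.
some_model = [1] * 6 + [0] * 6 + [1] * 4 + [0] * 84

for name, y_pred in [("always negative", always_negative), ("some model", some_model)]:
    print(
        f"{name}: accuracy={accuracy(y_true, y_pred):.2f} "
        f"precision={precision(y_true, y_pred):.2f} "
        f"recall={recall(y_true, y_pred):.2f}"
    )
```

On these toy labels the classifier that never flags anyone still scores 0.88 accuracy while its recall is 0, which is why precision and recall are reported alongside accuracy when the positive class is rare.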
