Final project for R course at Hult, conducting an analysis about whether cybersecurity is still a business problem, specifically about passwords.

ducn1806/Passwords

Passwords: should we care about them?

This was the final project for the Data Science: R (DAT-5301) course at Hult International Business School. It explored three datasets covering password strengths, ranks, categories, and data breaches. The purpose of the project was to analyze why passwords remain a business (cybersecurity) concern and to share the resulting conclusions.


Project Requirements

Framing the Problem:

  1. Problem recognition:
  • What is the business problem that you can analyze from this dataset? Why is it relevant?
  2. Review of previous findings:
  • Where does your research lead you? Are there key insights that you found from your research about the business problem? This is the area where, as a team, you look into business articles (WSJ / Economist / Financial Times) to highlight the business problem that you are trying to explore.
  • What is the testable hypothesis / thought process that you established based on your initial research? Your analysis can be predictive or inferential. If your analysis is predictive, there is no hypothesis; instead, you report model performance.

Solving the Problem:

  1. Variable Selection: Introduce your Data using key attributes. What is the data about?

  2. Data collection: What are the data sources that you collected?

  3. Data Analysis: Summarization and Visualization (5-7 charts / analyses)

  • What are the key trends and patterns that you find in the data? Each trend/chart should have 3-4 lines explaining why that trend/chart is important. How does it add value to your data analysis project?
  • Are there outliers in your data? What charts/visualizations did you use to identify them? How did you handle your outliers?
  4. What updates/modifications did you make to your initial hypothesis / thought process after summarization and visualization?
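
One common way to approach the outlier question above is the 1.5 × IQR rule, which is also the rule behind the whiskers of a default R boxplot. The sketch below is only an illustration: the `crack_time` vector holds made-up values, not values from the project datasets, and the project itself may have handled outliers differently.

```r
# Hypothetical data: time to crack a password, in seconds.
# These values are invented for illustration only.
crack_time <- c(0.5, 1.2, 2.0, 2.4, 3.1, 3.3, 4.0, 250.0)

# First and third quartiles (R's default quantile method, type 7)
q <- unname(quantile(crack_time, probs = c(0.25, 0.75)))
iqr <- q[2] - q[1]

# Fences of the 1.5 * IQR rule
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr

# Flag values that fall outside the fences
is_outlier <- crack_time < lower | crack_time > upper
outliers <- crack_time[is_outlier]
print(outliers)

# A boxplot shows the same points visually
boxplot(crack_time, main = "Hypothetical crack times (s)")
```

Whether flagged values are removed, capped, or kept (for example after a log transform) is a judgment call that should be justified in the write-up.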

Modelling and Communication:

  1. Modelling: OLS and/or logistic regression to identify relations/connections in the data

  2. Results presentation:

  • Validate your hypothesis / thought process. What are your inferences / model performance?
  • Prepare your R Markdown for presentation.
  • What are your 3 specific insights from the data analysis? Connect your data analysis from Stage 1 and modelling from Stage 3 to support your findings. You are also expected to use domain knowledge (i.e. research from external sources). Make sure to cite your sources.
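
As a rough illustration of the modelling step above, the sketch below fits an OLS model with `lm()` and a logistic model with `glm()` from base R. The data frame `pw` and its columns (`pw_length`, `strength`, `breached`) are simulated and hypothetical; they are assumptions for this example and do not come from the project datasets.

```r
# Simulated, hypothetical data for illustration only
set.seed(42)
n <- 200
pw <- data.frame(pw_length = sample(4:16, n, replace = TRUE))

# Continuous outcome for OLS: a made-up strength score
pw$strength <- 2 + 0.8 * pw$pw_length + rnorm(n)

# Binary outcome for logistic regression: a made-up breach indicator
pw$breached <- rbinom(n, 1, plogis(3 - 0.4 * pw$pw_length))

# OLS: how does the strength score vary with password length?
ols_fit <- lm(strength ~ pw_length, data = pw)

# Logistic regression: how does the breach probability vary with length?
logit_fit <- glm(breached ~ pw_length, data = pw, family = binomial)

# Coefficients, standard errors, and p-values for both models
summary(ols_fit)
summary(logit_fit)
```

The `summary()` output supports the inference questions above; for a predictive analysis, performance on held-out data would be reported instead.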
