Everything about 📊Data Science, 🎓Computer Science, 🧮Mathematics, 🔬BioInformatics and 🤖Machine Learning+ (final projects, project ideas, how to pass interviews, specializations, roadmaps, and open source curriculums)

mejbass/CyberPolyglot-IT-Hub

Master IT - Mejbass

About

This is a path for those who want to complete a Data Science undergraduate curriculum on their own time, for free, with courses from the best universities in the world.

In our curriculum, we give preference to MOOC (Massive Open Online Course) style courses because these courses were created with our style of learning in mind.

Motivation & Preparation

Here are two interesting links that can make all the difference in your journey.

The first one is a motivational video about a guy who went through the "MIT Challenge", which consists of learning the entire 4-year MIT curriculum for Computer Science in 1 year.

The second link is a MOOC that will teach you learning techniques used by experts in art, music, literature, math, science, sports, and many other disciplines. These are fundamental abilities to succeed in our journey.

Are you ready to get started?

Prerequisites

The Data Science curriculum assumes the student has taken high school math and statistics.

Curricular Guideline

OSSU Data Science uses the report Curriculum Guidelines for Undergraduate Programs in Data Science as its guide for course recommendations.

Curriculum

Introduction to Data Science

What is Data Science

| Courses | Duration | Effort |
| --- | --- | --- |
| Introduction to Data Science | 8 weeks | 10-12 hours/week |
| Data Science - CS109 from Harvard | 12 weeks | 5-6 hours/week |
| The Analytics Edge | 12 weeks | 10-15 hours/week |

Introduction to Computer Science

Students who already know basic programming in any language can skip this first course

Introduction to programming

Introduction to Computational Thinking and Data Science

:octocat: Bonus From the 🎓Computer Science Curriculum 😋

Introduction to Computer Science

This course will introduce you to the world of computer science. Students who have been introduced to programming, either from the courses above or through study elsewhere, should take this course for a flavor of the material to come. If you finish the course wanting more, Computer Science is likely for you! Topics covered:

  • computation
  • imperative programming
  • basic data structures and algorithms
  • and more

| Courses | Duration | Effort | Prerequisites | Discussion |
| --- | --- | --- | --- | --- |
| Introduction to Computer Science and Programming using Python (alternative) | 9 weeks | 15 hours/week | high school algebra | chat |

Data Structures and Algorithms

The Algorithms courses are taught in Java. If students need to learn Java, they should take this course first

Java Programming

Algorithms, Part I

Algorithms, Part II

Databases

Database Management Essentials

Data Warehouse Concepts, Design, and Data Integration

Relational Database Support for Data Warehouses

Business Intelligence Concepts, Tools, and Applications

Design and Build a Data Warehouse for Business Intelligence Implementation

MongoDB for Developers Learning Path

| Courses | Duration | Effort |
| --- | --- | --- |
| Stanford's Database course | - weeks | 8-12 hours/week |

:octocat: Bonus From 🎓Computer Science Curriculum 😋

Core applications

Topics covered:

  • Agile methodology
  • REST
  • software specifications
  • refactoring
  • relational databases
  • transaction processing
  • data modeling
  • neural networks
  • supervised learning
  • unsupervised learning
  • OpenGL
  • ray tracing
  • and more

| Courses | Duration | Effort | Prerequisites | Discussion |
| --- | --- | --- | --- | --- |
| Databases: Modeling and Theory | 2 weeks | 10 hours/week | core programming | chat |
| Databases: Relational Databases and SQL | 2 weeks | 10 hours/week | core programming | chat |
| Databases: Semistructured Data | 2 weeks | 10 hours/week | core programming | chat |
| Machine Learning | 11 weeks | 9 hours/week | Basic coding | chat |

:octocat: Bonus From 🔬BioInformatics Curriculum 😋

| Code | Course | Duration | Effort |
| --- | --- | --- | --- |
| COMP 2312 | Databases | 10 Weeks | 8-12 Hours/Week |

Single Variable Calculus

Calculus 1A: Differentiation

Calculus 1B: Integration

Calculus 1C: Coordinate Systems & Infinite Series

:octocat: Bonus From The 🧮Maths Curriculum 😋

| Courses | Duration | Effort | Prerequisites |
| --- | --- | --- | --- |
| Multivariable Calculus | 12 weeks | 6 hours/week | Calculus 1C |

Linear Algebra

Topics covered:

  • Vector and matrix calculations
  • Linear transformations
  • Vector spaces
  • Eigenvalues and Eigenvectors

Essence of Linear Algebra

Linear Algebra

| Courses | Duration | Effort |
| --- | --- | --- |
| Linear Algebra - Foundations to Frontiers | 15 weeks | 8 hours/week |
| Applications of Linear Algebra Part 1 | 5 weeks | 4 hours/week |
| Applications of Linear Algebra Part 2 | 4 weeks | 5 hours/week |

:octocat: Bonus From 🔬BioInformatics Curriculum 😋

| Code | Course | Duration | Effort |
| --- | --- | --- | --- |
| MATH 1311 | College Algebra and Problem Solving | 4 Weeks | 6 Hours/Week |

Multivariable Calculus

Multivariable Calculus

Python

| Courses | Duration | Effort |
| --- | --- | --- |
| Introduction to Computer Science and Programming Using Python | 9 weeks | 15 hours/week |
| Introduction to Computational Thinking and Data Science | 10 weeks | 15 hours/week |
| Introduction to Python for Data Science | 6 weeks | 2-4 hours/week |
| Programming with Python for Data Science | 6 weeks | 3-4 hours/week |

Statistics & Probability

Introduction to Probability

Statistical Reasoning | - weeks | - hours/week

Intro to Descriptive Statistics

Intro to Inferential Statistics

Introduction to Statistics: Probability | 5 weeks | - hours/week

Introduction to Statistics: Inference | 5 weeks | - hours/week

Statistical Learning with Python by Stanford University on edX, or Statistical Learning with R by Stanford University on edX

:octocat: Bonus From The 🧮Maths Curriculum 😋

Probability & Statistics

Probability is the mathematics of uncertainty. Statistics is the mathematical framework for quantifying uncertainty in real-world data. These two related but distinct fields of study help us describe variation and uncertainty in the world around us. These courses make heavy use of discrete mathematics, linear algebra, and calculus, and serve as a first opportunity to apply what you've learned in the other core courses.

Topics covered:

  • Random variables
  • Expectation and Variance
  • Probability Distributions

| Courses | Duration | Effort | Prerequisites |
| --- | --- | --- | --- |
| Probability | 14 weeks | 12-16 hours/week | Multivariable Calculus, Math for Computer Science, Linear Algebra |
| Statistics for Applications | 14 weeks | 12-16 hours/week | Probability |

:octocat: Bonus From The 🧮Maths Curriculum 😋

Introduction to Analysis

Analysis is the mathematics of sequences and limits. Intro to Analysis is a course that builds on the concepts of Calculus and provides a rigorous and formalized study of the foundations of Calculus. This course will use formal proofs to establish mathematical results, starting by proving the existence of real numbers and building the foundation of single-variable Calculus from scratch.

Topics covered:

  • Proofs
  • Real analysis

| Courses | Duration | Effort | Prerequisites |
| --- | --- | --- | --- |
| Introduction to Analysis | 14 weeks | 8-10 hours/week | Multivariable Calculus |
| Supplemental Lecture Videos | 16 weeks | 8-10 hours/week | Multivariable Calculus |

:octocat: Bonus From 🔬BioInformatics Curriculum 😋

| Code | Course | Duration | Effort |
| --- | --- | --- | --- |
| MATH 1315 | Introduction to Probability and Data (with R) | 5 Weeks | 6 Hours/Week |
| MATH 2314 | Inferential Statistics (with R) | 5 Weeks | 6 Hours/Week |
| MATH 3311 | Linear Regression and Modeling (with R) | 4 Weeks | 6 Hours/Week |
| MATH 3312 | Bayesian Statistics (with R) | 5 Weeks | 6 Hours/Week |

Data Science Tools & Methods

Tools for Data Science

Data Science Methodology

Data Science: Wrangling

Machine Learning/Data Mining

Machine Learning

Intro to Machine Learning

Mining Massive Datasets

Process Mining

| Courses | Duration | Effort |
| --- | --- | --- |
| Learning From Data (Introductory Machine Learning) [caltech] | 10 weeks | 10-20 hours/week |
| Statistical Learning | - weeks | 3 hours/week |
| Stanford's Machine Learning Course | - weeks | 8-12 hours/week |

:octocat: Bonus From 🔬BioInformatics Curriculum 😋

| Code | Course | Duration | Effort |
| --- | --- | --- | --- |
| COMP 2312 | Databases | 10 Weeks | 8-12 Hours/Week |
| COMP 4311 | Data Science | 13 Weeks | 10 Hours/Week |
| COMP 5312 | Deep Learning | 8 Weeks | 6 Hours/Week |
| Extension | Genomic Data Science Specialization | 32 Weeks | 6 Hours/Week |

Final project

OSS University is project-focused. The assignments and exams for each course are to prepare you to use your knowledge to solve real-world problems.

After you've gotten through all of Core CS and the parts of Advanced CS relevant to you, you should think about a problem that you can solve using the knowledge you've acquired. Not only does real project work look great on a resume, but the project will also validate and consolidate your knowledge. You can create something entirely new, or you can find an existing project that needs help via websites like CodeTriage or First Timers Only.

Students who would like more guidance in creating a project may choose to use a series of project-oriented courses. Here is a sample of options. Many more are available, and at this point you should be capable of identifying a series that is interesting and relevant to you:

Project

Complete Kaggle's Getting Started and Playground Competitions

Convex Optimization

| Courses | Duration | Effort |
| --- | --- | --- |
| Convex Optimization | 9 weeks | 10 hours/week |

Data Wrangling

| Courses | Duration | Effort |
| --- | --- | --- |
| Data Wrangling with MongoDB | 8 weeks | 10 hours/week |

Big Data

| Courses | Duration | Effort |
| --- | --- | --- |
| Intro to Hadoop and MapReduce | 4 weeks | 6 hours/week |
| Deploying a Hadoop Cluster | 3 weeks | 6 hours/week |

Database

| Courses | Duration | Effort |
| --- | --- | --- |
| Stanford's Database course | - weeks | 8-12 hours/week |

Natural Language Processing

| Courses | Duration | Effort |
| --- | --- | --- |
| Deep Learning for Natural Language Processing | - weeks | - hours/week |

Deep Learning

| Courses | Duration | Effort |
| --- | --- | --- |
| Deep Learning | 12 weeks | 8-12 hours/week |

Capstone Project

  • Participate in a Kaggle competition
  • List down other ideas

Specializations

After finishing the courses above, start your specializations in the topics that interest you most. You can view a list of available specializations here.

| Courses | Duration | Effort | Prerequisites |
| --- | --- | --- | --- |
| Data Mining (Specialization) | 30 weeks | 2-5 hours/week | machine learning |
| Big Data (Specialization) | 30 weeks | 3-5 hours/week | none |
| Internet of Things (Specialization) | 30 weeks | 1-5 hours/week | strong programming |
| Cloud Computing (Specialization) | 30 weeks | 2-6 hours/week | C++ programming |
| Data Science (Specialization) | 43 weeks | 1-6 hours/week | none |
| Functional Programming in Scala (Specialization) | 29 weeks | 4-5 hours/week | One year programming experience |
| Game Design and Development with Unity 2020 (Specialization) | 6 months | 5 hours/week | programming, interactive design |

How to use this guide

Duration

It is possible to finish within about 2 years if you plan carefully and devote roughly 20 hours/week to your studies. Learners can use this spreadsheet to estimate their end date. Make a copy and input your start date and expected hours per week in the Timeline sheet. As you work through courses you can enter your actual course completion dates in the Curriculum Data sheet and get updated completion estimates.
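If you prefer a script to a spreadsheet, the same estimate can be sketched in a few lines of Python. This is only an illustration: the function names, the start date, and the choice of the upper effort bound for each course are assumptions, and only two courses from the tables above are included.

```python
from datetime import date, timedelta
import math


def total_hours(courses):
    """Sum weeks * hours/week over (name, weeks, hours_per_week) tuples."""
    return sum(weeks * hours for _name, weeks, hours in courses)


def estimate_end_date(start, hours_needed, hours_per_week):
    """Date on which hours_needed is reached at a steady weekly pace."""
    if hours_per_week <= 0:
        raise ValueError("hours_per_week must be positive")
    weeks = math.ceil(hours_needed / hours_per_week)
    return start + timedelta(weeks=weeks)


# Hypothetical sample: two courses, using the upper bound of each effort range.
sample = [
    ("Introduction to Data Science", 8, 12),  # 8 weeks at 10-12 hours/week
    ("The Analytics Edge", 12, 15),           # 12 weeks at 10-15 hours/week
]

hours = total_hours(sample)
end = estimate_end_date(date(2024, 1, 1), hours, hours_per_week=20)
print(hours, end)
```

Extending `sample` to the full course list and changing `hours_per_week` gives a rough finish date for your own pace.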

Order of the classes

Some courses can be taken in parallel, while others must be taken sequentially. All of the courses within a topic should be taken in the order listed in the curriculum. The graph below demonstrates how topics should be ordered.

Topic Progression Graph

Which programming languages should I use?

Python and R are heavily used in the data science community, and our courses teach you both. Remember, the important thing for each course is to internalize the core concepts and to be able to use them with whatever tool (programming language) you wish.

Optional Bioinformatics Courses

1st Year

| Code | Course | Duration | Effort |
| --- | --- | --- | --- |
| Py4E | Python for Everybody | 10 weeks | 10 hours/week |
| 6.00.1x | Introduction to Computer Science and Programming using Python (alt) | 9 weeks | 15 hours/week |
| MATH 1311 | College Algebra and Problem Solving | 4 Weeks | 6 Hours/Week |
| MATH 1312 | Pre-calculus | 4 Weeks | 6 Hours/Week |
| 18.01.1x | Calculus 1A: Differentiation | 13 weeks | 6-10 hours/week |
| 18.01.2x | Calculus 1B: Integration | 13 weeks | 5-10 hours/week |
| MATH 1315 | Introduction to Probability and Data (with R) | 5 Weeks | 6 Hours/Week |

2nd Year

| Code | Course | Duration | Effort |
| --- | --- | --- | --- |
| 18.01.3x | Calculus 1C: Coordinate Systems & Infinite Series | 6 weeks | 5-10 hours/week |
| 6.042J | Mathematics for Computer Science (Solutions) | 13 weeks | 5 hours/week |
| COMP 2312 | Databases | 10 Weeks | 8-12 Hours/Week |
| 18.06 | Linear Algebra and Essence of Linear Algebra | 14 weeks | 12 hours/week |
| COMP 2313 | Introduction to Linux | 8 Weeks | 5-7 Hours/Week |
| MATH 2314 | Inferential Statistics (with R) | 5 Weeks | 6 Hours/Week |

3rd Year

| Code | Course | Duration | Effort |
| --- | --- | --- | --- |
| COMP 3311a | Algorithmic Thinking 1 | 4 Weeks | 6 Hours/Week |
| COMP 3311b | Algorithmic Thinking 2 | 4 Weeks | 6 Hours/Week |
| MATH 3311 | Linear Regression and Modeling (with R) | 4 Weeks | 6 Hours/Week |
| MATH 3312 | Bayesian Statistics (with R) | 5 Weeks | 6 Hours/Week |
| MATH 3313 | Differential Equations | 7 Weeks | 8-10 Hours/Week |

4th Year

| Code | Course | Duration | Effort |
| --- | --- | --- | --- |
| COMP 4311 | Data Science | 13 Weeks | 10 Hours/Week |

Extra Year

| Code | Course | Duration | Effort |
| --- | --- | --- | --- |
| COMP 5311 | Introduction to Machine Learning | 10 Weeks | 6 Hours/Week |
| COMP 5312 | Deep Learning | 8 Weeks | 6 Hours/Week |
| Extension | Genomic Data Science Specialization | 32 Weeks | 6 Hours/Week |

We also have labels to help you have more control through the process. The meaning of each of these labels is:

  • Main Curriculum: cards with this label represent courses that are listed in our curriculum.
  • Extra Courses: cards with this label represent courses that were added by the student.
  • Doing: cards with this label represent courses the student is currently taking.
  • Done: cards with this label represent courses finished by the student. These cards should also link to at least one project or article built with the knowledge acquired in the course.
  • Section: cards with this label represent the sections in our curriculum. They exist only to help organize the Done column. Place each course's card below its respective Section card.
  • Extra Sections: cards with this label represent sections that were added by the student.

The intention of this board is to provide for our students a way to track their progress, and also the ability to show their progress through a public page for friends, family, employers, etc. You can change the status of your board to be public or private.

Should I take all courses?

Yes! The intention is to complete all the courses listed here. We also highly encourage you to go further by reading papers and joining research projects after your coursework is done.

Which programming languages should I use?

List of skills:

  • C/C++
  • Unix System
  • Python/Perl
  • R
  • Algorithms

The skills listed above are the essential tool set that bioinformaticians and computational biologists depend on.

The important thing for each course is to internalize the core concepts and to be able to use them with whatever tool (programming language) that you wish.

Bonus From Maths Curriculum

Curriculum

The curriculum is separated into two parts:

Advanced Topics

Upon finishing all the core mathematics courses, students can choose to take elective courses in advanced topics of their choice. It is not necessary to take every course within a subcategory, but it is recommended to take courses relevant to the intended field of study.

To complete your study of Advanced Topics, meet both the Breadth and Depth requirements.

  • Breadth Requirement: For each of the 6 Advanced Topics below, select one course to take as an elective.
  • Depth Requirement: Select one Advanced Topic below and take 3 additional courses from that topic.
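The two requirements can be read as a simple rule over per-topic course counts. The sketch below is one possible reading, not an official checker: it assumes the 3 depth courses come on top of the breadth elective (so 4 courses in one topic), the function name is hypothetical, the course names are placeholders, and the topic list is passed in by the caller because only some of the topics are named in this section.

```python
def meets_advanced_requirements(completed, topics):
    """Check Breadth and Depth given a dict of topic -> finished courses.

    Breadth: at least one course in every topic.
    Depth (assumed reading): at least 4 courses in some topic, i.e. the
    breadth elective plus 3 additional courses.
    """
    counts = {topic: len(set(completed.get(topic, []))) for topic in topics}
    breadth_ok = all(n >= 1 for n in counts.values())
    depth_ok = any(n >= 4 for n in counts.values())
    return breadth_ok and depth_ok


# Topics named in this section; the full curriculum may list more.
topics = [
    "Mathematical Logic",
    "Probability and Statistics",
    "Mathematical Analysis",
    "Abstract Algebra",
]

# Placeholder course names, one elective per topic: breadth only.
progress = {topic: ["elective"] for topic in topics}
breadth_only = meets_advanced_requirements(progress, topics)

# Add 3 more placeholder courses to one topic to satisfy depth as well.
progress["Mathematical Analysis"] += ["course 2", "course 3", "course 4"]
breadth_and_depth = meets_advanced_requirements(progress, topics)
print(breadth_only, breadth_and_depth)
```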

Mathematical Logic

| Courses | Duration | Effort | Prerequisites |
| --- | --- | --- | --- |
| Introduction to Formal Logic | 15 weeks | 9 hours/week | - |

Probability and Statistics

Combinatorics, probability, statistics, game theory, applied stats

Mathematical Analysis

Real analysis, numerical analysis, complex analysis, optimization theory

Abstract Algebra

Abstract algebra, category theory, algebraic geometry and topology
