Skip to content

edX--MSDS-Curriculum--DAT206x for DAT206x Analyzing and Visualizing Data with Excel course on edX as part of the Microsoft Data Science curriculum

Notifications You must be signed in to change notification settings

achoczaj/edX--MSDS-Curriculum--DAT206x

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 

Repository files navigation

edX--MSDS-Curriculum--DAT206x

for DAT206x Analyzing and Visualizing Data with Excel course on edX as part of the Microsoft Data Science curriculum DAT206x Analyzing and Visualizing Data with Excel

Excel is one of the most widely used solutions for analyzing and visualizing data. Beginning with Excel 2010, new tools were introduced to enable the analysis of more data, to improve visualizations and to enable more sophisticated business logic.

In this course, you will learn about latest versions of these tools in Excel 2016. You will see how to import data from different sources, create mashups between data sources, and prepare the data for analysis. You will learn about how business calculations—from simple to more advanced—can be expressed using the DAX calculation engine. You will also learn how the data can be visualized in Excel and shared to the Power BI cloud service.

The course is designed for self-paced study of around 2-4 hours per week for six weeks, including lectures, quizzes, labs and further readings. All quizzes and lab exercises are graded. The quizzes account for 30% of the total grade, the lab exercises accounts for 65% of the total grade, and the mandatory survey accounts for the remaining 5%. You must achieve an overall score of 70% to pass the course.

Course Outline

Module 1: Data Analysis in Excel

  • Perform data analysis in Excel using classic tools, such as VLOOKUP function, pivot tables, pivot charts, and slicers, on data that is already in a worksheet / grid data.
    • Use VLOOKUP to combine data into one range.
    • Connect a slicer with pivot tables and pivot charts components.
    • Conditionally format cells.

Lab 1: Explore and Extend a Classic Excel Dashboard

See the "Dashboard - CA" worksheet, showing six pivot charts of different types, with associated slicers to filter the data. Play around with the slicers to select different filters and see how your choices affect the charts. Unhide the hidden worksheets to view the data source of the charts.

  • Exercise 1: Explore the Classic Excel Dashboard
  • Exercise 2: Extend the Classic Excel Dashboard

Module 2: The Excel Data Model

  • Build a Excel Data Model from a single flat table

    • use queries (Power Query add-in in Excel 2013 and Excel 2010)
    • create a calculated column on a table in Excel Data Model
  • Manage Excel Data Model

    • Diagram view of Excel Data Model
    • Types of relationships supported in the Excel Data Model
  • Basic Data Analysis Expressions (DAX)

    • use DAX in Calculated columns
    • use DAX in Measures (in Excel 2013, measures are called calculated fields)

    Measure (in Excel) is A calculation that you create for the purpose of measuring an outcome or result relative to other factors.

    • create implicit measures

    An implicit measure is one that Excel generates for you when you add fields to the Values area of a pivot table.

    • create explicit measures

    An explicit measure is one that you create manually using e.g. DAX.

    • RELATED function (DAX)

    RELATED function (DAX) returns a single value that is related to the current row from another table.

Lab 2: Explore an Excel Data Model

In this lab, you will explore an Excel workbook that has a data model loaded into it. You will also create calculated columns in the data model, apply formatting, and create implicit and explicit measures. You will then use the data model to create pivot tables and perform some analysis with the data.

  • Exercise 1: Explore the Excel Data Model
  • Exercise 2: Create a Pivot Table
  • Exercise 3: Create Measures

Module 3: Importing Data from a CSV File to Data Model (a PDF using Flash Fill)

  • Importing Data from a CSV / XML File to Data Model

    • data pre-processing steps recorded in the Query Editor: Remove columns, Split column, Replace values
  • Remarks on Using Excel 2010 to import data

    In Excel 2010, you cannot bring the data directly to Power Pivot using Power Query. When you import data using Power Query, you can either import to Excel worksheet or only create a connection. And then, from the Power Pivot ribbon you can use Create Linked Table to add the table to the Power Pivot model.

  • Importing Data from a PDF to Data Model using Flash Fill

    To learn more about Flash Fill, check out the following resource: https://support.office.com/en-za/article/Flash-Fill-3fb96b4a-ee83-4493-af45-6522324477bd

Lab 3: Importing Data from a CSV File

In this lab, you will import data to Excel from a flat csv file. You will perform pre-processing steps with the data prior to loading it into Excel.

  • Exercise 1: Import Data from a CSV File
  • Exercise 2: Create Pivot Table(s) to Perform Analysis

Module 4: Importing Data from Databases

  • Importing Data from Databases

    • import multiple tables from a SQL database, and create an Excel data model from the imported data
    • data sources included as built-in options for queries (e.g. SharePoint list, Hadoop file, Active Directory, Salesforce Objects)
  • Importing Data from Multiple Files

    • initial step you need to do in the Query Editor before loading the data into the Excel data model

    Filter the header from the three other CSV files.

  • Creating and using a Calendar Table in a data model

    • Two ways to create a calendar table in Excel 2016:
      • New Date Table function from the Design tab of the Power Pivot for Excel window
      • create a calendar table in the worksheet and add it to the Excel data model
    • Purposes of using a calendar table in a data model:
      • filter data by year, month, or week
      • perform advanced calculations such as year-over-year comparison

Lab 4: Creating Mash-ups of Data from Multiple Sources

This lab comprises of three exercises: In the first exercise, you will import data to Microsoft Excel from a SQL database on Azure. Once you have imported the data, you will explore existing table relationships and create a new one yourself. In the second exercise, you will import data from CSV files which resides in a file folder. You will append this new data to the corresponding existing data that comes from the SQL Database. In the third exercise, you will create a Date table in the data model to be used for data analysis.

  • Exercise 1: Import Data from SQL Database and Create Table Relationship
  • Exercise 2: Import Data from a Folder Containing CSV Files
  • Exercise 3: Create a Date Table

Module 5: Creating and Formatting Measures

Lab 5:

In this lab, you will write several DAX expressions to create measures to be used to analyze VanArsdel’s sales data. Specifically, you will create the following measures:

  • Total Sales: calculates the total sales.
  • LY Sales: calculates last year sales.
  • Sales Var: calculates sales variance between this year and last year sales.
  • Sales Var %: calculates sales variance between this year and last year sales in percentage.
  • YTD Sales: calculates YTD sales.
  • LY YTD Sales: calculates last year YTD sales.
  • YTD Sales Var: calculates sales variance between this year and last year YTD sales.
  • YTD Sales Var %: calculates sales variance between this year and last year YTD sales in percentage.
  • Total VanArsdel Sales: calculates sales for VanArsdel manufactured goods.
  • % Sales Market Share: calculates the percentage of VanArsdel manufactured goods from the total sales.

Exercises:

  • Exercise 1: Last Year Comparison

    Total Units: Total Units:=SUM([Units])

    LY Total Units: LY Total Units:=CALCULATE([Total Units],SAMEPERIODLASTYEAR('Calendar'[Date]))

    Total Units Var: Total Units Var:=[Total Units]-[LY Total Units]

    Total Units Var %: Total Units Var %:=DIVIDE([Total Units Var],[LY Total Units])

  • Exercise 2: Year to Date

    YTD Total Units: YTD Total Units:=TOTALYTD([Total Units],'Calendar'[Date])

    LY YTD Total Units: LY YTD Total Units:=CALCULATE([YTD Total Units],SAMEPERIODLASTYEAR('Calendar'[Date]))

    YTD Total Units Var: YTD Total Units Var:=[YTD Total Units]-[LY YTD Total Units]

    YTD Total Units Var %: YTD Total Units Var %:=DIVIDE([YTD Total Units Var],[LY YTD Total Units])

  • Exercise 3: Own Brand (VanArsdel) Sale Market Share

    Total OwnBrand Units: Total VanArsdel Units:=CALCULATE([Total Units], Manufacturer[Manufacturer]="VanArsdel")

    % OwnBrand Units Market Share: % Units Market Share:=IF([Total VanArsdel Units]=0, 0, DIVIDE([Total VanArsdel Units], [Total Units], 0))


Module 6: Importing Data from a Formatted Excel Report

  • Features in Query Editor
    • using First Row As Headers feature in Query Editor to create table headers from one of the rows
    • using Unpivot Columns feature to transform selected columns into attribute-value pairs
    • using Filters feature to remove rows based on their values

Lab 6: Advanced Text Query

In this lab, you will import data from a formatted Excel file to a blank Excel file.

  • Exercise 1: Import Data from a Formatted Excel File
    1. Import the data from the Excel file by creating a new Query.
    2. Edit the query and perform the following steps:
    • Name the Query IndustryReport.
    • Remove the first and second rows of the table.
    • Promote the row that represents the title of the column to the table headers.
    • Fill the Category column with the correct categories.
    • Remove all the rows that represent total values.
    • Remove the remaining rows that do not represent data.
    • Remove the column that represent total values.
    • Remove the remaining columns that do not represent data.
    • Transform the monthly columns to rows and name the resulting columns appropriately.
    • Rename the resulting columns appropriately.
    1. Load the data into Excel data model.

Module 7:

Lab 7:


Module 8:

Lab 8:


My notes

Course Progress for Student 'achoczaj'

Your enrollment: Audit track

Course start: 2017.02.20

Course end: 2017.

Course progress: Total result = %

About

edX--MSDS-Curriculum--DAT206x for DAT206x Analyzing and Visualizing Data with Excel course on edX as part of the Microsoft Data Science curriculum

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published