Skip to content

I am sharing foundational concepts of SQL, just like I am doing with Python, as part of my journey to become a Data Analyst. These are topics that I have studied and continue to study.

License

Notifications You must be signed in to change notification settings

soareseric/learn-sql-for-data-analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 

Repository files navigation

Learn SQL for Data Analysis

I am building this repository for study purposes. I am on the journey to become a Data Analyst, and I want to share what I have learned along the way.

SQL

The Structured Query Language (SQL) was first developed in 1970 by researchers at IBM. The initial version was created to manage and retrieve data from IBM's original relational databases called "System R". A few years later, the SQL language became publicly available. The American National Standards Institute (ANSI) and the International Standard Organization (ISO) then took the SQL language as the standard for communicating with relational databases. While some DBMS (Database Management Systems) have altered the language, the majority still follows the ANSI-approved version of SQL programs.

About the Content

The content of this repository covers the essential SQL commands. Some of the material is based on the book "Getting Started with SQL" by Thomas Nield, published by O'Reilly. However, the SQL code will always be written using MySQL, one of the many DMBS for relational databases available in the market. In some cases, I will highlight the differences in queries and commands between MySQL, SQL Server, and SQLite. Additionally, we will delve into SQL and some NoSQL databases for data analysis.

Note 1: All of this content is written in the form of "notes" by me. I encourage you to take some time and read the official documentation for the SQL language. I will always provide these references and others at the end of each topic.

Note 2: At the time of viewing this, I may still be in the process of developing the content and updating this repository. As a result, you may encounter some unclickable topics. The updates to these topics do not follow a specific order and are based on my needs and study progress. Follow me to stay updated on new upgrades.

Prerequisites

  1. Install MySQL Community Server. and the MySQL Worbench.
  2. Have access and know how to use terminal/command line.

Some of the Basics Commands

DQL (Data Query Language)

  • SELECT: Retrieve data from database.

DML (Data Manipulation Language)

  • INSERT: Insert data into a table.
  • DELETE: Delete data from a database table.
  • UPDATE: Update an existing data within a table.

DDL (Data Definition Language)

  • ALTER: Change the structure of the database.
  • CREATE: Create databases or objects, like tables and views.
  • DROP: Delete objects from database.
  • RENAME: This is used to rename an object existing in the database.

DCL (Data Control Language)

  • GRANT: Give privileges access to database.
  • REVOKE: Withdraws the user's access privileges given by using the GRANT command.

TCL (Transaction Control Language)

  • COMMIT: Commits a Transaction.
  • ROLLBACK: Rollbacks a transaction in case of any error occurs.
  • SAVEPOINT: Sets a save point within a transaction

Queries examples

Sample use case:

You are working for a logistics company that sends emails to customers to inform them about the status of their packages. The company has a database of all the emails sent, which includes the email ID, recipient email address, subject, body, and sending date.

  1. Create the database and its tables:
    CREATE DATABASE logistics_email;
    
    USE logistics_email;
    
    CREATE TABLE emails (
        email_id INT AUTO_INCREMENT PRIMARY KEY,
        recipient VARCHAR(255) NOT NULL,
        subject VARCHAR(255) NOT NULL,
        body TEXT NOT NULL,
        sent_date DATE NOT NULL
     ); 
  1. Insert values to those tables:
    INSERT INTO emails (recipient, subject, body, sent_date)
    VALUES
          ('recipient1@email.com', 'Package Update', 'Your package has been shipped', '2022-01-01'),
          ('recipient2@email.com', 'Package Delivery', 'Your package has been delivered', '2022-01-02'),
          ('recipient3@email.com', 'Package Delay', 'Your package has been delayed', '2022-01-03'),
          ('recipient4@email.com', 'Package Update', 'Your package is on its way', '2022-01-04'),
          ('recipient5@email.com', 'Package Arrival', 'Your package has arrived', '2022-01-05');
  1. Update the email with the subject "Package Delivery" to "Package Delivery Update":
    UPDATE emails
    SET subject = 'Package Delivery Update'
    WHERE email_id = 2;
  1. Add a new column called "status" with "Pending" as its default value:
    ALTER TABLE emails
    ADD COLUMN status VARCHAR(20) NOT NULL DEFAULT 'Pending';
  1. Retrieve the number of emails sent to each recipient email address and the average length of the email bodies:
    SELECT 
        recipient,
        COUNT(email_id) as total_emails,
        AVG(LENGTH(body)) as avg_body_length
    FROM emails
    GROUP BY recipient;
  1. Retrieve the number of emails sent to each recipient email address and the average length of the email bodies only for the emails sent in the last week:
    SELECT 
        recipient,
        COUNT(email_id) as total_emails,
        AVG(LENGTH(body)) as avg_body_length
    FROM emails
    WHERE sent_date >= DATE_SUB(NOW(), INTERVAL 7 DAY)
    GROUP BY recipient;

Table of Contents

  1. Select, Filter, Order and Operationals
  2. Handling with Variables
  3. Joining Tables
  4. Aggregating Data for Analysis
  5. Windows Function, Subqueries and Handling with date
  6. Exploratory Data Analysis
  7. Cleaning and Processing Data
  8. Data Analysis
  9. Programming
  10. Optimizing SQL Queries

Contact me 🔗 👇

EricSoares EricSoares

About

I am sharing foundational concepts of SQL, just like I am doing with Python, as part of my journey to become a Data Analyst. These are topics that I have studied and continue to study.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published