At the Sustainable Aviation Initiative (SAI), we push for more sustainable air travel through data-driven insights and data science. We are a diverse group of volunteers of all ages, interests, and backgrounds: aviation enthusiasts from all areas joining forces to make air travel ready for a sustainable future.
This repository collects useful scripts, projects, and analyses.
We would love to hear from you. Leave us a message at https://www.linkedin.com/company/sustainableaviationinitiative.
One dashboard summarizes SAI's reach across platforms. The KPI is total unique visitors per week.
The data analysis is live at https://share.streamlit.io/philippschmalen/sai-data-science.
```
./                      # project root
├───data
│   ├───interim
│   ├───processed
│   └───raw
├───env                 # contains conda environment.yml to create conda env
├───notebooks
├───references
│   └───images          # used for readme
└───src
    ├───app             # contains streamlit app dashboard.py
    │   └───util        # utility functions
    └───data            # data processing script
```
The following assumes the Python package manager Anaconda/Miniconda. First, set up your conda virtual environment to be able to run the code:

- open the Anaconda prompt in your project folder
- navigate to `./env` and run `conda env create -f environment.yml`
- ensure that your Python builder points to the `python.exe` from the environment we just created. For example, I use the project settings in Sublime to point to the `python.exe` of the `sai` virtual environment. See below for my exemplary project settings.
```json
{
    "build_systems":
    [
        {
            "name": "Anaconda Python Builder",
            "selector": "source.python",
            "shell_cmd": "C:/Users/phili/Miniconda3/envs/sai/python.exe -u \"$file\""
        }
    ],
    "folders":
    [
        {
            "name": "SAI social media analysis",
            "path": "[MY PROJECT PATH]"
        }
    ]
}
```
- ensure that data files exist for Anchor and LinkedIn in `./data/raw/`
- you find the data processing script in `./src/data/process_data.py`; change the data directories according to your OS and needs
Refer to Streamlit's official getting started guide. Follow the steps above to install the environment, or run `pip install streamlit pandas plotly`.

Go to `./src/app`, open an Anaconda prompt, and run:

```shell
conda activate sai
streamlit run streamlit_app.py
```

A browser window should open and show the dashboard.
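To illustrate the KPI above, here is a minimal sketch of how total unique visitors per week could be computed with pandas. The column names (`date`, `visitor_id`) and the ISO-week aggregation are assumptions for illustration, not the actual implementation in `dashboard.py`:

```python
import pandas as pd

def weekly_unique_visitors(df: pd.DataFrame) -> pd.Series:
    """KPI: total unique visitors per week, from rows of (date, visitor_id)."""
    out = df.copy()
    # parse dates as UTC so weeks are computed consistently across platforms
    out["date"] = pd.to_datetime(out["date"], utc=True)
    # group by ISO calendar week and count distinct visitors
    return out.groupby(out["date"].dt.isocalendar().week)["visitor_id"].nunique()

# Illustrative data standing in for the per-platform exports
data = pd.DataFrame({
    "date": ["2021-03-01", "2021-03-02", "2021-03-08"],
    "visitor_id": ["a", "b", "a"],
})
kpi = weekly_unique_visitors(data)

# In streamlit_app.py, this series could feed a chart, e.g.:
#   import streamlit as st
#   st.line_chart(kpi)
```

Keeping the KPI logic in a plain function like this keeps it testable without a running Streamlit server.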
A checklist for deployment on Streamlit Sharing:

1. Ensure a public repository.
2. A repository with submodules or subdirectories will not work.
3. Put `requirements.txt` into the repository root folder `./` with a minimal set of packages, for example:

   ```
   pandas==1.2.3
   pyyaml==5.4.1
   streamlit==0.79.0
   plotly==4.14.1
   ```

   Common mistakes are incorrect filenames, such as `requirement.txt` or `requirements.txt.txt` instead of `requirements.txt`.
4. Any files called from within `./src/app/streamlit_app.py` have to be referenced relative to the root folder of the repo, `./`. Relative paths that navigate one level up, like `../`, will not work within `streamlit_app.py`. Reference files from the repository root `./`, for example:

   ```python
   # DOESN'T WORK: open settings.yml via a relative path one level up
   with open('../settings.yml') as file:
       [...]

   # WORKS: open settings.yml from the repo root dir
   with open('./src/settings.yml') as file:
       [...]
   ```
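One way to avoid hardcoding such paths everywhere is a small helper that builds them from the working directory. This is a hypothetical sketch, not part of the repo; it assumes the app is launched with the repository root as the working directory, which is why root-relative paths work on Streamlit Sharing:

```python
from pathlib import Path

# Assumption: the process starts in the repository root (as on Streamlit
# Sharing), so the current working directory is './'.
REPO_ROOT = Path.cwd()

def from_root(relative_path: str) -> Path:
    """Turn a repo-root-relative path like 'src/settings.yml' into an absolute one."""
    return REPO_ROOT / relative_path

settings_path = from_root("src/settings.yml")
```

Centralizing path construction in one helper means a future change to the layout only touches one place.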
To add new data points after a certain time, proceed as follows. Important: store the data in the directory where `./src/data/process_data.py` expects it to be. This is `./data/raw` by default.

You find the analytics data under `More > analytics > Tweets`. Select the last 28 days and export the data by day to the project data folder `./data/raw`.
Explore ways to merge social media data across platforms. Make daily follower statistics from LinkedIn and Anchor FM (manually) accessible. Focus on the total sum, such as total page views or total plays.
- load files from the directory where the raw data lives (e.g. `../../data/raw`)
- create a common `date` column with datetime format UTC
- drop date duplicates
- transform into long format, which yields:

  ```
                         date platform  value
  0 2020-10-08 00:00:00+00:00   anchor    0.0
  1 2020-10-09 00:00:00+00:00   anchor    0.0
  2 2020-10-10 00:00:00+00:00   anchor    0.0
  ```

- export as csv, e.g. to `../../data/processed`
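The steps above can be sketched with pandas. The column names and example values are assumptions for illustration; they are not necessarily the exact ones `process_data.py` uses:

```python
import pandas as pd

def to_long_format(df: pd.DataFrame, platform_columns: list) -> pd.DataFrame:
    """Normalize a wide per-platform table into long format (date, platform, value)."""
    out = df.copy()
    # common date column in UTC datetime format
    out["date"] = pd.to_datetime(out["date"], utc=True)
    # drop date duplicates, keeping the first observation
    out = out.drop_duplicates(subset="date")
    # wide -> long: one row per (date, platform) pair
    return out.melt(id_vars="date", value_vars=platform_columns,
                    var_name="platform", value_name="value")

# Illustrative stand-in for the files loaded from ../../data/raw
wide = pd.DataFrame({
    "date": ["2020-10-08", "2020-10-09", "2020-10-09", "2020-10-10"],
    "anchor": [0.0, 0.0, 0.0, 0.0],
    "linkedin": [3.0, 5.0, 5.0, 7.0],
})
long_df = to_long_format(wide, ["anchor", "linkedin"])
# Export step, as in the pipeline above:
# long_df.to_csv("../../data/processed/followers_long.csv", index=False)
```

The long format makes it easy to plot all platforms with one grouped chart instead of one trace per column.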