Data science before coding

This repository is meant for people who don't know how to code but work, or want to work, with data science 🙃

If you already code and want a repo with a faster pace, check out this one 😎

Disclaimer

This is a collaborative repository, created by the students of Instituto Metrópole Digital from UFRN.

The author of each notebook is properly acknowledged 😉

Choosing a tool

Several tools are available for people with this profile.

In general, they can be grouped into GUI tools and CLI tools:

  • GUI (graphical user interface): All user interaction is done graphically. Examples include Google Spreadsheets and Orange3.
  • CLI (command-line interface): User interaction is done through a programming language. The main open source languages used in data science are Python, R, and Julia.

A very nice alternative that combines a bit of both worlds is the interactive notebook, which originated in Project Jupyter and is now also supported by Google Colaboratory.

This post discusses the main supported languages.

In this repo, we will use notebooks with the Python ecosystem and its main library, Pandas.

All the material was planned so you don't need to learn how to code, but if you do want to, check out this repo.

The notebooks in this repository were either created or translated by the authors indicated.
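
To give a rough idea of what working with Pandas in a notebook cell looks like, here is a minimal sketch. The table, its column names, and its values are made up for illustration and are not taken from any notebook in this repository.

```python
import pandas as pd

# Hypothetical data, invented for illustration only
students = pd.DataFrame({
    "name": ["Ana", "Bruno", "Carla"],
    "course": ["IT", "IT", "Math"],
    "grade": [8.5, 7.0, 9.0],
})

# Average grade per course, similar to a pivot table in a spreadsheet
mean_by_course = students.groupby("course")["grade"].mean()
print(mean_by_course)
```

You can paste this into a Colab or Jupyter cell and run it as is; no setup beyond having Pandas available should be needed.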

Meeting Pandas

Open In Colab Binder Watch on YouTube

[natanlimas][babschlott] Dataframes as databases

Open In Colab Binder Watch on YouTube

[kallil12][eBetcel] Data analysis and presentation

Open In Colab Binder Watch on YouTube

[mildo][isaacgdo] Extraction, transformation and load (ETL)

Open In Colab Binder

Working with multiple bases

[leobezerra][samuellucas97] Combining information from multiple bases

Open in Colab Binder
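
As a hedged illustration of what combining two bases can mean in Pandas, the sketch below joins two small tables on a shared column. The tables and the `student_id` column are hypothetical, and the notebook itself may use different data or a different approach.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only
students = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Ana", "Bruno", "Carla"],
})
grades = pd.DataFrame({
    "student_id": [1, 2, 4],
    "grade": [8.5, 7.0, 6.0],
})

# Inner join: keep only the rows whose student_id appears in both tables
combined = students.merge(grades, on="student_id", how="inner")
print(combined)
```

Changing `how="inner"` to `how="left"` would instead keep every row of `students`, leaving the grade empty where there is no match.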