Draft an overview of v1 release features
WIP...
e-lo committed Apr 16, 2024
1 parent 6131994 commit 166bac7
Showing 1 changed file with 252 additions and 0 deletions.
252 changes: 252 additions & 0 deletions notebook/Release Overview WranglerV1.ipynb
@@ -0,0 +1,252 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Release Overview"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Release Overview of Wrangler v1.0\n",
"\n",
"*Compared to pre-1.0 releases*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Features\n",
"\n",
"1. I/O and Mutation Speed\n",
"2. Flexible serialization formats\n",
"3. Improved stability\n",
"4. Flexible transit selection by any trip or route characteristic, or by link or node\n",
"5. Faster, more flexible conversion of (almost) any part of the data to GeoDataFrames\n",
"6. Easy data clipping to geographic bounds\n",
"7. Scripts for actions you might want to run from the command line (e.g., data conversion or clipping)\n",
"8. Logging\n",
"9. Error messages that tell you what went wrong and what to do about it\n",
"10. Implicit and fast validation\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tech Overhead Investment\n",
"\n",
"1. Separation of causes prevents circular reference collisions, limits import bloat, and improves legibility/organization.\n",
" - Project card functionality in project card repo\n",
" - Separate modules for separate functionality\n",
"2. Reliability achieved through more testable code and expanding test coverage with more test cases and also anti-patterns\n",
" - Testable code\n",
" - Test coverage\n",
"3. Explicit data models that make complex data structures obvious and easy to validate\n",
" - Legible\n",
" - Self-documenting\n",
" - Flexible\n",
" - Validatable\n",
"4. Clean code principles that make code easier to test and maintain\n",
" - more functions that do a single thing...and do it well\n",
" - classes that are small - functions that access or manipulate them\n",
"5. Documentation\n",
" - consistent and detailed functional documentation\n",
" - consistent type hints\n",
" - usage for modules and classes\n",
" - overall documentation leverages the flexible and less bloated MkDocs package\n",
"6. Removes less-well-maintained dependencies\n",
" - Replaces Partridge with internal functionality\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Speed\n",
"\n",
"### I/O Speed\n",
"\n",
"1. Makes the heavy `shapes.geojson` optional and skips reading it for operations that don't directly involve it.\n",
"2. Uses `pandera` for fast, vector-based data model validation of dataframes.\n",
"3. Replaces row-based calculations for blank geographic values with vector-based calculations.\n",
"4. Supports faster I/O serialization formats, such as Parquet."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Selection Speed\n",
"\n",
"1. Caches selections, keyed by hashes, and reuses them if the network hasn’t changed, so that costly selections that involve connecting a shortest path don't have to be performed again."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Project Apply Speed\n",
"\n",
"1. Replaces most row-based functions with vector-based functions, e.g. for new roadways and managed lanes\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setting roadway net speed"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Convert to model net"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Serialization Formats\n",
"\n",
"Supports multiple serialization formats, with an API and a script to convert between them:\n",
"- Parquet\n",
"- GeoJSON/JSON\n",
"- CSV\n",
"- Pickle"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Stability\n",
"\n",
"- Handles many more cases and has been tested on many more.\n",
"- If it fails, it should tell you why and what you need to do."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Transit Selection Features\n",
"\n",
"- Select by any trip or route characteristic\n",
"- Select by nodes or links"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## GeoDataFrames\n",
"\n",
"- All roadway tables are stored in GeoDataFrames for easy viewing\n",
"- Transit data is easily converted to GeoDataFrames"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clipping\n",
"\n",
"- Easily clip roadway or transit features using the API or a script"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Code Complexity\n",
"\n",
"\n",
"\n",
"### Cyclomatic Complexity\n",
"\n",
"[Radon](https://radon.readthedocs.io/) \n",
"\n",
"> Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Maintainability\n",
"\n",
"### Maintainability Index\n",
"\n",
"[Radon](https://radon.readthedocs.io/) \n",
"\n",
"> Maintainability Index is a software metric which measures how maintainable (easy to support and change) the source code is. The maintainability index is calculated as a factored formula consisting of SLOC (Source Lines Of Code), Cyclomatic Complexity and Halstead volume."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
