diff --git a/.gitignore b/.gitignore index b465573..a949e52 100644 --- a/.gitignore +++ b/.gitignore @@ -14,3 +14,9 @@ __pycache__/ \.DS_Store notebooks/\.ipynb_checkpoints/ + +coverage.xml + +.coverage.* + +*.nc diff --git a/.travis.yml b/.travis.yml index 178e4c3..cb18064 100644 --- a/.travis.yml +++ b/.travis.yml @@ -61,6 +61,6 @@ install: - pip install -e . script: - pip install pytest pytest-cov coveralls -- pip install jupyter pandas plotnine holoviews terrainbento -- pytest --cov-report=xml:$(pwd)/coverage.xml +- pip install jupyter pandas plotnine holoviews terrainbento tqdm +- pytest umami tests/ --doctest-modules --cov=umami --cov-report=xml:$(pwd)/coverage.xml -vvv after_success: coveralls diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index 0be77b5..75af011 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -21,7 +21,7 @@ Examples of unacceptable behavior by participants include: * The use of sexualized language or imagery and unwelcome sexual attention or advances * Trolling, insulting/derogatory comments, and personal or political attacks * Public or private harassment -* Publishing others" private information, such as a physical or electronic address, without explicit permission +* Publishing others' private information, such as a physical or electronic address, without explicit permission * Other conduct which could reasonably be considered inappropriate in a professional setting ## Our Responsibilities diff --git a/Makefile b/Makefile index 0f5327d..62d1aa2 100644 --- a/Makefile +++ b/Makefile @@ -70,10 +70,9 @@ coverage: ## check code coverage quickly with the default Python docs: ## generate Sphinx HTML documentation, including API docs rm -f docs/umami.rst rm -f docs/modules.rst - sphinx-apidoc -o docs/ umami $(MAKE) -C docs clean $(MAKE) -C docs html - $(BROWSER) docs/_build/html/index.html + $(BROWSER) docs/build/html/index.html install: clean ## install the package to the active Python's site-packages python setup.py develop diff --git a/README.md b/README.md index df2d1d5..838a33f 100644 --- a/README.md +++ b/README.md @@ -1,65 +1,108 @@ [![Documentation Status](https://readthedocs.org/projects/umami/badge/?version=latest)](https://umami.readthedocs.io/en/latest/?badge=latest) [![Build Status](https://travis-ci.org/TerrainBento/umami.svg?branch=master)](https://travis-ci.org/TerrainBento/umami) [![Build status](https://ci.appveyor.com/api/projects/status/0ehba569dttgsuyv?svg=true)](https://ci.appveyor.com/project/kbarnhart/umami) -[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/TerrainBento/umami/master) [![Coverage Status](https://coveralls.io/repos/github/TerrainBento/umami/badge.svg?branch=master)](https://coveralls.io/github/TerrainBento/umami?branch=master) [![Anaconda-Server Badge](https://anaconda.org/conda-forge/umami/badges/installer/conda.svg)](https://conda.anaconda.org/conda-forge) +[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/TerrainBento/umami/master?filepath=notebooks%2FWelcome.ipynb) -# Umami +# What is it? -Umami calculates topographic metrics for use in assessing model-data fit. +Umami is a package for calculating objective functions or objective function +components for Earth surface dynamics modeling. It was designed to work well +with +[terrainbento](https://github.com/TerrainBento/terrainbento) and other models built with the +[Landlab Toolkit](https://github.com/landlab/landlab). 
Examples can be
+found in the `notebooks` directory (or on Binder
+[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/TerrainBento/umami/master?filepath=notebooks%2FWelcome.ipynb)
+).

-It is presently under construction.

+Umami offers two primary classes:
+* a [`Residual`](https://umami.readthedocs.io/en/latest/umami.residual.html#Residual),
+which represents the difference between model and data, and
+* a [`Metric`](https://umami.readthedocs.io/en/latest/umami.metric.html),
+which is a calculated value on either model or data.

-There will eventually be a binder link here to some notebooks that provide introductory examples.

+The set of currently supported calculations is found in the [`umami.calculations`](https://umami.readthedocs.io/en/latest/umami.calculations.html) submodule.
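+
+For example, a `Metric` can be described entirely by a small input file and then
+evaluated against a Landlab grid in a few lines. The sketch below is
+illustrative only: the `aggregate` and `hypsometric_integral` calculations and
+their parameters are meant to show the shape of the input-file interface, and
+the full set of supported calculations is documented in
+[`umami.calculations`](https://umami.readthedocs.io/en/latest/umami.calculations.html):
+
+```python
+from io import StringIO
+
+from landlab import RasterModelGrid
+
+from umami import Metric
+
+# A small synthetic landscape.
+grid = RasterModelGrid((10, 10))
+z = grid.add_zeros("node", "topographic__elevation")
+z += grid.x_of_node + grid.y_of_node
+
+# Each top-level key names one calculation to perform.
+input_file = StringIO("""
+me:
+    _func: aggregate
+    method: mean
+    field: topographic__elevation
+hi:
+    _func: hypsometric_integral
+""")
+
+metric = Metric(grid)
+metric.add_from_file(input_file)
+metric.calculate()
+for name in metric.names:
+    print(name, metric.value(name))
+```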
-# Getting Help

+# What does it do well?

-Use the [Issues Page]() to ask questions and get help. Please search open and closed issues to identify if a question has already been asked. We welcome all questions.

+Umami was designed to provide an input-file based interface for calculating
+single-value landscape metrics for use in model analysis. This supports
+reproducible analysis and systematic variation in metric construction. When
+used with `terrainbento`, one input file can describe the model run, and one
+input file can describe the model assessment or model-data comparison. This
+streamlines model analysis applications. Umami also provides multiple output
+formats (YAML and Dakota), the latter of which is designed to interface with
+Sandia National Laboratory's [Dakota package](https://dakota.sandia.gov).

-# Contributing

+To get a sense of how it is meant to be used, check out the
+[notebooks on Binder](https://mybinder.org/v2/gh/TerrainBento/umami/master?filepath=notebooks%2FWelcome.ipynb)
+and the [API documentation](https://umami.readthedocs.io/en/latest/).

-User contributions are welcome pull requests on a development branch. Umami strives for a meaningful 100% code coverage, adherence to [PEP8](), low lint levels using [Flake8](), and [Black style](). Many of these standards are enforced by continuous integration tests and the development team will help you bring contributions into compliance. Please see the [Development Practices]() page for more information.

+# Where to get it

-# Installation instructions
-
-**WARNING: conda and pip distributions are not presently active**
-Before installing umami you will need a python distribution. We recommend that you use the [Anaconda python distribution](https://www.anaconda.com/download/). Unless you have a specific reason to want Python 2.7 we strongly suggest that you install Python 3.7 (or the current 3.* version provided by Anaconda).
-
-To install the release version of umami (this is probably what you want) we support conda and pip package management.

+To install the release version of umami (this is probably what you want) we
+support [conda](https://anaconda.org/conda-forge/umami) and
+[pip](https://pypi.org/project/umami/) package management.

## Using conda
+
Open a terminal and execute the following:

```
-conda config --add channels conda-forge
-conda install umami
+$ conda config --add channels conda-forge
+$ conda install umami
```

## Using pip
+
Open a terminal and execute the following:

```
-pip install umami
+$ pip install umami
```

## From source code

-To install the umami source code version of umami we recommend creating a conda environment for umami.
+The source code is housed on [GitHub](https://github.com/TerrainBento/umami).
+To install umami from source code we recommend creating a conda environment.

```
-git clone https://github.com/TerrainBento/umami.git
-cd umami
-conda env create -f environment-dev.yml
-conda activate umami-dev
-python setup.py install
+$ git clone https://github.com/TerrainBento/umami.git
+$ cd umami
+$ conda env create -f environment-dev.yml
+$ conda activate umami-dev
+$ python setup.py install
```

-#### A note to developers

+If you are interested in developing umami, please check out the
+[development practices](https://umami.readthedocs.io/en/latest/development_practices.html)
+page.
+
+# Read the documentation
+
+Documentation is housed on [ReadTheDocs](https://umami.readthedocs.io).
+
+# License
+
+[MIT](https://github.com/TerrainBento/umami/blob/master/LICENSE)
+
+# Report issues and get help
+
+Umami uses GitHub Issues as a single point of contact for users and developers.
+To ask a question, report a bug, make a feature request, or to get in touch for
+any reason, please make
+[an Issue](https://github.com/TerrainBento/umami/issues).
+
+# Contribute to umami
+
+All contributions are welcome.

-If you plan to develop with umami, please fork umami, clone the forked repository, and replace `python setup.py install` with `python setup.py develop`. We recommend developing new features on a development branch.

+Contributors and maintainers to this project are expected to abide by the [Contributor Code of Conduct](https://github.com/TerrainBento/umami/blob/master/CODE_OF_CONDUCT.md).
+
+# Cite umami

-# How to cite

+If you use umami in your research, please cite it.

-This repository will be submitted to JOSS.
+A Journal of Open Source Software submission is planned. When it is finalized,
+a link and citation will be found here.
diff --git a/_version.py b/_version.py
deleted file mode 100644
index d544646..0000000
--- a/_version.py
+++ /dev/null
@@ -1,557 +0,0 @@
-
-# This file helps to compute a version number in source trees obtained from
-# git-archive tarball (such as those provided by githubs download-from-tag
-# feature). Distribution tarballs (built by setup.py sdist) and build
-# directories (produced by setup.py build) will contain a much shorter file
-# that just contains the computed version number.
-
-# This file is released into the public domain. Generated by
-# versioneer-0.18 (https://github.com/warner/python-versioneer)
-
-"""Git implementation of _version.py."""
-
-import errno
-import os
-import re
-import subprocess
-import sys
-
-
-def get_keywords():
-    """Get the keywords needed to look up the version information."""
-    # these strings will be replaced by git during git-archive.
-    # setup.py/versioneer.py will grep for the variable names, so they must
-    # each be defined on a line of their own. _version.py will just call
-    # get_keywords().
- git_refnames = "$Format:%d$" - git_full = "$Format:%H$" - git_date = "$Format:%ci$" - keywords = {"refnames": git_refnames, "full": git_full, "date": git_date} - return keywords - - -class VersioneerConfig: - """Container for Versioneer configuration parameters.""" - - -def get_config(): - """Create, populate and return the VersioneerConfig() object.""" - # these strings are filled in when 'setup.py versioneer' creates - # _version.py - cfg = VersioneerConfig() - cfg.VCS = "git" - cfg.style = "pep440" - cfg.tag_prefix = "v" - cfg.parentdir_prefix = "landlab-" - cfg.versionfile_source = "landlab/_version.py" - cfg.verbose = False - return cfg - - -class NotThisMethod(Exception): - """Exception raised if a method is not valid for the current scenario.""" - - -LONG_VERSION_PY = {} -HANDLERS = {} - - -def register_vcs_handler(vcs, method): # decorator - """Create decorator to mark a method as the handler of a VCS.""" - - def decorate(f): - """Store f in HANDLERS[vcs][method].""" - if vcs not in HANDLERS: - HANDLERS[vcs] = {} - HANDLERS[vcs][method] = f - return f - - return decorate - - -def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False, env=None): - """Call the given command(s).""" - assert isinstance(commands, list) - p = None - for c in commands: - try: - dispcmd = str([c] + args) - # remember shell=False, so use git.cmd on windows, not just git - p = subprocess.Popen( - [c] + args, - cwd=cwd, - env=env, - stdout=subprocess.PIPE, - stderr=(subprocess.PIPE if hide_stderr else None), - ) - break - except EnvironmentError: - e = sys.exc_info()[1] - if e.errno == errno.ENOENT: - continue - if verbose: - print("unable to run %s" % dispcmd) - print(e) - return None, None - else: - if verbose: - print("unable to find command, tried %s" % (commands,)) - return None, None - stdout = p.communicate()[0].strip() - if sys.version_info[0] >= 3: - stdout = stdout.decode() - if p.returncode != 0: - if verbose: - print("unable to run %s (error)" % dispcmd) - print("stdout was %s" % stdout) - return None, p.returncode - return stdout, p.returncode - - -def versions_from_parentdir(parentdir_prefix, root, verbose): - """Try to determine the version from the parent directory name. - - Source tarballs conventionally unpack into a directory that includes both - the project name and a version string. We will also support searching up - two directory levels for an appropriately named parent directory - """ - rootdirs = [] - - for i in range(3): - dirname = os.path.basename(root) - if dirname.startswith(parentdir_prefix): - return { - "version": dirname[len(parentdir_prefix) :], - "full-revisionid": None, - "dirty": False, - "error": None, - "date": None, - } - else: - rootdirs.append(root) - root = os.path.dirname(root) # up a level - - if verbose: - print( - "Tried directories %s but none started with prefix %s" - % (str(rootdirs), parentdir_prefix) - ) - raise NotThisMethod("rootdir doesn't start with parentdir_prefix") - - -@register_vcs_handler("git", "get_keywords") -def git_get_keywords(versionfile_abs): - """Extract version information from the given file.""" - # the code embedded in _version.py can just fetch the value of these - # keywords. When used from setup.py, we don't want to import _version.py, - # so we do it with a regexp instead. This function is not used from - # _version.py. 
- keywords = {} - try: - f = open(versionfile_abs, "r") - for line in f.readlines(): - if line.strip().startswith("git_refnames ="): - mo = re.search(r'=\s*"(.*)"', line) - if mo: - keywords["refnames"] = mo.group(1) - if line.strip().startswith("git_full ="): - mo = re.search(r'=\s*"(.*)"', line) - if mo: - keywords["full"] = mo.group(1) - if line.strip().startswith("git_date ="): - mo = re.search(r'=\s*"(.*)"', line) - if mo: - keywords["date"] = mo.group(1) - f.close() - except EnvironmentError: - pass - return keywords - - -@register_vcs_handler("git", "keywords") -def git_versions_from_keywords(keywords, tag_prefix, verbose): - """Get version information from git keywords.""" - if not keywords: - raise NotThisMethod("no keywords at all, weird") - date = keywords.get("date") - if date is not None: - # git-2.2.0 added "%cI", which expands to an ISO-8601 -compliant - # datestamp. However we prefer "%ci" (which expands to an "ISO-8601 - # -like" string, which we must then edit to make compliant), because - # it's been around since git-1.5.3, and it's too difficult to - # discover which version we're using, or to work around using an - # older one. - date = date.strip().replace(" ", "T", 1).replace(" ", "", 1) - refnames = keywords["refnames"].strip() - if refnames.startswith("$Format"): - if verbose: - print("keywords are unexpanded, not using") - raise NotThisMethod("unexpanded keywords, not a git-archive tarball") - refs = set([r.strip() for r in refnames.strip("()").split(",")]) - # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of - # just "foo-1.0". If we see a "tag: " prefix, prefer those. - TAG = "tag: " - tags = set([r[len(TAG) :] for r in refs if r.startswith(TAG)]) - if not tags: - # Either we're using git < 1.8.3, or there really are no tags. We use - # a heuristic: assume all version tags have a digit. The old git %d - # expansion behaves like git log --decorate=short and strips out the - # refs/heads/ and refs/tags/ prefixes that would let us distinguish - # between branches and tags. By ignoring refnames without digits, we - # filter out many common branch names like "release" and - # "stabilization", as well as "HEAD" and "master". - tags = set([r for r in refs if re.search(r"\d", r)]) - if verbose: - print("discarding '%s', no digits" % ",".join(refs - tags)) - if verbose: - print("likely tags: %s" % ",".join(sorted(tags))) - for ref in sorted(tags): - # sorting will prefer e.g. "2.0" over "2.0rc1" - if ref.startswith(tag_prefix): - r = ref[len(tag_prefix) :] - if verbose: - print("picking %s" % r) - return { - "version": r, - "full-revisionid": keywords["full"].strip(), - "dirty": False, - "error": None, - "date": date, - } - # no suitable tags, so version is "0+unknown", but full hex is still there - if verbose: - print("no suitable tags, using unknown + full revision id") - return { - "version": "0+unknown", - "full-revisionid": keywords["full"].strip(), - "dirty": False, - "error": "no suitable tags", - "date": None, - } - - -@register_vcs_handler("git", "pieces_from_vcs") -def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command): - """Get version from 'git describe' in the root of the source tree. - - This only gets called if the git-archive 'subst' keywords were *not* - expanded, and _version.py hasn't already been rewritten with a short - version string, meaning we're inside a checked out source tree. 
- """ - GITS = ["git"] - if sys.platform == "win32": - GITS = ["git.cmd", "git.exe"] - - out, rc = run_command(GITS, ["rev-parse", "--git-dir"], cwd=root, hide_stderr=True) - if rc != 0: - if verbose: - print("Directory %s not under git control" % root) - raise NotThisMethod("'git rev-parse --git-dir' returned error") - - # if there is a tag matching tag_prefix, this yields TAG-NUM-gHEX[-dirty] - # if there isn't one, this yields HEX[-dirty] (no NUM) - describe_out, rc = run_command( - GITS, - [ - "describe", - "--tags", - "--dirty", - "--always", - "--long", - "--match", - "%s*" % tag_prefix, - ], - cwd=root, - ) - # --long was added in git-1.5.5 - if describe_out is None: - raise NotThisMethod("'git describe' failed") - describe_out = describe_out.strip() - full_out, rc = run_command(GITS, ["rev-parse", "HEAD"], cwd=root) - if full_out is None: - raise NotThisMethod("'git rev-parse' failed") - full_out = full_out.strip() - - pieces = {} - pieces["long"] = full_out - pieces["short"] = full_out[:7] # maybe improved later - pieces["error"] = None - - # parse describe_out. It will be like TAG-NUM-gHEX[-dirty] or HEX[-dirty] - # TAG might have hyphens. - git_describe = describe_out - - # look for -dirty suffix - dirty = git_describe.endswith("-dirty") - pieces["dirty"] = dirty - if dirty: - git_describe = git_describe[: git_describe.rindex("-dirty")] - - # now we have TAG-NUM-gHEX or HEX - - if "-" in git_describe: - # TAG-NUM-gHEX - mo = re.search(r"^(.+)-(\d+)-g([0-9a-f]+)$", git_describe) - if not mo: - # unparseable. Maybe git-describe is misbehaving? - pieces["error"] = "unable to parse git-describe output: '%s'" % describe_out - return pieces - - # tag - full_tag = mo.group(1) - if not full_tag.startswith(tag_prefix): - if verbose: - fmt = "tag '%s' doesn't start with prefix '%s'" - print(fmt % (full_tag, tag_prefix)) - pieces["error"] = "tag '%s' doesn't start with prefix '%s'" % ( - full_tag, - tag_prefix, - ) - return pieces - pieces["closest-tag"] = full_tag[len(tag_prefix) :] - - # distance: number of commits since tag - pieces["distance"] = int(mo.group(2)) - - # commit: short hex revision ID - pieces["short"] = mo.group(3) - - else: - # HEX: no tags - pieces["closest-tag"] = None - count_out, rc = run_command(GITS, ["rev-list", "HEAD", "--count"], cwd=root) - pieces["distance"] = int(count_out) # total number of commits - - # commit date: see ISO-8601 comment in git_versions_from_keywords() - date = run_command(GITS, ["show", "-s", "--format=%ci", "HEAD"], cwd=root)[ - 0 - ].strip() - pieces["date"] = date.strip().replace(" ", "T", 1).replace(" ", "", 1) - - return pieces - - -def plus_or_dot(pieces): - """Return a + if we don't already have one, else return a .""" - if "+" in pieces.get("closest-tag", ""): - return "." - return "+" - - -def render_pep440(pieces): - """Build up version string, with post-release "local version identifier". - - Our goal: TAG[+DISTANCE.gHEX[.dirty]] . Note that if you - get a tagged build and then dirty it, you'll get TAG+0.gHEX.dirty - - Exceptions: - 1: no tags. git_describe was just HEX. 
0+untagged.DISTANCE.gHEX[.dirty] - """ - if pieces["closest-tag"]: - rendered = pieces["closest-tag"] - if pieces["distance"] or pieces["dirty"]: - rendered += plus_or_dot(pieces) - rendered += "%d.g%s" % (pieces["distance"], pieces["short"]) - if pieces["dirty"]: - rendered += ".dirty" - else: - # exception #1 - rendered = "0+untagged.%d.g%s" % (pieces["distance"], pieces["short"]) - if pieces["dirty"]: - rendered += ".dirty" - return rendered - - -def render_pep440_pre(pieces): - """TAG[.post.devDISTANCE] -- No -dirty. - - Exceptions: - 1: no tags. 0.post.devDISTANCE - """ - if pieces["closest-tag"]: - rendered = pieces["closest-tag"] - if pieces["distance"]: - rendered += ".post.dev%d" % pieces["distance"] - else: - # exception #1 - rendered = "0.post.dev%d" % pieces["distance"] - return rendered - - -def render_pep440_post(pieces): - """TAG[.postDISTANCE[.dev0]+gHEX] . - - The ".dev0" means dirty. Note that .dev0 sorts backwards - (a dirty tree will appear "older" than the corresponding clean one), - but you shouldn't be releasing software with -dirty anyways. - - Exceptions: - 1: no tags. 0.postDISTANCE[.dev0] - """ - if pieces["closest-tag"]: - rendered = pieces["closest-tag"] - if pieces["distance"] or pieces["dirty"]: - rendered += ".post%d" % pieces["distance"] - if pieces["dirty"]: - rendered += ".dev0" - rendered += plus_or_dot(pieces) - rendered += "g%s" % pieces["short"] - else: - # exception #1 - rendered = "0.post%d" % pieces["distance"] - if pieces["dirty"]: - rendered += ".dev0" - rendered += "+g%s" % pieces["short"] - return rendered - - -def render_pep440_old(pieces): - """TAG[.postDISTANCE[.dev0]] . - - The ".dev0" means dirty. - - Exceptions: - 1: no tags. 0.postDISTANCE[.dev0] - """ - if pieces["closest-tag"]: - rendered = pieces["closest-tag"] - if pieces["distance"] or pieces["dirty"]: - rendered += ".post%d" % pieces["distance"] - if pieces["dirty"]: - rendered += ".dev0" - else: - # exception #1 - rendered = "0.post%d" % pieces["distance"] - if pieces["dirty"]: - rendered += ".dev0" - return rendered - - -def render_git_describe(pieces): - """TAG[-DISTANCE-gHEX][-dirty]. - - Like 'git describe --tags --dirty --always'. - - Exceptions: - 1: no tags. HEX[-dirty] (note: no 'g' prefix) - """ - if pieces["closest-tag"]: - rendered = pieces["closest-tag"] - if pieces["distance"]: - rendered += "-%d-g%s" % (pieces["distance"], pieces["short"]) - else: - # exception #1 - rendered = pieces["short"] - if pieces["dirty"]: - rendered += "-dirty" - return rendered - - -def render_git_describe_long(pieces): - """TAG-DISTANCE-gHEX[-dirty]. - - Like 'git describe --tags --dirty --always -long'. - The distance/hash is unconditional. - - Exceptions: - 1: no tags. 
HEX[-dirty] (note: no 'g' prefix) - """ - if pieces["closest-tag"]: - rendered = pieces["closest-tag"] - rendered += "-%d-g%s" % (pieces["distance"], pieces["short"]) - else: - # exception #1 - rendered = pieces["short"] - if pieces["dirty"]: - rendered += "-dirty" - return rendered - - -def render(pieces, style): - """Render the given version pieces into the requested style.""" - if pieces["error"]: - return { - "version": "unknown", - "full-revisionid": pieces.get("long"), - "dirty": None, - "error": pieces["error"], - "date": None, - } - - if not style or style == "default": - style = "pep440" # the default - - if style == "pep440": - rendered = render_pep440(pieces) - elif style == "pep440-pre": - rendered = render_pep440_pre(pieces) - elif style == "pep440-post": - rendered = render_pep440_post(pieces) - elif style == "pep440-old": - rendered = render_pep440_old(pieces) - elif style == "git-describe": - rendered = render_git_describe(pieces) - elif style == "git-describe-long": - rendered = render_git_describe_long(pieces) - else: - raise ValueError("unknown style '%s'" % style) - - return { - "version": rendered, - "full-revisionid": pieces["long"], - "dirty": pieces["dirty"], - "error": None, - "date": pieces.get("date"), - } - - -def get_versions(): - """Get version information or return default if unable to do so.""" - # I am in _version.py, which lives at ROOT/VERSIONFILE_SOURCE. If we have - # __file__, we can work backwards from there to the root. Some - # py2exe/bbfreeze/non-CPython implementations don't do __file__, in which - # case we can only use expanded keywords. - - cfg = get_config() - verbose = cfg.verbose - - try: - return git_versions_from_keywords(get_keywords(), cfg.tag_prefix, verbose) - except NotThisMethod: - pass - - try: - root = os.path.realpath(__file__) - # versionfile_source is the relative path from the top of the source - # tree (where the .git directory might live) to this file. Invert - # this to find the root from __file__. - for i in cfg.versionfile_source.split("/"): - root = os.path.dirname(root) - except NameError: - return { - "version": "0+unknown", - "full-revisionid": None, - "dirty": None, - "error": "unable to find root of source tree", - "date": None, - } - - try: - pieces = git_pieces_from_vcs(cfg.tag_prefix, root, verbose) - return render(pieces, cfg.style) - except NotThisMethod: - pass - - try: - if cfg.parentdir_prefix: - return versions_from_parentdir(cfg.parentdir_prefix, root, verbose) - except NotThisMethod: - pass - - return { - "version": "0+unknown", - "full-revisionid": None, - "dirty": None, - "error": "unable to compute version", - "date": None, - } diff --git a/appveyor.yml b/appveyor.yml index 220ab12..b8af098 100644 --- a/appveyor.yml +++ b/appveyor.yml @@ -28,7 +28,7 @@ install: - cmd: set PYTHONUNBUFFERED=1 - cmd: conda config --set always_yes yes - cmd: pip install pytest - - cmd: pip install jupyter pandas plotnine holoviews terrainbento + - cmd: pip install jupyter pandas plotnine holoviews terrainbento tqdm - cmd: conda info - cmd: conda list diff --git a/binder/postBuild b/binder/postBuild index 03cd060..a6176cb 100644 --- a/binder/postBuild +++ b/binder/postBuild @@ -1,9 +1,13 @@ #!/bin/bash -pip install jupyter -pip install holoviews -pip install pandas -pip install plotnine>=0.6.0 -pip install terrainbento>=1.1 +conda config --add channels conda-forge +conda config --set channel_priority strict -pip install -e . 
+conda install jupyter
+conda install holoviews
+conda install pandas
+conda install "plotnine>=0.6.0"
+conda install "terrainbento>=1.10.1"
+conda install tqdm
+
+python setup.py install
diff --git a/docs/source/development_practices.rst b/docs/source/development_practices.rst
index 619ef73..a17d586 100644
--- a/docs/source/development_practices.rst
+++ b/docs/source/development_practices.rst
@@ -1,16 +1,76 @@
+.. _development_practices:
+
Development Practices
=====================

+All contributions to umami are welcome. If you are considering developing with
+umami, we recommend you read this page. All developers and maintainers are
+expected to abide by the `Code of Conduct`_.
+
+.. _Code of Conduct: https://github.com/TerrainBento/umami/blob/master/CODE_OF_CONDUCT.md
+
+Contribution Guidelines
+-----------------------
+
+To contribute to umami, please fork the repository and make modifications on a
+development branch. If you have any questions about how to get started,
+`make an issue on GitHub`_.
+
+Umami strives for meaningful 100% code coverage through docstring and unit
+tests, adherence to `PEP8`_, low lint levels using `Flake8`_, and
+`Black style`_. Many of these standards are enforced by continuous integration
+tests, and the development team will help you bring contributions into
+compliance.
+
+.. _make an issue on GitHub: https://github.com/TerrainBento/umami/issues
+.. _PEP8: https://www.python.org/dev/peps/pep-0008/
+.. _Black style: https://black.readthedocs.io/en/stable/
+.. _Flake8: http://flake8.pycqa.org/en/latest/
+
+Developer Install Instructions
+------------------------------
+
+After forking the repository, install umami using a conda environment::
+
+    git clone https://github.com/<your user name>/umami.git
+    cd umami
+    conda env create -f environment-dev.yml
+    conda activate umami-dev
+    python setup.py develop
+
+You will need to run ``conda activate umami-dev`` each time you start a new
+terminal session.
+
+When using this conda environment, you will have all of the tools you need to
+format files and run the tests.
+
+Unit and Docstring Tests
+------------------------
+
+Umami uses `pytest`_ to discover and run tests of the codebase. We have two
+types of tests that serve two complementary purposes. Docstring tests are
+examples within the code that provide context in the API documentation. Unit
+tests live in the ``tests`` folder and provide additional tests that are
+extraneous to the API documentation but important for ensuring that umami is
+fully tested. The tests use small examples so that known answers can be used.
+
+.. _pytest: https://pytest.org/en/latest/
+
+To run the tests, navigate to the top level ``umami`` directory and run::
+
+    pytest

-Code style: pep8, Black, isort
-make pretty

+You can also assess how well the tests cover the code base::
+
+    pytest umami tests/ --doctest-modules --cov=umami --cov-report term-missing -vvv

-Delint: Flake8
-make lint

+Convenience Functions
+---------------------

-Testing
-meaningful 100% tests
-pytest

+There are a few convenience commands available through the Makefile to assist
+with formatting and checking for lint. From the top level of the umami source
+code directory you can type ``make pretty`` to format all the files. You can
+type ``make lint`` to check for lint. Finally, the command ``make docs`` will
+make the docs with `sphinx`_ and open the local copy in a web browser.

-Test
+.. 
_sphinx: https://www.sphinx-doc.org/en/master/ diff --git a/docs/source/index.rst b/docs/source/index.rst index 64b8a47..792cc47 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -4,55 +4,84 @@ Welcome to the Umami documentation! Introduction and Purpose ------------------------ + Umami is a package for calculating objective functions or objective function -components for landscape evolution modeling. TODO: add more purpose here. +components for landscape evolution modeling. It was designed to work well with +`terrainbento`_ and other models built with the `Landlab Toolkit`_. Examples +can be found in the notebooks on Binder (or in the ``notebooks`` directory). + +Umami offers two primary classes: a :py:class:`Residual`, +which represents the difference between model and data, and a +:py:class:`Metric`, +which is a calculated value on either model or data. The set of currently +supported calculations can be found in the +:py:mod:`umami.calculations` +submodule. + +Umami was designed to provide an input-file based interface for calculating +single-value landscape metrics for use in model analysis. This supports +reproducible analysis and systematic variation in metric construction. When +used with ``terrainbento``, one input file can describe the model run, and +another input file can describe the model assessment or model-data comparison. +This streamlines model analysis applications. Umami also provides multiple +output formats (YAML and Dakota), the latter of which is designed to interface +with Sandia National Laboratory's `Dakota package`_. + +The source code is housed on `GitHub`_. -Umami offers two primary classes a :py:class:`Residual` which represents the -difference between model and data, and :py:class:`Metric` which is a calculated -value on either model or data. A set of currently supported calculations are -found in the :py:mod:`umami.calculations` submodule. +.. _Landlab Toolkit: https://landlab.github.io +.. _terrainbento: https://terrainbento.readthedocs.io/en/latest/ +.. _GitHub: https://github.com/TerrainBento/umami +.. _Dakota package: https://dakota.sandia.gov -Umami is built on top of the `Landlab Toolkit`_ and designed to work well with -`terrainbento`_. +Getting Started +--------------- -TODO: put a link to the source code here. +Check out the example notebooks `housed on Binder`_ to get a sense of how umami +can be used. -.. _Landlab Toolkit: https://landlab.github.io -.. _terrainbento: https://terrainbento.readthedocs.io/en/latest/ +.. _housed on Binder: https://mybinder.org/v2/gh/TerrainBento/umami/master?filepath=notebooks%2FWelcome.ipynb -Organization of Software ------------------------- -.. toctree:: - :maxdepth: 1 +Get Help +-------- - umami.residual - umami.metric +To get help or ask questions, please make an `issue on GitHub`_. -.. toctree:: - :maxdepth: 1 +.. _issue on GitHub: https://github.com/TerrainBento/umami/issues - umami.calculations +Installation Instructions +------------------------- -Example Usage -------------- +Installation is described on `this`_ section of the README. + +.. _this: https://github.com/TerrainBento/umami#where-to-get-it + +Contributing +------------ + +All contributions are welcome. Make an `issue on GitHub`_ to file a bug report. + +If you are planning to contribute to the source code, please read the following +page for more information. .. toctree:: :maxdepth: 1 - umami.example_usage + development_practices -Get Help --------- -TODO +API Documentation +----------------- +.. 
toctree:: + :maxdepth: 1 -Installation Instructions -------------------------- -TODO + umami.residual + umami.metric -Contributing ------------- -TODO +.. toctree:: + :maxdepth: 1 + + umami.calculations Indices and tables ------------------ diff --git a/docs/source/umami.calculations.residual.discretized_misfit.rst b/docs/source/umami.calculations.residual.discretized_misfit.rst index 8480787..3f46390 100644 --- a/docs/source/umami.calculations.residual.discretized_misfit.rst +++ b/docs/source/umami.calculations.residual.discretized_misfit.rst @@ -1,5 +1,5 @@ -discretized_misfit: cal ------------------------------------------------------- +discretized_misfit: calculate a misfit discretized to patches of the topography +------------------------------------------------------------------------------- .. automodule:: umami.calculations.residual.discretized_misfit :members: diff --git a/docs/source/umami.calculations.rst b/docs/source/umami.calculations.rst index 399e4f1..c00fdfc 100644 --- a/docs/source/umami.calculations.rst +++ b/docs/source/umami.calculations.rst @@ -3,8 +3,6 @@ Calculations ============ -TODO - Calculations for Metrics or Residuals ------------------------------------- @@ -16,7 +14,6 @@ Calculations for Metrics or Residuals umami.calculations.metric.hypsometric_integral umami.calculations.metric.watershed_aggregation - Calculations for Residuals only ------------------------------- diff --git a/docs/source/umami.example_usage.rst b/docs/source/umami.example_usage.rst deleted file mode 100644 index 37c0391..0000000 --- a/docs/source/umami.example_usage.rst +++ /dev/null @@ -1,2 +0,0 @@ -Example Usage -============= diff --git a/docs/source/umami.metric.rst b/docs/source/umami.metric.rst index 64b1f4f..00d9914 100644 --- a/docs/source/umami.metric.rst +++ b/docs/source/umami.metric.rst @@ -1,3 +1,5 @@ +.. py:class:: Metric + Metric ====== diff --git a/environment-dev.yml b/environment-dev.yml index 6c2d3c2..f248652 100644 --- a/environment-dev.yml +++ b/environment-dev.yml @@ -12,10 +12,11 @@ dependencies: # for the package - scipy - numpy - - landlab>=1.10 + - landlab>=1.10.1 # for the notebooks - jupyter - holoviews - pandas + - tqdm - plotnine>=0.6.0 - terrainbento>=1.1 diff --git a/joss/paper.bib b/joss/paper.bib index 6efd402..83502a4 100644 --- a/joss/paper.bib +++ b/joss/paper.bib @@ -1 +1,101 @@ -# Bibliography for paper.md +@Article{hobley2017creative, + AUTHOR = {Hobley, D. E. J. and Adams, J. M. and Nudurupati, S. S. and Hutton, E. W. H. and Gasparini, N. M. and Istanbulluoglu, E. and Tucker, G. E.}, + TITLE = {Creative computing with Landlab: an open-source toolkit for building, coupling, and exploring two-dimensional numerical models of Earth-surface dynamics}, + JOURNAL = {Earth Surface Dynamics}, + VOLUME = {5}, + YEAR = {2017}, + NUMBER = {1}, + PAGES = {21--46}, + URL = {https://www.earth-surf-dynam.net/5/21/2017/}, + DOI = {10.5194/esurf-5-21-2017} + } + +@Article{barnhart2019terrain, + AUTHOR = {Barnhart, K. R. and Glade, R. C. and Shobe, C. M. and Tucker, G. E.}, + TITLE = {Terrainbento 1.0: a Python package for multi-model analysis in long-term drainage basin evolution}, + JOURNAL = {Geoscientific Model Development}, + VOLUME = {12}, + YEAR = {2019}, + NUMBER = {4}, + PAGES = {1267--1297}, + URL = {https://www.geosci-model-dev.net/12/1267/2019/}, + DOI = {10.5194/gmd-12-1267-2019} + } + +@Article{adams2017dakotatheory, + address = {Albuquerque, NM}, + author = {B.M. Adams and L.E. Bauman and W.J. Bohnhoff and K.R. + Dalbey and M.S. Ebeida and J.P. 
Eddy and M.S. Eldred and + P.D. Hough and K.T. Hu and J.D. Jakeman and J.A. Stephens + and L.P. Swiler and D.M. Vigil and T.M. Wildey}, + journal = {Sandia Technical Report SAND2014-4253}, + title = {Dakota, A Multilevel Parallel Object-Oriented Framework + for Design Optimization, Parameter Estimation, Uncertainty + Quantification, and Sensitivity Analysis: Version 6.6 + Theory Manual}, + year = {2017} + } + + +@Article{forte2019tak, + AUTHOR = {Forte, A. M. and Whipple, K. X.}, + TITLE = {Short communication: The Topographic Analysis Kit (TAK) for TopoToolbox}, + JOURNAL = {Earth Surface Dynamics}, + VOLUME = {7}, + YEAR = {2019}, + NUMBER = {1}, + PAGES = {87--95}, + URL = {https://www.earth-surf-dynam.net/7/87/2019/}, + DOI = {10.5194/esurf-7-87-2019} + } + +@Article{club2017geomorphometric, + AUTHOR = {Clubb, F. J. and Mudd, S. M. and Milodowski, D. T. and Valters, D. A. and Slater, L. J. and Hurst, M. D. and Limaye, A. B.}, + TITLE = {Geomorphometric delineation of floodplains and terraces from objectively defined topographic thresholds}, + JOURNAL = {Earth Surface Dynamics}, + VOLUME = {5}, + YEAR = {2017}, + NUMBER = {3}, + PAGES = {369--385}, + URL = {https://www.earth-surf-dynam.net/5/369/2017/}, + DOI = {10.5194/esurf-5-369-2017} + } + +@article{mudd2014statistical, + author = {Mudd, Simon M. and Attal, Mikaƫl and Milodowski, David T. and Grieve, Stuart W. D. and Valters, Declan A.}, + title = {A statistical framework to quantify spatial variation in channel gradients using the integral method of channel profile analysis}, + journal = {Journal of Geophysical Research: Earth Surface}, + volume = {119}, + number = {2}, + pages = {138-152}, + keywords = {bedrock channels, topographic analysis, landscape evolution}, + doi = {10.1002/2013JF002981}, + url = {https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1002/2013JF002981}, + year = {2014} + } + + +@Article{schwanghart2014topo, + AUTHOR = {Schwanghart, W. and Scherler, D.}, + TITLE = {Short Communication: TopoToolbox 2 - a MATLAB-based software for topographic analysis and modeling in Earth surface sciences}, + JOURNAL = {Earth Surface Dynamics}, + VOLUME = {2}, + YEAR = {2014}, + NUMBER = {1}, + PAGES = {1--7}, + URL = {https://www.earth-surf-dynam.net/2/1/2014/}, + DOI = {10.5194/esurf-2-1-2014} + } + + +@article{schwanghart2010topo, + title = "TopoToolbox: A set of Matlab functions for topographic analysis", + journal = "Environmental Modelling & Software", + volume = "25", + number = "6", + pages = "770 - 781", + year = "2010", + issn = "1364-8152", + doi = "https://doi.org/10.1016/j.envsoft.2009.12.002", + url = "http://www.sciencedirect.com/science/article/pii/S1364815209003053" + } diff --git a/joss/paper.md b/joss/paper.md index dec0b70..e218092 100644 --- a/joss/paper.md +++ b/joss/paper.md @@ -11,33 +11,53 @@ tags: - model analysis - model-data comparison - objective function - authors: - - name: Katherine R. Barnhart - orcid: 0000-0001-5682-455X - affiliation: 1, 2 - - name: Eric Hutton - orcid: 0000-0002-5864-6459 - affiliation: 3, 4 - - name: Gregory E. 
Tucker
-    orcid: 0000-0003-0364-5800
-    affiliation: 1, 2, 3
-  affiliations:
-    - name: University of Colorado at Boulder, Department of Geological Sciences
-      index: 1
-    - name: University of Colorado at Boulder, Cooperative Institute for Research in Environmental Sciences
-      index: 2
-    - name: University of Colorado at Boulder, Community Surface Dynamics Modeling System Integration Facility
-      index: 3
-    - name: University of Colorado at Boulder, Institute for Arctic and Alpine Research
-      index: 4
-  date: 19 August 2019
-  bibliography: papers.bib
-  ---
+authors:
+  - name: Katherine R. Barnhart
+    orcid: 0000-0001-5682-455X
+    affiliation: 1, 2
+  - name: Eric Hutton
+    orcid: 0000-0002-5864-6459
+    affiliation: 3, 4
+  - name: Gregory E. Tucker
+    orcid: 0000-0003-0364-5800
+    affiliation: 1, 2, 3
+affiliations:
+  - name: University of Colorado at Boulder, Department of Geological Sciences
+    index: 1
+  - name: University of Colorado at Boulder, Cooperative Institute for Research in Environmental Sciences
+    index: 2
+  - name: University of Colorado at Boulder, Community Surface Dynamics Modeling System Integration Facility
+    index: 3
+  - name: University of Colorado at Boulder, Institute for Arctic and Alpine Research
+    index: 4
+date: 4 September 2019
+bibliography: paper.bib
+---

# Summary

+Models of Earth's surface dynamics are typically designed to simulate timescales that range from years to geologic epochs ($10^6$+ years). They represent and evolve a primary state variable, Earth's surface. Some applications may use other state variables (e.g., soil thickness). A diverse array of physical and chemical processes may be present: for example, the making and moving of water; the creation of soil and sediment from bedrock; the transport of mobile material due to hillslope processes, river erosion, and landslides; and the deposition of that material into the geologic archive. These models are used for applications as diverse as understanding the limit to the height of mountain ranges and predicting the erosion, transport, and fate of contaminated material on human timescales.
+
+A diversity of data is used to compare models with observations. These data serve as "simulated equivalents", quantities with which to assess model performance. A common observable compared with model output is topography. High-resolution topography, occasionally available at two points in time, provides a dataset rich in the spatial dimension and sparse in the temporal. However, because the initial condition topography of a model is often not known, we do not expect modeled and observed topography to line up exactly. Thus, statistical derivatives of topography are often used. Other data sources (e.g., cosmogenic radionuclide derived erosion rates) provide time- and space-integrated measures.
+
+There is, however, no clear consensus regarding which set of simulated equivalents is most appropriate or most successful for assessing the performance of Earth surface dynamics models. This presents a challenge when this type of model is used in formal model analysis efforts. For example, calibration requires a formal objective function. Under these circumstances, there is a need for generic tools that facilitate exploration and usage of many plausibly successful simulated equivalents. This is the need that the umami package was designed to address.
+
+# Description of umami
+
+Umami is a package for calculating objective functions or objective function components for Earth surface dynamics modeling. It was designed to work well with models built with the Landlab Toolkit [@hobley2017creative] or with the `terrainbento` multi-model analysis package [@barnhart2019terrain]. Umami is complementary to existing topographic analysis tools such as LSDTopoTools [@mudd2014statistical; @club2017geomorphometric], TopoToolbox [@schwanghart2010topo; @schwanghart2014topo], and the Topographic Analysis Kit [@forte2019tak]. Rather than performing topographic analysis, umami is used to distill model output into a form usable by model analysis methods such as sensitivity analysis, calibration, and validation.
+
+Umami offers two primary classes: a [`Residual`](https://umami.readthedocs.io/en/latest/umami.residual.html#Residual), which represents the difference between model and data, and a [`Metric`](https://umami.readthedocs.io/en/latest/umami.metric.html), which is a calculated value on either model or data. The set of currently supported calculations is found in the [`umami.calculations`](https://umami.readthedocs.io/en/latest/umami.calculations.html) submodule. Both the `Metric` and `Residual` classes are designed to be fully specified through a YAML-style input file or Python dictionary interface. Many different calculations can be accomplished through parameter specification. This supports reproducible analysis and systematic variation in metric construction. For example, when used with `terrainbento`, one input file can describe the model run and one input file can describe the model assessment or model-data comparison. This streamlines model analysis applications by making driver files more reusable and by placing the code that accomplishes the calculations in the umami package rather than within the driver file. Umami also provides multiple output formats (YAML and Dakota), the latter of which is designed to interface with Sandia National Laboratory's Dakota package [@adams2017dakotatheory].
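+
+As a brief illustration of this interface, the following sketch builds two small synthetic Landlab grids standing in for the model and the data, specifies a single named calculation in the input-file format, and evaluates the residual. The grids and the `aggregate` calculation shown here are illustrative placeholders; the full set of supported calculations and their parameters is described in the [`umami.calculations`](https://umami.readthedocs.io/en/latest/umami.calculations.html) documentation:
+
+```python
+from io import StringIO
+
+from landlab import RasterModelGrid
+
+from umami import Residual
+
+# Two small synthetic grids stand in for the model and the data.
+model = RasterModelGrid((10, 10))
+model.add_field("node", "topographic__elevation", model.x_of_node + model.y_of_node)
+
+data = RasterModelGrid((10, 10))
+data.add_field("node", "topographic__elevation", 1.1 * (data.x_of_node + data.y_of_node))
+
+# One named calculation, written in the YAML-style input-file format.
+residual = Residual(model, data)
+residual.add_from_file(StringIO("""
+me:
+    _func: aggregate
+    method: mean
+    field: topographic__elevation
+"""))
+residual.calculate()
+
+# One value is produced per named calculation.
+for name in residual.names:
+    print(name, residual.value(name))
+```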
+The novel contribution of the umami package is not primarily found in the specific calculations accomplished (some are as straightforward as the mean of a state variable). Instead, it is the flexible and extensible nature of the input file format and the `Metric` and `Residual` classes. Additionally, the package can be extended through the addition of new calculation methods.
+
+Two model-data comparison metrics in umami are novel. First, the `joint_density_misfit` provides a misfit metric of the joint density of two state variables. This comparison measure was inspired by the use of the channel $\chi$ index value [@perron2013integral] and topographic elevation to assess river channel long profiles ($\chi$-z plots). While it is not clear how best to quantify the difference between a modeled and an observed $\chi$-z plot, it is straightforward to calculate the sum of squared residuals between the joint density of $\chi$ and topographic elevation for a modeled and an observed landscape. In umami, this comparison is generalized to the joint density of any two state variables.
+
+Second, the `discretized_misfit` calculation seeks to reduce the dimension of a state variable misfit. In some applications it is appropriate to make a direct comparison between a measured and a modeled state variable (e.g., the difference between measured and modeled topography). However, if an application uses a model domain with tens of thousands of grid cells, a user is faced with a challenging choice: either treat each residual as an individual observation or reduce all residuals to a single value through a sum of squared residuals (or other norm). 
The `discretized_misfit` calculation permits the identification of consistent "categories" of grid cells based on other grid fields. A sum of squared residuals is then calculated within each of these categories. # Acknowledgements +Support for this work was provided by an NSF EAR Postdoctoral Fellowship to Barnhart (1725774). Funding for Landlab was provided by NSF (1147454, 1450409). Landlab is additionally supported by the Community Surface Dynamics Modeling System (1226297, 1831623). # References diff --git a/notebooks/Discretized Residual with Umami.ipynb b/notebooks/Discretized Residual with Umami.ipynb deleted file mode 100644 index 879f1a9..0000000 --- a/notebooks/Discretized Residual with Umami.ipynb +++ /dev/null @@ -1,289 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import warnings\n", - "warnings.filterwarnings('ignore')\n", - "\n", - "from io import StringIO\n", - "from itertools import product\n", - "\n", - "import numpy as np\n", - "\n", - "import pandas as pd\n", - "\n", - "import matplotlib\n", - "import matplotlib.pyplot as plt\n", - "%matplotlib inline\n", - "\n", - "from plotnine import *\n", - "\n", - "import holoviews as hv\n", - "hv.notebook_extension('matplotlib')\n", - "\n", - "from landlab import imshow_grid\n", - "from terrainbento import Basic\n", - "from umami import Metric, Residual" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "spec_string = \"\"\"\n", - "# Create the Clock.\n", - "clock:\n", - " start: 0\n", - " step: 500\n", - " stop: {duration}\n", - "\n", - "# Create the Grid\n", - "grid: \n", - " RasterModelGrid: \n", - " - [100, 120]\n", - " - xy_spacing: 50\n", - " - fields: \n", - " node: \n", - " topographic__elevation:\n", - " random:\n", - " where: CORE_NODE\n", - "\n", - " \n", - "# Set up Boundary Handlers\n", - "boundary_handlers: \n", - " NotCoreNodeBaselevelHandler: \n", - " modify_core_nodes: True\n", - " lowering_rate: -{lowering_rate}\n", - "\n", - "# Parameters that control output.\n", - "output_interval: 1e3\n", - "save_first_timestep: True\n", - "output_prefix: \n", - " disc_resid.{name}.\n", - "fields: \n", - " - topographic__elevation\n", - "\n", - "# Parameters that control process and rates.\n", - "water_erodibility: {water_erodibility}\n", - "m_sp: 0.5\n", - "n_sp: 1.0\n", - "regolith_transport_parameter: 0.1\n", - "\"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "truth_duration = 3e4\n", - "truth_water_erodibility = 0.0005\n", - "\n", - "lowering_rate = 100/truth_duration\n", - "\n", - "truth_params = StringIO(\n", - " spec_string.format(duration=truth_duration,\n", - " water_erodibility=truth_water_erodibility,\n", - " lowering_rate=lowering_rate, \n", - " name=\"truth\"))\n", - "np.random.seed(42)\n", - "truth = Basic.from_file(truth_params)\n", - "truth.run()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ds = truth.to_xarray_dataset(time_unit='years', space_unit='meters')\n", - "hvds_topo = hv.Dataset(ds.topographic__elevation)\n", - "topo = hvds_topo.to(hv.Image, ['x', 'y'],\n", - " label='Truth').options(interpolation='bilinear',\n", - " cmap='viridis',\n", - " colorbar=True)\n", - "topo" - ] - }, - { - "cell_type": "code", - 
"execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ds.close()\n", - "truth.remove_output_netcdfs()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "residual_string = \"\"\"\n", - "dm:\n", - " _func: discretized_misfit\n", - " name: chi_{field_1_level}.z_{field_2_level}\n", - " misfit_field: topographic__elevation\n", - " field_1: channel__chi_index\n", - " field_2: topographic__elevation\n", - " field_1_percentile_edges:\n", - " - 0\n", - " - 30\n", - " - 60\n", - " - 100\n", - " field_2_percentile_edges:\n", - " - 0\n", - " - 50\n", - " - 60\n", - " - 100\n", - "\"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "resolution = 10\n", - "durations = np.logspace(3, 5, num=resolution)\n", - "water_erodibilitys = np.logspace(-4, -2, num=resolution)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "out = {}\n", - "for i, (duration, water_erodibility) in enumerate(product(durations, water_erodibilitys)):\n", - " lowering_rate = 100/duration\n", - " test_params = StringIO(\n", - " spec_string.format(duration=duration, \n", - " water_erodibility=water_erodibility, \n", - " lowering_rate=lowering_rate, \n", - " name=i))\n", - " #np.random.seed(42)\n", - " test = Basic.from_file(test_params)\n", - " test.run()\n", - "\n", - " test.remove_output_netcdfs()\n", - " \n", - " residual = Residual(test.grid, truth.grid, chi_finder_kwds={\"min_drainage_area\": 1000})\n", - " residual.add_from_file(StringIO(residual_string))\n", - " residual.calculate()\n", - "\n", - " values = {name: residual.value(name) for name in residual.names}\n", - " out[(duration, water_erodibility)] = values" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "imshow_grid(truth.grid, residual.category, cmap=\"Dark2\", limits=(1,9))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df = pd.DataFrame.from_dict(out, orient=\"index\")\n", - "df.index.names = [\"duration\", \"water_erodibility\"]\n", - "df.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df_melt = df.reset_index().melt(id_vars=[\"duration\", \"water_erodibility\"])\n", - "df_melt[\"squared_residual\"] = df_melt.value**2\n", - "df_melt.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "variable_split = df_melt[\"variable\"].str.split(\".\", n = 1, expand = True)\n", - "df_melt[\"chi\"] = variable_split[0]\n", - "df_melt[\"elev\"] = variable_split[1]\n", - "df_melt.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "p1 = (ggplot(df_melt,\n", - " aes(x=\"duration\", y=\"water_erodibility\", fill=\"squared_residual\")) +\n", - " geom_tile() + \n", - " geom_point(aes(x=truth_duration, y=truth_water_erodibility)) +\n", - " scale_fill_continuous(limits=[1,1000], trans=\"log10\") +\n", - " facet_grid(\"elev~chi\") + \n", - " theme_bw() + \n", - " scale_x_log10() +\n", - " scale_y_log10() +\n", - " coord_equal())\n", - "print(p1)\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", 
- "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.4" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/notebooks/DiscretizedMisfit.ipynb b/notebooks/DiscretizedMisfit.ipynb new file mode 100644 index 0000000..95bc436 --- /dev/null +++ b/notebooks/DiscretizedMisfit.ipynb @@ -0,0 +1,485 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Part 4: Application using the Discretized Misfit calculation\n", + "\n", + "\n", + "By now, you should have looked through [Part 1](IntroductionToMetric.ipynb) and [Part 2](IntroductionToResidual.ipynb) of the introductory notebook series. These introduced the umami `Metric` and `Residual` classes. You should have also worked through [Part 3](ExampleApplication.ipynb), which provides an example application of umami. \n", + "\n", + "## Scope\n", + "\n", + "Similar to [Part 3](ExampleApplication.ipynb), this application will use umami alongside the [terrainbento](https://terrainbento.readthedocs.io/en/latest/) package. As in the prior example, we will define a \"synthetic truth\" model evaluation with a specific set of input parameters, and then do a grid search letting some of those parameters vary. In this way we will explore which statistics for model-data comparison do best at identifying the \"true\" parameters. \n", + "\n", + "This application focuses on the least intuitive of the umami calculations: the [`discretized_misfit`](https://umami.readthedocs.io/en/latest/umami.calculations.residual.discretized_misfit.html). \n", + "\n", + "If you have comments or questions about the notebooks, the best place to get help is through [GitHub Issues](https://github.com/TerrainBento/umami/issues).\n", + "\n", + "To begin, we import necessary modules. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import warnings\n", + "warnings.filterwarnings('ignore')\n", + "\n", + "from io import StringIO\n", + "from itertools import product\n", + "from tqdm import tqdm\n", + "\n", + "import numpy as np\n", + "\n", + "import pandas as pd\n", + "\n", + "import matplotlib\n", + "import matplotlib.pyplot as plt\n", + "%matplotlib inline\n", + "\n", + "from plotnine import *\n", + "\n", + "import holoviews as hv\n", + "hv.notebook_extension('matplotlib')\n", + "\n", + "from landlab import imshow_grid\n", + "from terrainbento import Basic\n", + "from umami import Metric, Residual" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We begin by defining an input string that defines the terrainbento model.\n", + "\n", + "It evolves topography using stream power and linear diffusion. In our application it has a boundary condition of uniform uplift across the core of the model grid. It thus has the following governing equation:\n", + "\n", + "$\\frac{\\partial \\eta}{\\partial t} = U - KQ^{1/2}S + D\\nabla^2 \\eta$\n", + "\n", + "where $K$ and $D$ are constants, $Q$ is discharge, $S$ is local slope, $U$ is the uplift rate, and $\\eta$ is the topography. See the [model Basic documentation](https://terrainbento.readthedocs.io/en/latest/source/terrainbento.derived_models.model_basic.html) for additional information. 
\n", + "\n", + "In this input file we also indicate that the model will run with timesteps of 500 yr and the model grid will have a shape of (50, 80), with grid cell spacing of 100 m. For this application, the model initial condition is for core nodes to have random noise added.\n", + "\n", + "A few places in the input file have curly braces around a name. \n", + "* Two inputs parameters have curly brackets around them: `{duration}` and `{water_erodibility}`. These inputs will be modified using [`str.format`](https://docs.python.org/3/library/stdtypes.html#str.format) to set the \"truth\" model run and to vary the parameters in a grid search numerical experiment. \n", + "* We also format the `{name}` of output files in order to prevent Windows file permissions errors. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "spec_string = \"\"\"\n", + "# Create the Clock.\n", + "clock:\n", + " start: 0\n", + " step: 500\n", + " stop: {duration}\n", + "\n", + "# Create the Grid\n", + "grid: \n", + " RasterModelGrid: \n", + " - [100, 120]\n", + " - xy_spacing: 50\n", + " - fields: \n", + " node: \n", + " topographic__elevation:\n", + " random:\n", + " where: CORE_NODE\n", + "\n", + " \n", + "# Set up Boundary Handlers\n", + "boundary_handlers: \n", + " NotCoreNodeBaselevelHandler: \n", + " modify_core_nodes: True\n", + " lowering_rate: -{lowering_rate}\n", + "\n", + "# Parameters that control output.\n", + "output_interval: 1e3\n", + "save_first_timestep: True\n", + "output_prefix: \n", + " disc_resid.{name}.\n", + "fields: \n", + " - topographic__elevation\n", + "\n", + "# Parameters that control process and rates.\n", + "water_erodibility: {water_erodibility}\n", + "m_sp: 0.5\n", + "n_sp: 1.0\n", + "regolith_transport_parameter: 0.1\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we instantiate the \"truth\" model and run it. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "truth_duration = 3e4\n", + "truth_water_erodibility = 0.0005\n", + "\n", + "lowering_rate = 100 / truth_duration\n", + "\n", + "truth_params = StringIO(\n", + " spec_string.format(duration=truth_duration,\n", + " water_erodibility=truth_water_erodibility,\n", + " lowering_rate=lowering_rate,\n", + " name=\"truth\"))\n", + "np.random.seed(42)\n", + "truth = Basic.from_file(truth_params)\n", + "truth.run()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The [holoviews](https://holoviews.org) package provides capabilities to visualize the model run. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ds = truth.to_xarray_dataset(time_unit='years', space_unit='meters')\n", + "hvds_topo = hv.Dataset(ds.topographic__elevation)\n", + "topo = hvds_topo.to(hv.Image, ['x', 'y'],\n", + " label='Truth').options(interpolation='bilinear',\n", + " cmap='viridis',\n", + " colorbar=True)\n", + "topo" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As the center nodes are uplifted, a series of ridges and drainage basins form. This model has not yet reached \"topographic steady state\", in which $\\frac{\\partial \\eta}{\\partial t}>\\epsilon$ (and $\\epsilon$ is small) everywhere. \n", + "\n", + "Before moving on, we close the xarray dataset and remove the output netcdf files. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ds.close()\n", + "truth.remove_output_netcdfs()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 2: Define the basis for model-data comparison\n", + "\n", + "The discretized_misfit takes the following parameters:\n", + "\n", + " Parameters\n", + " ----------\n", + " model_grid : Landlab model grid\n", + " data_grid : Landlab model grid\n", + " name : str\n", + " misfit_field : str\n", + " field_1 : str\n", + " field_2 : str\n", + " field_1_percentile_edges : list\n", + " field_2_percentile_edges : list\n", + " \n", + "This calculation first classifies each grid cell in the landscape into categories based on `field_1`, `field_2` and the percentile edges for each (using the data grid). This results in a set of categories, which may or may not be contiguous in space. \n", + "\n", + "For each category, the sum of squared residuals is calculated based on the misfit_field.\n", + "\n", + "Since this calculation returns one value for each category, rather than one value in total, a `name` must be provided. This is a string that will be formatted with the values for `{field_1_level}` and `{field_2_level}`. The output is an ordered dictionary with name as the keys, and the sum of squares misfit as the values.\n", + "\n", + "The following is the input file (as string) needed to specify a `discretized_misfit` in which the domain is discretized based on `channel__chi_index` (three percentile levels defined by `[0, 30, 60, 100]`), and `topographic__elevation` (two percentile levels defined by `[0, 50, 100]`). \n", + "\n", + "Within each of the six category domains, a misfit is made based on the field `topographic__elevation`. \n", + "\n", + "Below we will show a plot indicating where each category is located, but first we will specify the numerical experiment." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "residual_string = \"\"\"\n", + "dm:\n", + " _func: discretized_misfit\n", + " name: chi_{field_1_level}.z_{field_2_level}\n", + " misfit_field: topographic__elevation\n", + " field_1: channel__chi_index\n", + " field_2: topographic__elevation\n", + " field_1_percentile_edges:\n", + " - 0\n", + " - 30\n", + " - 60\n", + " - 100\n", + " field_2_percentile_edges:\n", + " - 0\n", + " - 50\n", + " - 100\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 3: Create and run the grid search experiment\n", + "\n", + "We will use a grid search to highlight how the misfit values in the `discretized_residual` calculated by umami vary across parameter space. \n", + "\n", + "We consider values for `duration` between $10^{3}$ and $10^{5}$ and values for $K$ (`water_erodibility`) between $10^{-4}$ and $10^{-2}$.\n", + "\n", + "With a resolution of 10, we evaluate $10^2=100$ simulations. Feel free to change the resolution value, though note that it will impact the run time of this notebook. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "resolution = 10\n", + "durations = np.logspace(3, 5, num=resolution)\n", + "water_erodibilitys = np.logspace(-4, -2, num=resolution)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We evaluate each pair of duration and water erodability and save the model output as a dictionary. 
With the line \n", + "\n", + " #np.random.seed(42)\n", + "\n", + "commented out, each evaluation uses a different random seed. Feel free to uncomment this line to see how the results change if the *exact same* random seed is used for each model integration. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "out = {}\n", + "for i, (duration,\n", + " water_erodibility) in enumerate(tqdm(product(durations,\n", + " water_erodibilitys))):\n", + " lowering_rate = 100 / duration\n", + " test_params = StringIO(\n", + " spec_string.format(duration=duration,\n", + " water_erodibility=water_erodibility,\n", + " lowering_rate=lowering_rate,\n", + " name=i))\n", + " #np.random.seed(42)\n", + " test = Basic.from_file(test_params)\n", + " test.run()\n", + "\n", + " test.remove_output_netcdfs()\n", + "\n", + " residual = Residual(test.grid,\n", + " truth.grid,\n", + " chi_finder_kwds={\"min_drainage_area\": 1000})\n", + " residual.add_from_file(StringIO(residual_string))\n", + " residual.calculate()\n", + "\n", + " values = {name: residual.value(name) for name in residual.names}\n", + " out[(duration, water_erodibility)] = values" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Before looking at the results, let's inspect the residual class. \n", + "\n", + "The property `residual.names` has a name for each of the six categories. The temporary strings `{field_1_level}` and `{field_2_level}` have been replaced with actual levels. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "residual.names" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The property `residual.values` has one value for each of the six categories. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "residual.values" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can plot the category using the `residual.category` property. Here each category gets its own panel. For example, the leftmost column represents the cells with `channel__chi_index` values in the lower 30%. The upper left panel is those that have the lower half of elevation values. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(2, 3, dpi=150)\n", + "\n", + "for i, name in enumerate(residual.names):\n", + " col, row = np.unravel_index(i, (3,2))\n", + " plt.sca(axes[row, col])\n", + " imshow_grid(truth.grid, residual.category==(i+1), cmap=\"cividis\", allow_colorbar=False)\n", + " plt.title(name)\n", + " plt.xticks([])\n", + " plt.yticks([])\n", + " plt.xlabel(None)\n", + " plt.ylabel(None)\n", + "\n", + "plt.tight_layout()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we compile the output into a pandas dataframe and inspect. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df = pd.DataFrame.from_dict(out, orient=\"index\")\n", + "df.index.names = [\"duration\", \"water_erodibility\"]\n", + "df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Similar to the earlier notebook, we melt this dataframe in order to plot it. We also rename the column \"value\" to \"sum_of_squared_residuals\", as this is what `discretized_misfit` calculates. 
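\n",
+    "\n",
+    "After melting, the `variable` column holds the category names built from the `name` pattern in the input file above. The pattern expands with plain [`str.format`](https://docs.python.org/3/library/stdtypes.html#str.format) semantics, so (as a sketch, with illustrative level numbers) the first chi band paired with the second elevation band is keyed like this:\n",
+    "\n",
+    "```python\n",
+    "\"chi_{field_1_level}.z_{field_2_level}\".format(field_1_level=1, field_2_level=2)\n",
+    "# -> 'chi_1.z_2'\n",
+    "```\n",
+    "\n",
+    "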
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df_melt = df.reset_index().melt(id_vars=[\"duration\", \"water_erodibility\"])\n", + "df_melt = df_melt.rename(columns={\"value\": \"sum_of_squared_residuals\"})\n", + "df_melt.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We also use the [string.split](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.split.html) pandas function in order to turn our variable name into two columns" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "variable_split = df_melt[\"variable\"].str.split(\".\", n=1, expand=True)\n", + "df_melt[\"chi\"] = variable_split[0]\n", + "df_melt[\"elev\"] = variable_split[1]\n", + "df_melt.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally we plot the squared residual as a function:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "p1 = (ggplot(df_melt,\n", + " aes(x=\"duration\", y=\"water_erodibility\",\n", + " fill=\"sum_of_squared_residuals\")) + geom_tile() +\n", + " geom_point(aes(x=truth_duration, y=truth_water_erodibility)) +\n", + " scale_fill_continuous(limits=[1, 100], trans=\"log10\") +\n", + " facet_grid(\"elev~chi\") + theme_bw() + scale_x_log10() + scale_y_log10() +\n", + " coord_equal())\n", + "print(p1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can see that the misfit function has a similar shape in `water_erodibility`-`duration` space as the misfit functions from the [prior example notebook](ExampleApplication.ipynb). However, you can see that the specific location of the misfit surface minimum varies in its location in parameter space depending on which of the discretized residual categories is examined. \n", + "\n", + "# Next Steps\n", + "Congratulations on finishing the four part series of introductory notebooks. The next steps are to use umami in your own application. " + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/notebooks/ExampleApplication.ipynb b/notebooks/ExampleApplication.ipynb new file mode 100644 index 0000000..e8d7484 --- /dev/null +++ b/notebooks/ExampleApplication.ipynb @@ -0,0 +1,437 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Part 3: Example application\n", + "\n", + "By now, you should have looked through [Part 1](IntroductionToMetric.ipynb) and [Part 2](IntroductionToResidual.ipynb) of the introductory notebook series. These introduced the umami `Metric` and `Residual` classes. \n", + "\n", + "## Scope\n", + "\n", + "In this application we will use umami alongside the [terrainbento](https://terrainbento.readthedocs.io/en/latest/) package. Terrainbento will be used to define a landscape evolution model, the details of which will be defined below. \n", + "\n", + "We will define a \"synthetic truth\" model evaluation with a specific set of input parameters, and then do a grid search in which we let two of those parameters vary. 
In this way we will explore which statistics for model-data comparison do best at identifying the \"true\" parameters. \n",
+    "\n",
+    "If you have comments or questions about the notebooks, the best place to get help is through [GitHub Issues](https://github.com/TerrainBento/umami/issues).\n",
+    "\n",
+    "To begin, we import necessary modules. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import warnings\n",
+    "warnings.filterwarnings('ignore')\n",
+    "\n",
+    "from io import StringIO\n",
+    "from itertools import product\n",
+    "from tqdm import tqdm\n",
+    "\n",
+    "import numpy as np\n",
+    "\n",
+    "import pandas as pd\n",
+    "\n",
+    "import matplotlib\n",
+    "import matplotlib.pyplot as plt\n",
+    "%matplotlib inline\n",
+    "\n",
+    "from plotnine import *\n",
+    "\n",
+    "import holoviews as hv\n",
+    "hv.notebook_extension('matplotlib')\n",
+    "\n",
+    "from landlab import imshow_grid\n",
+    "from terrainbento import Basic\n",
+    "from umami import Metric, Residual"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Step 1: Define the truth model\n",
+    "\n",
+    "We begin with an input string that defines the terrainbento model. We will use the simplest terrainbento model, called Basic. \n",
+    "\n",
+    "It evolves topography using stream power and linear diffusion and has the following governing equation:\n",
+    "\n",
+    "$\\frac{\\partial \\eta}{\\partial t} = -KQ^{1/2}S + D\\nabla^2 \\eta$\n",
+    "\n",
+    "where $K$ and $D$ are parameters, $Q$ is discharge, $S$ is local slope (positive downward), and $\\eta$ is the topography. See the [model Basic documentation](https://terrainbento.readthedocs.io/en/latest/source/terrainbento.derived_models.model_basic.html) for additional information. \n",
+    "\n",
+    "In this input file we also indicate that the model will run with timesteps of 500 yr and that the model grid will have a shape of (50, 80), with grid cell spacing of 100 m. The input file specifies that the model initial condition has all nodes set at an elevation of 100 m, with random noise added to the core nodes. During the model run, the boundary conditions are set to have node number 40 drop at a constant rate, for a total of 100 m of lowering over the course of the simulation. \n",
+    "\n",
+    "Note that a few places in the input file have curly braces around a name. These are as follows:\n",
+    "* Two input parameters, `{duration}` and `{water_erodibility}`, are modified using [`str.format`](https://docs.python.org/3/library/stdtypes.html#str.format) (see the sketch after this list). In this way we set the values for the \"truth\" model run and vary the parameters in a grid search numerical experiment.\n",
+    "* We set the `{lowering_rate}` based on the value for duration so that 100 m of lowering occurs over the model run. \n",
+    "* We also format the `{name}` of output files in order to prevent Windows file permissions errors. 
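\n",
+    "\n",
+    "As a minimal sketch of that templating (standard-library `str.format` only; the snippet and value here are illustrative, not the full input file):\n",
+    "\n",
+    "```python\n",
+    "template = \"clock:\\n    stop: {duration}\\n\"\n",
+    "template.format(duration=1e4)  # -> 'clock:\\n    stop: 10000.0\\n'\n",
+    "```\n",
+    "\n",
+    "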
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "spec_string = \"\"\"\n", + "# Create the Clock.\n", + "clock:\n", + " start: 0\n", + " step: 500\n", + " stop: {duration}\n", + "\n", + "# Create the Grid\n", + "grid: \n", + " RasterModelGrid: \n", + " - [50, 80]\n", + " - xy_spacing: 100\n", + " - fields: \n", + " node: \n", + " topographic__elevation:\n", + " random:\n", + " where: CORE_NODE\n", + " constant:\n", + " value: 100\n", + " \n", + "# Set up Boundary Handlers\n", + "boundary_handlers: \n", + " SingleNodeBaselevelHandler: \n", + " outlet_id: 40\n", + " lowering_rate: -{lowering_rate}\n", + "\n", + "# Parameters that control output.\n", + "output_interval: 1e3\n", + "save_first_timestep: True\n", + "output_prefix: \n", + " simple_application.{name}.\n", + "fields: \n", + " - topographic__elevation\n", + "\n", + "# Parameters that control process and rates.\n", + "water_erodibility: {water_erodibility}\n", + "m_sp: 0.5\n", + "n_sp: 1.0\n", + "regolith_transport_parameter: 0.1\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we instantiate the \"truth\" model and run it. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "truth_duration = 1e4\n", + "truth_water_erodibility = 0.0005\n", + "\n", + "lowering_rate = 100 / truth_duration\n", + "\n", + "truth_params = StringIO(\n", + " spec_string.format(duration=truth_duration,\n", + " water_erodibility=truth_water_erodibility,\n", + " lowering_rate=lowering_rate,\n", + " name=\"truth\"))\n", + "np.random.seed(42)\n", + "truth = Basic.from_file(truth_params)\n", + "truth.run()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The [holoviews](https://holoviews.org) package provides capabilities to visualize the model run. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ds = truth.to_xarray_dataset(time_unit='years', space_unit='meters')\n", + "hvds_topo = hv.Dataset(ds.topographic__elevation)\n", + "topo = hvds_topo.to(hv.Image, ['x', 'y'],\n", + " label='Truth').options(interpolation='bilinear',\n", + " cmap='viridis',\n", + " colorbar=True)\n", + "topo" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can see that in this model a drainage basin incises into the 100m high topography. This makes sense as we have dropped the elevation of node 40 by 100 m over the simulation. \n", + "\n", + "Before moving on, we close the xarray dataset and remove the output netcdf files. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "ds.close()\n", + "truth.remove_output_netcdfs()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 2: Define the basis for model-data comparison\n", + "\n", + "We consider six different statistics for model data comparison, each defined in the following code block (which serves as our input file):\n", + "\n", + "* z_me : the mean of `topographic__elevation`.\n", + "* z_p10 : the 10th percentile of `topographic__elevation`.\n", + "* z_wsmean : the mean of `topographic__elevation` *within* the watershed that drains to node 40.\n", + "* ksw_z : the Kolmogorov-Smirnov test statistic for `topographic__elevation` *within* the watershed that drains to node 40.\n", + "* ksw_da : the Kolmogorov-Smirnov test statistic for `drainage_area` *within* the watershed that drains to node 40.\n", + "* ksw_s : the Kolmogorov-Smirnov test statistic for `topographic__steepest_slope` *within* the watershed that drains to node 40.\n", + "\n", + "Consider reading the API documentation for the [kstest_watershed](https://umami.readthedocs.io/en/latest/umami.calculations.residual.ks_test.html#umami.calculations.residual.kstest.kstest_watershed) calculation." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "residual_string = \"\"\"\n", + "z_me:\n", + " _func: aggregate\n", + " method: mean\n", + " field: topographic__elevation\n", + "z_p10:\n", + " _func: aggregate\n", + " method: percentile\n", + " field: topographic__elevation\n", + " q: 10\n", + "z_wsmean:\n", + " _func: watershed_aggregation\n", + " field: topographic__elevation\n", + " method: mean\n", + " outlet_id: 40\n", + "ksw_z:\n", + " _func: kstest_watershed\n", + " outlet_id: 40\n", + " field: topographic__elevation\n", + "ksw_da:\n", + " _func: kstest_watershed\n", + " outlet_id: 40\n", + " field: drainage_area\n", + "ksw_s:\n", + " _func: kstest_watershed\n", + " outlet_id: 40\n", + " field: topographic__steepest_slope\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 3: Create and run the grid search experiment\n", + "\n", + "In this example, we will use a grid search to highlight how the misfit values calculated by umami vary across parameter space. \n", + "\n", + "We consider values for `duration` between $10^{3}$ and $10^{5}$ and values for $K$ (`water_erodibility`) between $10^{-4}$ and $10^{-2}$.\n", + "\n", + "With a resolution of 10, we evaluate $10^2=100$ simulations. Feel free to change the resolution value, though note that it will impact the run time of this notebook. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "resolution = 10\n", + "durations = np.logspace(3, 5, num=resolution)\n", + "water_erodibilitys = np.logspace(-4, -2, num=resolution)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We evaluate each pair of duration and water erodability and save the model output as a dictionary. With the line \n", + "\n", + " #np.random.seed(42)\n", + "\n", + "commented out, each evaluation uses a different random seed. Feel free to uncomment this line to see how the results change if the *exact same* random seed is used for each model integration. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "out = {}\n", + "for i, (duration,\n", + " water_erodibility) in enumerate(tqdm(product(durations,\n", + " water_erodibilitys))):\n", + " lowering_rate = 100 / duration\n", + " test_params = StringIO(\n", + " spec_string.format(duration=duration,\n", + " water_erodibility=water_erodibility,\n", + " lowering_rate=lowering_rate,\n", + " name=i))\n", + " #np.random.seed(42)\n", + " test = Basic.from_file(test_params)\n", + " test.run()\n", + "\n", + " test.remove_output_netcdfs()\n", + "\n", + " residual = Residual(test.grid, truth.grid)\n", + " residual.add_from_file(StringIO(residual_string))\n", + " residual.calculate()\n", + "\n", + " values = {name: residual.value(name) for name in residual.names}\n", + " out[(duration, water_erodibility)] = values" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 4: Compile output, inspect, and plot\n", + "\n", + "Next we will convert the output into a [pandas](http://pandas.pydata.org) dataframe and inspect it. The dataframe has two indices, the `duration` and `water_erodibility`. It has six columns, one each for the six outputs we defined above. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df = pd.DataFrame.from_dict(out, orient=\"index\")\n", + "df.index.names = [\"duration\", \"water_erodibility\"]\n", + "df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In order to plot the results easily, we will use [plotnine](http://plotnine.readthedocs.io), which provides a [ggplot](http://ggplot2.tidyverse.org) implementation in python. We will also need to [melt](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.melt.html) the dataframe from wide format to long format. \n", + "\n", + "After doing this, and inspecting, we can see that we now have a column for `duration`, `water_erodibility`, the output variable, its value, and the associated squared residual. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df_melt = df.reset_index().melt(id_vars=[\"duration\", \"water_erodibility\"])\n", + "df_melt[\"squared_residual\"] = df_melt.value**2\n", + "df_melt.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will make two plots, the first of which plots the three Kolmogorov-Smirnov test statistics. The white dot indicates the location of the \"truth\". \n", + "\n", + "You can see that there is a zone of low misfit in the region of the truth parameters, but that good fits can be found elsewhere. We can also see that there is correlation between `water_erodability` and `duration`. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "p1 = (ggplot(df_melt[df_melt.variable.str.startswith(\"ksw\")],\n", + " aes(x=\"duration\", y=\"water_erodibility\",\n", + " fill=\"squared_residual\")) + geom_tile() +\n", + " geom_point(aes(x=truth_duration, y=truth_water_erodibility)) +\n", + " scale_fill_continuous(limits=[0.001, 1], trans=\"log10\") +\n", + " facet_wrap(\"~variable\") + theme_bw() + scale_x_log10() +\n", + " scale_y_log10() + coord_equal())\n", + "print(p1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally we plot the three statistics that relate to the topographic elevation. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "p2 = (\n", + " ggplot(df_melt[df_melt.variable.str.startswith(\"z\")],\n", + " aes(x=\"duration\", y=\"water_erodibility\", fill=\"squared_residual\")) +\n", + " geom_tile() + scale_fill_continuous(limits=[0.001, 1000], trans=\"log10\") +\n", + " geom_point(aes(x=truth_duration, y=truth_water_erodibility)) +\n", + " facet_wrap(\"~variable\") + theme_bw() + scale_x_log10() + scale_y_log10() +\n", + " coord_equal())\n", + "print(p2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Next steps\n", + "\n", + "The next step is the final notebook in the four part introductory series: [Part 4: Application using the Discretized Misfit calculation](DiscretizedMisfit.ipynb)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/notebooks/IntroductionToMetric.ipynb b/notebooks/IntroductionToMetric.ipynb new file mode 100644 index 0000000..ed520e8 --- /dev/null +++ b/notebooks/IntroductionToMetric.ipynb @@ -0,0 +1,381 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Part 1: Introduction to Umami and the `Metric` Class\n", + "\n", + "Umami is a package for calculating metrics for use with for Earth surface dynamics models. This notebook is the first half of a two-part introduction to using umami.\n", + "\n", + "Umami was designed to work well with the [terrainbento](https://terrainbento.readthedocs.io/en/latest/) model package, as well as other models built using the [Landlab Toolkit](https://github.com/landlab/landlab). \n", + "\n", + "## Scope of this tutorial\n", + "\n", + "In this tutorial you will learn the basic principles behind the umami package and use the `Metric` class to calculate terrain statistics. \n", + "\n", + "If you have comments or questions about the notebooks, the best place to get help is through [GitHub Issues](https://github.com/TerrainBento/umami/issues).\n", + "\n", + "To begin this example, we will import the required python packages. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "ename": "ModuleNotFoundError", + "evalue": "No module named 'umami'", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mio\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mStringIO\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mlandlab\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mRasterModelGrid\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mimshow_grid\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0;32mfrom\u001b[0m \u001b[0mumami\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mMetric\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 7\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mumami\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcalculations\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0maggregate\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'umami'" + ] + } + ], + "source": [ + "import warnings\n", + "warnings.filterwarnings('ignore')\n", + "\n", + "from io import StringIO\n", + "from landlab import RasterModelGrid, imshow_grid\n", + "from umami import Metric\n", + "from umami.calculations import aggregate" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 1: Create an example grid\n", + "First we will create an example grid and add a field called `topographic__elevation` to it. The grid shown here has a shape of (10x10) and slopes to the south-west.\n", + "\n", + "The Landlab grid is used as the core data structure for umami as it provides important information about grid size, shape, adjacency. It, however, is very flexible as it is compatible with regular and irregular grids and can be specified with a numpy array. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "grid = RasterModelGrid((10, 10))\n", + "z = grid.add_zeros(\"node\", \"topographic__elevation\")\n", + "z += grid.x_of_node + grid.y_of_node\n", + "\n", + "imshow_grid(grid, z)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 2: Use the `umami.Metric` class\n", + "\n", + "Umami was designed to facilitate the calculation of topographic metrics for use in model analysis. For example, say one wanted to assess a model's performance based on the mean topography. On one hand, this can simply be done by evaluating `np.mean(z)` and writing the results out to an output file. \n", + "\n", + "However, if the set of metrics becomes more complex, then extensive analysis must occur in the driver script or function used to run the model and produce model analysis output. If one wanted to take multiple aggreggation statistics on a single field (e.g., `topographic__elevation`) and to also calculate other metrics, then using the umami package becomes competitive relative to hand-calculating a few metrics. 
\n", + "\n", + "To use the umami [`Metric`](https://umami.readthedocs.io/en/latest/umami.metric.html) class we need to pass a grid, such as that defined above, as well as a dictionary indicating what metrics we want calculated. Metrics can be added after instantiation, and can also be added using an input file in [YAML](https://yaml.org) format. \n", + "\n", + "Each calculation gets its own unique name (the key in the dictionary), and is associated with a value, a dictionary specifying exactly what should be calculated. The only value of the dictionary required by all umami calculations is `_func`, which indicates which of the [`umami.calculations`](https://umami.readthedocs.io/en/latest/umami.calculations.html) will be performed. Subsequent elements of this dictionary are the required inputs to the calculation function and are described in their documentation. \n", + "\n", + "Note that some calculations listed in the [`umami.calculations`](https://umami.readthedocs.io/en/latest/umami.calculations.html) submodule are valid for both the umami `Metric` and `Residual` classes, while others are for `Residual`s only (the `Residual` will be covered in [Part 2](IntroductionToResidual.ipynb) of this notebook series). \n", + "\n", + "The order that calculations are listed is read in as an [OrderedDict](https://docs.python.org/3/library/collections.html#collections.OrderedDict) and retained as the \"calculation order\". \n", + "\n", + "In our example we will use the following dictionary: \n", + "\n", + "```python\n", + "metrics = {\n", + " \"me\": {\n", + " \"_func\": \"aggregate\",\n", + " \"method\": \"mean\",\n", + " \"field\": \"topographic__elevation\"\n", + " },\n", + " \"ep10\": {\n", + " \"_func\": \"aggregate\",\n", + " \"method\": \"percentile\",\n", + " \"field\": \"topographic__elevation\",\n", + " \"q\": 10\n", + " }\n", + "}\n", + "```\n", + "This specifies calculation of the mean of `topographic__elevation` (to be called \"me\") and the 10th percentile `topographic__elevation` (called \"ep10\"). The equivalent portion of a YAML input file would look like:\n", + "\n", + "```yaml\n", + "metrics:\n", + " me:\n", + " _func: aggregate\n", + " method: mean\n", + " field: topographic__elevation\n", + " ep10:\n", + " _func: aggregate\n", + " method: percentile\n", + " field: topographic__elevation\n", + " q: 10\n", + "```\n", + "\n", + "Next, let's construct the Metric class. It is important to note that, the umami [`aggregate`](https://umami.readthedocs.io/en/latest/umami.calculations.metric.aggregate.html) function only operates on the core nodes of the model grid. Use the link above to read the API documentation that describes in detail what values the `aggreggate` calculation expects. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "metrics = {\n", + " \"me\": {\n", + " \"_func\": \"aggregate\",\n", + " \"method\": \"mean\",\n", + " \"field\": \"topographic__elevation\"\n", + " },\n", + " \"ep10\": {\n", + " \"_func\": \"aggregate\",\n", + " \"method\": \"percentile\",\n", + " \"field\": \"topographic__elevation\",\n", + " \"q\": 10\n", + " }\n", + "}\n", + "\n", + "metric = Metric(grid, metrics=metrics)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The umami `Metric` does not calculate the metrics when it is instantiated. During instantiation, umami performs many checks, for example, ensuring that all needed fields are present on the grid. 
To calculate the metrics, run the `calculate` bound method. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "metric.calculate()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that the metrics have been calculated we highlight the most useful attributes of the `Metric class`.\n", + "\n", + "`metric.names` gives the names of the metrics as a list, in calculation order. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "metric.names" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`metric.values` gives the values of the metrics as a list, in calculation order. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "metric.values" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "A function is available to get the value of a given metric, given a name." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "metric.value(\"me\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 3: Use umami calculations without the `Metric` class\n", + "\n", + "Each of the umami calculation functions (for example `aggregate`) can be used as a stand-alone function without the `Metric` class. For example, the following is how one would use it to perform the two calculations done above:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "aggregate(grid, method=\"mean\", field=\"topographic__elevation\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "aggregate(grid, method=\"percentile\", field=\"topographic__elevation\", q=10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The API documentation for each calculation shows its use as part of a `Metric` or `Residual`, and as a stand-alone function.\n", + "\n", + "## Step 4: Add new metrics\n", + "\n", + "Next we will highlight the ability to add new metrics to the umami `Metric` by adding the variance and the 30th percentile. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "new_metrics = {\n", + " \"va\": {\n", + " \"_func\": \"aggregate\",\n", + " \"method\": \"var\",\n", + " \"field\": \"topographic__elevation\"\n", + " },\n", + " \"ep30\": {\n", + " \"_func\": \"aggregate\",\n", + " \"method\": \"percentile\",\n", + " \"field\": \"topographic__elevation\",\n", + " \"q\": 30\n", + " }\n", + "}\n", + "\n", + "metric.add_from_dict(new_metrics)\n", + "metric.calculate()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First we examine the names, " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "metric.names" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And then the values. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "metric.values" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 5: Write output\n", + "\n", + "Umami was designed to interface well with model analysis tools like [`scipy.optimize`](https://docs.scipy.org/doc/scipy/reference/optimize.html) or [Dakota](http://dakota.sandia.gov). To this end, in addition to the output provided in `metric.names` and `metric.values`, two additional methods for writing output are built in:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "out = StringIO()\n", + "metric.write_metrics_to_file(out, style=\"dakota\")\n", + "file_contents = out.getvalue().splitlines()\n", + "for line in file_contents:\n", + " print(line.strip())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "out = StringIO()\n", + "metric.write_metrics_to_file(out, style=\"yaml\")\n", + "file_contents = out.getvalue().splitlines()\n", + "for line in file_contents:\n", + " print(line.strip())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Next steps\n", + "\n", + "Now that you have a sense for how the `Metric` class is used, try the next notebook: [Part 2: Introduction to Umami and the `Residual` Class](IntroductionToResidual.ipynb)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/notebooks/IntroductionToResidual.ipynb b/notebooks/IntroductionToResidual.ipynb new file mode 100644 index 0000000..1b60348 --- /dev/null +++ b/notebooks/IntroductionToResidual.ipynb @@ -0,0 +1,306 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Part 2: Introduction to Umami and the `Residual` Class\n", + "\n", + "Umami is a package for calculating metrics for use with for Earth surface dynamics models. This notebook is the second half of a two-part introduction to using umami.\n", + "\n", + "## Scope of this tutorial\n", + "\n", + "Before starting this tutorial, you should have completed [Part 1: Introduction to Umami and the `Metric` Class](IntroductionToMetric.ipynb).\n", + "\n", + "In this tutorial you will learn the basic principles behind using the `Residual` class to compare models and data using terrain statistics. \n", + "\n", + "If you have comments or questions about the notebooks, the best place to get help is through [GitHub Issues](https://github.com/TerrainBento/umami/issues).\n", + "\n", + "To begin this example, we will import the required python packages. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import warnings\n", + "warnings.filterwarnings('ignore')\n", + "\n", + "from io import StringIO\n", + "import numpy as np\n", + "from landlab import RasterModelGrid, imshow_grid\n", + "from umami import Residual" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 1 Create grids\n", + "\n", + "Unlike the first notebook, here we need to compare model and data. We will create two grids, the `model_grid` and the `data_grid` each with a field called field called `topographic__elevation` to it. Both are size (10x10). The `data_grid` slopes to the south-west, while the `model_grid` has some additional noise added to it. \n", + "\n", + "First, we construct and plot the `data_grid`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "data_grid = RasterModelGrid((10, 10))\n", + "data_z = data_grid.add_zeros(\"node\", \"topographic__elevation\")\n", + "data_z += data_grid.x_of_node + data_grid.y_of_node\n", + "\n", + "imshow_grid(data_grid, data_z)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we construct and plot `model_grid`. It differs only in that it has random noise added to the core nodes. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "np.random.seed(42)\n", + "\n", + "model_grid = RasterModelGrid((10, 10))\n", + "model_z = model_grid.add_zeros(\"node\", \"topographic__elevation\")\n", + "model_z += model_grid.x_of_node + model_grid.y_of_node\n", + "model_z[model_grid.core_nodes] += np.random.randn(model_grid.core_nodes.size)\n", + "imshow_grid(model_grid, model_z)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can difference the two grids to see how they differ. As expected, it looks like normally distributed noise. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "imshow_grid(model_grid, data_z - model_z, cmap=\"seismic\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This example shows a difference map with 64 residuals on it. A more realistic application with a much larger domain would have tens of thousands. Methods of model analysis such as calibration and sensitivity analysis need model output, such as the topography shown here, to be distilled into a smaller number of values. This is the task that umami facilitates. \n", + "\n", + "## Step 2: Construct an umami `Residual`\n", + "\n", + "Similar to constructing a `Metric`, a residual is specified by a dictionary or YAML-style input file. \n", + "\n", + "Here we repeat some of the content of the prior notebook:\n", + "\n", + "Each calculation gets its own unique name (the key in the dictionary), and is associated with a value, a dictionary specifying exactly what should be calculated. The only value of the dictionary required by all umami calculations is `_func`, which indicates which of the [`umami.calculations`](https://umami.readthedocs.io/en/latest/umami.calculations.html) will be performed. Subsequent elements of this dictionary are the required inputs to the calculation function and are described in their documentation. 
\n", + "\n", + "Note that some calculations listed in the [`umami.calculations`](https://umami.readthedocs.io/en/latest/umami.calculations.html) submodule are valid for both the umami `Metric` and `Residual` classes, while others are for `Residual`s only (the `Metric` class was covered in [Part 1](IntroductionToMetric.ipynb) of this notebook series). \n", + "\n", + "The order that calculations are listed is read in as an [OrderedDict](https://docs.python.org/3/library/collections.html#collections.OrderedDict) and retained as the \"calculation order\". \n", + "\n", + "In our example we will use the following dictionary: \n", + "\n", + "```python\n", + "residuals = {\n", + " \"me\": {\n", + " \"_func\": \"aggregate\",\n", + " \"method\": \"mean\",\n", + " \"field\": \"topographic__elevation\"\n", + " },\n", + " \"ep10\": {\n", + " \"_func\": \"aggregate\",\n", + " \"method\": \"percentile\",\n", + " \"field\": \"topographic__elevation\",\n", + " \"q\": 10\n", + " }\n", + "}\n", + "```\n", + "This specifies calculation of the mean of `topographic__elevation` (to be called \"me\") and the 10th percentile `topographic__elevation` (called \"ep10\"). The equivalent portion of a YAML input file would look like:\n", + "\n", + "```yaml\n", + "residuals:\n", + " me:\n", + " _func: aggregate\n", + " method: mean\n", + " field: topographic__elevation\n", + " ep10:\n", + " _func: aggregate\n", + " method: percentile\n", + " field: topographic__elevation\n", + " q: 10\n", + "```\n", + "\n", + "The following code constructs the `Residual`. Note that the only difference with the prior notebook is that instead of specifying only one grid, here we provide two. Under the hood umami checkes that the grids are compatible and will raise errors if they are not." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "residuals = {\n", + " \"me\": {\n", + " \"_func\": \"aggregate\",\n", + " \"method\": \"mean\",\n", + " \"field\": \"topographic__elevation\"\n", + " },\n", + " \"ep10\": {\n", + " \"_func\": \"aggregate\",\n", + " \"method\": \"percentile\",\n", + " \"field\": \"topographic__elevation\",\n", + " \"q\": 10\n", + " }\n", + "}\n", + "residual = Residual(model_grid, data_grid, residuals=residuals)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To calculate the residuals, run the `calculate` bound method. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "residual.calculate()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Just like `Metric` classes, the `Residual` has some usefull methods and attributes. \n", + "\n", + "`residual.names` gives the names as a list, in calculation order. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "residual.names" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`residual.values` gives the values as a list, in calculation order. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "residual.values" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And a function is available to get the value of a given metric." 
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "residual.value(\"me\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Step 3: Write output\n",
+    "\n",
+    "The methods for writing output available in `Metric` are also provided by `Residual`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "out = StringIO()\n",
+    "residual.write_residuals_to_file(out, style=\"dakota\")\n",
+    "file_contents = out.getvalue().splitlines()\n",
+    "for line in file_contents:\n",
+    "    print(line.strip())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "out = StringIO()\n",
+    "residual.write_residuals_to_file(out, style=\"yaml\")\n",
+    "file_contents = out.getvalue().splitlines()\n",
+    "for line in file_contents:\n",
+    "    print(line.strip())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Next steps\n",
+    "\n",
+    "Now that you have a sense for how the `Metric` and `Residual` classes are used, try the next notebook: [Part 3: Example application](ExampleApplication.ipynb)."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/notebooks/Simple Umami Application.ipynb b/notebooks/Simple Umami Application.ipynb
deleted file mode 100644
index 8ae6822..0000000
--- a/notebooks/Simple Umami Application.ipynb
+++ /dev/null
@@ -1,296 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import warnings\n",
-    "warnings.filterwarnings('ignore')\n",
-    "\n",
-    "from io import StringIO\n",
-    "from itertools import product\n",
-    "\n",
-    "import numpy as np\n",
-    "\n",
-    "import pandas as pd\n",
-    "\n",
-    "import matplotlib\n",
-    "import matplotlib.pyplot as plt\n",
-    "%matplotlib inline\n",
-    "\n",
-    "from plotnine import *\n",
-    "\n",
-    "import holoviews as hv\n",
-    "hv.notebook_extension('matplotlib')\n",
-    "\n",
-    "from landlab import imshow_grid\n",
-    "from terrainbento import Basic\n",
-    "from umami import Metric, Residual"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "spec_string = \"\"\"\n",
-    "# Create the Clock.\n",
-    "clock:\n",
-    "    start: 0\n",
-    "    step: 500\n",
-    "    stop: {duration}\n",
-    "\n",
-    "# Create the Grid\n",
-    "grid: \n",
-    "    RasterModelGrid: \n",
-    "        - [50, 80]\n",
-    "        - xy_spacing: 100\n",
-    "        - fields: \n",
-    "            node: \n",
-    "                topographic__elevation:\n",
-    "                    random:\n",
-    "                        where: CORE_NODE\n",
-    "                    constant:\n",
-    "                        value: 100\n",
-    "    \n",
-    "# Set up Boundary Handlers\n",
-    "boundary_handlers: \n",
-    "    SingleNodeBaselevelHandler: \n",
-    "        outlet_id: 40\n",
-    "        lowering_rate: -{lowering_rate}\n",
-    "\n",
-    "# Parameters that control output.\n",
-    "output_interval: 1e3\n",
-    "save_first_timestep: True\n",
-    "output_prefix: \n",
-    "    simple_application.{name}.\n",
-    "fields: \n",
-    "    - topographic__elevation\n",
-    "\n",
-    "# 
Parameters that control process and rates.\n", - "water_erodibility: {water_erodibility}\n", - "m_sp: 0.5\n", - "n_sp: 1.0\n", - "regolith_transport_parameter: 0.1\n", - "\"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "truth_duration = 1e4\n", - "truth_water_erodibility = 0.0005\n", - "\n", - "lowering_rate = 100/truth_duration\n", - "\n", - "truth_params = StringIO(\n", - " spec_string.format(duration=truth_duration,\n", - " water_erodibility=truth_water_erodibility,\n", - " lowering_rate=lowering_rate, \n", - " name=\"truth\"))\n", - "np.random.seed(42)\n", - "truth = Basic.from_file(truth_params)\n", - "truth.run()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ds = truth.to_xarray_dataset(time_unit='years', space_unit='meters')\n", - "hvds_topo = hv.Dataset(ds.topographic__elevation)\n", - "topo = hvds_topo.to(hv.Image, ['x', 'y'],\n", - " label='Truth').options(interpolation='bilinear',\n", - " cmap='viridis',\n", - " colorbar=True)\n", - "topo" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ds.close()\n", - "truth.remove_output_netcdfs()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "residual_string = \"\"\"\n", - "z_me:\n", - " _func: aggregate\n", - " method: mean\n", - " field: topographic__elevation\n", - "z_p10:\n", - " _func: aggregate\n", - " method: percentile\n", - " field: topographic__elevation\n", - " q: 10\n", - "z_wsmean:\n", - " _func: watershed_aggregation\n", - " field: topographic__elevation\n", - " method: mean\n", - " outlet_id: 40\n", - "ksw_z:\n", - " _func: kstest_watershed\n", - " outlet_id: 40\n", - " field: topographic__elevation\n", - "ksw_da:\n", - " _func: kstest_watershed\n", - " outlet_id: 40\n", - " field: drainage_area\n", - "ksw_s:\n", - " _func: kstest_watershed\n", - " outlet_id: 40\n", - " field: topographic__steepest_slope\n", - "\"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "resolution = 10\n", - "durations = np.logspace(3, 5, num=resolution)\n", - "water_erodibilitys = np.logspace(-4, -2, num=resolution)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "out = {}\n", - "for i, (duration, water_erodibility) in enumerate(product(durations, water_erodibilitys)):\n", - " lowering_rate = 100/duration\n", - " test_params = StringIO(\n", - " spec_string.format(duration=duration, \n", - " water_erodibility=water_erodibility, \n", - " lowering_rate=lowering_rate, \n", - " name=i))\n", - " #np.random.seed(42)\n", - " test = Basic.from_file(test_params)\n", - " test.run()\n", - "\n", - " test.remove_output_netcdfs()\n", - " \n", - " residual = Residual(test.grid, truth.grid)\n", - " residual.add_from_file(StringIO(residual_string))\n", - " residual.calculate()\n", - "\n", - " values = {name: residual.value(name) for name in residual.names}\n", - " out[(duration, water_erodibility)] = values" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df = pd.DataFrame.from_dict(out, orient=\"index\")\n", - "df.index.names = [\"duration\", \"water_erodibility\"]\n", - "df.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": 
[ - "df_melt = df.reset_index().melt(id_vars=[\"duration\", \"water_erodibility\"])\n", - "df_melt[\"squared_residual\"] = df_melt.value**2\n", - "df_melt.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "p1 = (ggplot(df_melt[df_melt.variable.str.startswith(\"ksw\")],\n", - " aes(x=\"duration\", y=\"water_erodibility\", fill=\"squared_residual\")) +\n", - " geom_tile() + \n", - " geom_point(aes(x=truth_duration, y=truth_water_erodibility)) +\n", - " scale_fill_continuous(limits=[0.001,1], trans=\"log10\") +\n", - " facet_wrap(\"~variable\") + \n", - " theme_bw() + \n", - " scale_x_log10() +\n", - " scale_y_log10() +\n", - " coord_equal())\n", - "\n", - "p2 = (ggplot(df_melt[df_melt.variable.str.startswith(\"z\")],\n", - " aes(x=\"duration\", y=\"water_erodibility\", fill=\"squared_residual\")) +\n", - " geom_tile() + \n", - " scale_fill_continuous(limits=[0.001,1000], trans=\"log10\") +\n", - " geom_point(aes(x=truth_duration, y=truth_water_erodibility)) +\n", - " facet_wrap(\"~variable\") + theme_bw() + scale_x_log10() +\n", - " scale_y_log10() + coord_equal())\n", - "\n", - "print(p1)\n", - "print(p2)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.4" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/notebooks/Welcome.ipynb b/notebooks/Welcome.ipynb new file mode 100644 index 0000000..00530fc --- /dev/null +++ b/notebooks/Welcome.ipynb @@ -0,0 +1,64 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Umami notebooks\n", + "\n", + "Welcome to the umami notebooks. This page provides links to notebooks that provide an introduction to umami and its use. We recommend that you look at them in the following order.\n", + "\n", + "First, look at two notebooks designed to introduce you to the core classes and methods of umami.\n", + "\n", + " * [Part 1: Introduction to umami and the `Metric` class](IntroductionToMetric.ipynb)\n", + " * [Part 2: Introduction to the `Residual` class](IntroductionToResidual.ipynb)\n", + " \n", + "Then, look at two notebooks with example applications.\n", + " \n", + " * [Part 3: Example Application](ExampleApplication.ipynb)\n", + " * [Part 4: Application using the Discretized Misfit calculation](DiscretizedMisfit.ipynb)\n", + " \n", + "If you have comments or questions about the notebooks, the best place to get help is through [GitHub Issues](https://github.com/TerrainBento/umami/issues).\n", + "\n", + "## Some background\n", + "\n", + "### What is umami?\n", + "\n", + "Umami is a package for calculating objective functions or objective function components for Earth surface dynamics modeling. 
It was designed to work well with [terrainbento](https://github.com/TerrainBento/terrainbento) and other models built with the [Landlab Toolkit](https://github.com/landlab/landlab).\n", + "\n", + "Umami offers two primary classes:\n", + "* a [`Residual`](https://umami.readthedocs.io/en/latest/umami.residual.html#Residual),\n", + "which represents the difference between model and data, and \n", + "* a [`Metric`](https://umami.readthedocs.io/en/latest/umami.metric.html),\n", + "which is a calculated value on either model or data. \n", + "\n", + "The set of currently supported calculations can be found in the [`umami.calculations`](https://umami.readthedocs.io/en/latest/umami.calculations.html) submodule.\n", + "\n", + "### What does it do well?\n", + "\n", + "Umami was designed to provide an input-file based interface for calculating single-value landscape metrics for use in model analysis. This supports reproducible analysis and systematic variation in metric construction. When used with `terrainbento`, one input file can describe the model run, and one input file can describe the model assessment or model-data comparison. This streamlines model analysis applications. Umami also provides multiple output formats (YAML and Dakota), the latter of which is designed to interface with Sandia National Laboratory's [Dakota package](https://dakota.sandia.gov)." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/scripts/submit_coveralls.sh b/scripts/submit_coveralls.sh deleted file mode 100644 index 251465c..0000000 --- a/scripts/submit_coveralls.sh +++ /dev/null @@ -1,2 +0,0 @@ -#!/usr/bin/env bash -COVERALLS_REPO_TOKEN=GPzz5lrNDJuA4MQNPWFT6gomXijaPZ7bS coveralls diff --git a/scripts/test_umami_with_coverage.sh b/scripts/test_umami_with_coverage.sh index 2bd081a..907539d 100644 --- a/scripts/test_umami_with_coverage.sh +++ b/scripts/test_umami_with_coverage.sh @@ -1,2 +1,2 @@ #!/usr/bin/env bash -pytest umami --doctest-modules --cov=umami tests/ --cov-report term-missing +pytest umami tests/ --doctest-modules --cov=umami --cov-report term-missing -vvv diff --git a/setup.cfg b/setup.cfg index f669716..2aa28d2 100644 --- a/setup.cfg +++ b/setup.cfg @@ -31,11 +31,15 @@ doctest_optionflags = omit = setup.py versioneer.py umami/_version.py + [coverage:report] exclude_lines = - pragma: no cover - if __name__ == .__main__. - + pragma: no cover + if __name__ == .__main__.
+omit = + setup.py + versioneer.py + umami/_version.py [isort] multi_line_output=3 diff --git a/setup.py b/setup.py index 47a18b2..f7e1ca3 100644 --- a/setup.py +++ b/setup.py @@ -7,6 +7,15 @@ name="umami", python_requires=">=3.6", version=versioneer.get_version(), + classifiers=[ + "Intended Audience :: Science/Research", + "License :: OSI Approved :: MIT License", + "Operating System :: OS Independent", + "Programming Language :: Python :: 3.6", + "Programming Language :: Python :: 3.7", + "Programming Language :: Python :: Implementation :: CPython", + "Topic :: Scientific/Engineering :: Physics", + ], cmdclass=versioneer.get_cmdclass(), description="Umami calculates landscape metrics", url="https://github.com/TerrainBento/umami/", @@ -17,9 +26,5 @@ long_description_content_type="text/markdown", zip_safe=False, packages=find_packages(), - install_requires=[ - "scipy", - "numpy", - "landlab>=1.10", - ], + install_requires=["scipy", "numpy", "landlab>=1.10.1"], ) diff --git a/tests/conftest.py b/tests/conftest.py index d8e813f..2751631 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -1,6 +1,5 @@ import numpy as np import pytest - from landlab import RasterModelGrid diff --git a/tests/test_code_discretized_misfit.py b/tests/test_code_discretized_misfit.py index 079e205..3954d79 100644 --- a/tests/test_code_discretized_misfit.py +++ b/tests/test_code_discretized_misfit.py @@ -2,8 +2,8 @@ import numpy as np import pytest - from landlab import RasterModelGrid + from umami.calculations import discretized_misfit from umami.calculations.residual.discretized_misfit import _get_category_labels diff --git a/tests/test_code_joint_density_misfit.py b/tests/test_code_joint_density_misfit.py index 91a4ef9..c7021bd 100644 --- a/tests/test_code_joint_density_misfit.py +++ b/tests/test_code_joint_density_misfit.py @@ -1,7 +1,7 @@ import numpy as np import pytest - from landlab import RasterModelGrid + from umami.calculations import joint_density_misfit diff --git a/umami/calculations/metric/aggregate.py b/umami/calculations/metric/aggregate.py index a430ef4..dbf1e89 100644 --- a/umami/calculations/metric/aggregate.py +++ b/umami/calculations/metric/aggregate.py @@ -15,7 +15,7 @@ def _aggregate(vals, method, **kwds): def aggregate(grid, field, method, **kwds): - """Calculate an aggreggate value on a landlab grid field. + """Calculate an aggregate value on a Landlab grid field. ``aggregate`` calculates aggregate values on the core nodes of the model grid. It supports all methods in the `numpy`_ namespace that reduce an @@ -27,7 +27,7 @@ def aggregate(grid, field, method, **kwds): ---------- grid : Landlab model grid field : str - An at-node landlab grid field that is present on the model grid. + An at-node Landlab grid field that is present on the model grid. method : str The name of a numpy namespace method. **kwds diff --git a/umami/calculations/metric/chi_intercept_gradient.py b/umami/calculations/metric/chi_intercept_gradient.py index 8600301..b9a3aaf 100644 --- a/umami/calculations/metric/chi_intercept_gradient.py +++ b/umami/calculations/metric/chi_intercept_gradient.py @@ -12,7 +12,7 @@ def _validate_chi_finder(chi_finder): def chi_intercept(chi_finder): r"""Return the intercept to a linear fit through a :math:`\chi`-z plot. - This is a loose wrapper around the function + This is a loose wrapper around the Landlab function `ChiFinder.best_fit_chi_elevation_gradient_and_intercept`_. ..
_ChiFinder.best_fit_chi_elevation_gradient_and_intercept: https://landlab.readthedocs.io/en/release/landlab.components.chi_index.html#landlab.components.chi_index.ChiFinder.best_fit_chi_elevation_gradient_and_intercept @@ -75,7 +75,7 @@ def chi_intercept(chi_finder): def chi_gradient(chi_finder): r"""Return the slope to a linear fit through a :math:`\chi`-z plot. - This is a loose wrapper around the function + This is a loose wrapper around the Landlab function `ChiFinder.best_fit_chi_elevation_gradient_and_intercept`_. .. _ChiFinder.best_fit_chi_elevation_gradient_and_intercept: https://landlab.readthedocs.io/en/release/landlab.components.chi_index.html#landlab.components.chi_index.ChiFinder.best_fit_chi_elevation_gradient_and_intercept diff --git a/umami/calculations/metric/count_equal.py b/umami/calculations/metric/count_equal.py index 19645dc..62e9db8 100644 --- a/umami/calculations/metric/count_equal.py +++ b/umami/calculations/metric/count_equal.py @@ -11,7 +11,7 @@ def count_equal(grid, field, value): ---------- grid : Landlab model grid field : str - An at-node landlab grid field that is present on the model grid. + An at-node Landlab grid field that is present on the model grid. value : float The value to identify within the field. diff --git a/umami/calculations/metric/hypsometric_integral.py b/umami/calculations/metric/hypsometric_integral.py index 4a9c748..b53733e 100644 --- a/umami/calculations/metric/hypsometric_integral.py +++ b/umami/calculations/metric/hypsometric_integral.py @@ -1,5 +1,4 @@ import numpy as np - from landlab.utils import get_watershed_mask diff --git a/umami/calculations/metric/watershed_aggregation.py b/umami/calculations/metric/watershed_aggregation.py index 1999a4e..8604c59 100644 --- a/umami/calculations/metric/watershed_aggregation.py +++ b/umami/calculations/metric/watershed_aggregation.py @@ -1,5 +1,4 @@ import numpy as np - from landlab.utils import get_watershed_mask from .aggregate import _aggregate @@ -18,7 +17,7 @@ def watershed_aggregation(grid, field, outlet_id, method, **kwds): ---------- grid : Landlab model grid field : str - An at-node landlab grid field that is present on the model grid. + An at-node Landlab grid field that is present on the model grid. outlet_id : int Outlet id of the watershed. method : str diff --git a/umami/calculations/residual/discretized_misfit.py b/umami/calculations/residual/discretized_misfit.py index ecf394b..88abb93 100644 --- a/umami/calculations/residual/discretized_misfit.py +++ b/umami/calculations/residual/discretized_misfit.py @@ -13,24 +13,73 @@ def discretized_misfit( field_1_percentile_edges, field_2_percentile_edges, ): - """Calculate a discretized misfit on a landlab grid field. - - density bounds calculated with the data grid. - - # TODO: - ALSO FIX how these get named. - + """Calculate a discretized misfit on a Landlab grid field. + + The following Binder notebook shows example usage of this umami + calculation. + + .. image:: https://mybinder.org/badge_logo.svg + :target: https://mybinder.org/v2/gh/TerrainBento/umami/master?filepath=notebooks%2FDiscretizedMisfit.ipynb + + The ``discretized_misfit`` calculation first classifies each grid cell in + the landscape into categories based on ``field_1``, ``field_2`` and the + percentile edges for each (using the data grid). This results in a set of + categories, which may or may not be contiguous in space. This category + field is then stored as a property of the ``Residual`` called + ``Residual.category``.
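A note for reviewers of this hunk: the classification the new docstring describes (nested percentile binning on two fields, with bin bounds taken from the data grid) is easier to picture in code. Below is a minimal numpy sketch of that logic. It is illustrative only: the real implementation is the `_get_category_labels` helper imported in `tests/test_code_discretized_misfit.py`, whose exact signature and handling of core versus boundary nodes may differ.

```
import numpy as np


def nested_percentile_labels(field_1, field_2, edges_1, edges_2):
    """Sketch of the category labeling described above (not umami's code).

    Nodes are first binned by percentile edges of field_1; within each of
    those bins they are binned again by percentile edges of field_2.
    """
    # Interior bound values of field_1, e.g. [0, 50, 100] -> the median.
    bounds_1 = np.percentile(field_1, edges_1)
    bin_1 = np.digitize(field_1, bounds_1[1:-1])

    n_bins_2 = len(edges_2) - 1
    labels = np.empty(field_1.size, dtype=int)
    for b in range(len(edges_1) - 1):
        sel = bin_1 == b
        # field_2 percentile bounds are computed *within* each field_1 bin.
        bounds_2 = np.percentile(field_2[sel], edges_2)
        labels[sel] = b * n_bins_2 + np.digitize(field_2[sel], bounds_2[1:-1])
    return labels
```

With `[0, 50, 100]` edges for both fields, this yields exactly the four `da_*_z_*` categories tabulated in the docstring text that follows.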
+ + For each category, the sum of squared residuals is calculated based on the + ``misfit_field``. + + Since this calculation returns one value for each category, rather than + one value in total, a ``name`` must be provided. This is a string that + will be formatted with the values for ``{field_1_level}`` and + ``{field_2_level}``. The output is an ordered dictionary with ``name`` as + the keys, and the sum of squares misfit as the values. + + For example, if ``field_1`` was the ``drainage_area`` and ``field_2`` was + ``topographic__elevation``, and the parameter ``name`` was + ``da_{field_1_level}_z_{field_2_level}``, then the following four + categories would be identified: + + +--------------+-------------------------------------------------------+ + | Name | Contents | + +==============+=======================================================+ + | ``da_0_z_0`` | Cells with the lower half of drainage area, and then | + | | within those cells, the lower half of elevation | + +--------------+-------------------------------------------------------+ + | ``da_0_z_1`` | Cells with the lower half of drainage area, and then | + | | within those cells, the higher half of elevation | + +--------------+-------------------------------------------------------+ + | ``da_1_z_0`` | Cells with the higher half of drainage area, and then | + | | within those cells, the lower half of elevation | + +--------------+-------------------------------------------------------+ + | ``da_1_z_1`` | Cells with the higher half of drainage area, and then | + | | within those cells, the higher half of elevation | + +--------------+-------------------------------------------------------+ + + Within each of these four categories, the sum of squared residuals on the + ``misfit_field`` is calculated and returned. Parameters ---------- model_grid : Landlab model grid data_grid : Landlab model grid - name - misfit_field - field_1 - field_2 - field_1_percentile_edges - field_2_percentile_edges + name : str + misfit_field : str + An at-node Landlab grid field that is present on the model grid. + field_1 : str + An at-node Landlab grid field that is present on the model grid. + field_2 : str + An at-node Landlab grid field that is present on the model grid. + field_1_percentile_edges : list + A list of percentile edges applied to ``field_1``. For example, + ``[0, 60, 100]`` specifies splitting ``field_1`` into two parts, + separated at the 60th percentile. + field_2_percentile_edges : list + A list of percentile edges applied to ``field_2``. For example, + ``[0, 60, 100]`` specifies splitting ``field_2`` into two parts, + separated at the 60th percentile. Returns ------- @@ -56,7 +105,6 @@ def discretized_misfit( >>> data_fa.run_one_step() >>> model_fa = FlowAccumulator(model) >>> model_fa.run_one_step() - >>> cat, out = discretized_misfit( ... model, ... data, diff --git a/umami/calculations/residual/joint_density_misfit.py b/umami/calculations/residual/joint_density_misfit.py index a7911a8..3381369 100644 --- a/umami/calculations/residual/joint_density_misfit.py +++ b/umami/calculations/residual/joint_density_misfit.py @@ -9,18 +9,26 @@ field_1_percentile_edges, field_2_percentile_edges, ): - """Calculate the joint-density misfit on a landlab grid field. + """Calculate the joint-density misfit on a Landlab grid field. - density bounds calculated with the data grid. + Density bounds are calculated with the data grid.
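Similarly, for the `joint_density_misfit` docstring rewritten here, a hedged sketch may help make "density bounds are calculated with the data grid" concrete. The helper below is hypothetical and built on `numpy.histogram2d`; umami's actual output shape and normalization may differ.

```
import numpy as np


def joint_density_misfit_sketch(model_f1, model_f2, data_f1, data_f2,
                                f1_percentile_edges, f2_percentile_edges):
    """Hypothetical sketch: squared joint-density differences per bin."""
    # Bin bounds come from percentiles of the *data* fields, so the model
    # is always evaluated against the data's distribution.
    bounds_1 = np.percentile(data_f1, f1_percentile_edges)
    bounds_2 = np.percentile(data_f2, f2_percentile_edges)
    data_density, _, _ = np.histogram2d(
        data_f1, data_f2, bins=[bounds_1, bounds_2], density=True)
    model_density, _, _ = np.histogram2d(
        model_f1, model_f2, bins=[bounds_1, bounds_2], density=True)
    # One squared difference per joint (field_1, field_2) density bin.
    return (data_density - model_density) ** 2
```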
Parameters ---------- model_grid : Landlab model grid data_grid : Landlab model grid - field_1 - field_2 - field_1_percentile_edges - field_2_percentile_edges + field_1 : str + An at-node Landlab grid field that is present on the model grid. + field_2 : str + An at-node Landlab grid field that is present on the model grid. + field_1_percentile_edges : list + A list of percentile edges applied to ``field_1``. For example, + ``[0, 60, 100]`` specifies splitting ``field_1`` into two parts, + separated at the 60th percentile. + field_2_percentile_edges : list + A list of percentile edges applied to ``field_2``. For example, + ``[0, 60, 100]`` specifies splitting ``field_2`` into two parts, + separated at the 60th percentile. Returns ------- diff --git a/umami/calculations/residual/kstest.py b/umami/calculations/residual/kstest.py index 17ae983..7f5ecf7 100644 --- a/umami/calculations/residual/kstest.py +++ b/umami/calculations/residual/kstest.py @@ -1,11 +1,10 @@ import numpy as np -from scipy.stats import ks_2samp - from landlab.utils import get_watershed_mask +from scipy.stats import ks_2samp def kstest(model_grid, data_grid, field): - """Calculate an Kolmogorov-Smirnov test for a landlab grid field. + """Calculate a Kolmogorov-Smirnov test for a Landlab grid field. ``kstest`` calculates the Kolmogorov-Smirnov test for goodness of fit using the function ``ks_2samp`` from ``scipy.stats``. @@ -15,7 +14,7 @@ model_grid : Landlab model grid data_grid : Landlab model grid field : str - An at-node landlab grid field that is present on the model grid. + An at-node Landlab grid field that is present on both grids. Returns ------- @@ -81,7 +80,7 @@ def kstest_watershed(model_grid, data_grid, field, outlet_id): model_grid : Landlab model grid data_grid : Landlab model grid field : str - An at-node landlab grid field that is present on the model grid. + An at-node Landlab grid field that is present on both grids. outlet_id : int Returns diff --git a/umami/metric.py b/umami/metric.py index b140f4d..2798ac6 100755 --- a/umami/metric.py +++ b/umami/metric.py @@ -4,9 +4,9 @@ import numpy as np import yaml +from landlab import RasterModelGrid, create_grid import umami.calculations.metric as calcs -from landlab import RasterModelGrid, create_grid from umami.utils.create_landlab_components import _create_landlab_components from umami.utils.io import _read_input, _write_output from umami.utils.validate import _validate_fields, _validate_func @@ -40,18 +40,6 @@ def __init__( A dictionary of desired metrics to calculate. See examples for required format. - Attributes - ---------- - names - values - - Functions - --------- - add_from_dict - add_from_file - calculate - write_metrics_to_file - Examples -------- >>> from io import StringIO diff --git a/umami/residual.py b/umami/residual.py index 2902753..2880ad2 100644 --- a/umami/residual.py +++ b/umami/residual.py @@ -4,11 +4,11 @@ import numpy as np import yaml +from landlab import RasterModelGrid, create_grid from numpy.testing import assert_array_equal import umami.calculations.metric as metric_calcs import umami.calculations.residual as residual_calcs -from landlab import RasterModelGrid, create_grid from umami.metric import Metric from umami.utils.create_landlab_components import _create_landlab_components from umami.utils.io import _read_input, _write_output @@ -47,18 +47,6 @@ def __init__( A dictionary of desired residuals to calculate. See examples for required format.
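Since the hunk below strips the `Attributes`/`Functions` listing from the `Residual` docstring, a compact usage sketch is worth keeping in mind while reviewing. The grids and field values here are invented; the constructor ordering `Residual(model, data)` and the `add_from_dict`, `calculate`, `names`, and `value` methods match the API exercised in the notebooks earlier in this diff.

```
import numpy as np
from landlab import RasterModelGrid
from umami import Residual

np.random.seed(42)

model = RasterModelGrid((10, 10))
data = RasterModelGrid((10, 10))

# Invented elevation fields: a plane, and the same plane plus noise.
z_model = model.add_zeros("node", "topographic__elevation")
z_model += model.x_of_node + model.y_of_node
z_data = data.add_zeros("node", "topographic__elevation")
z_data += data.x_of_node + data.y_of_node
z_data += np.random.rand(data.number_of_nodes)

# One residual calculation, specified as a dict rather than a YAML file.
residual = Residual(model, data)
residual.add_from_dict(
    {"z_me": {"_func": "aggregate",
              "method": "mean",
              "field": "topographic__elevation"}})
residual.calculate()
for name in residual.names:
    print(name, residual.value(name))
```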
- Attributes - ---------- - names - values - - Functions - --------- - add_from_dict - add_from_file - calculate - write_residuals_to_file - Examples -------- >>> import numpy as np @@ -141,7 +129,9 @@ def __init__( self._validate_residuals(self._residuals) self._distinguish_metric_from_resid() - # calculate + self._category = None + + # set up metric and residual objects self._data_metric = Metric( data, flow_accumulator_kwds=flow_accumulator_kwds, @@ -176,7 +166,14 @@ def names(self): @property def category(self): - """ """ + """Category labels from the ``discretized_misfit`` calculation. + + Returns + ------- + category : ndarray of size (number of nodes,) + The category labels used by ``discretized_misfit``, or ``None`` if + that calculation is not used. + """ return self._category def value(self, name):
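Finally, the new `category` property documented in the hunk above is easiest to understand in use. A hedged sketch follows, reusing the hypothetical `residual` and `model` grid from the earlier sketch and assuming a `discretized_misfit` calculation was added to the residual; `imshow_grid` is landlab's standard grid plotting helper.

```
import matplotlib.pyplot as plt
from landlab.plot import imshow_grid

# Assumes `residual` was configured with a discretized_misfit calculation.
residual.calculate()
labels = residual.category  # per-node category labels, or None
if labels is not None:
    # Visualize which nodes fell into which discretized_misfit category.
    imshow_grid(model, labels)
    plt.show()
```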