MSE-NCCOS-NOAA/ML-for-Characterizing-Seafloor-Habitats

ML-for-Characterizing-Seafloor-Habitats

Marine managers routinely use spatial data and maps to make decisions about the resources in their jurisdiction. These datasets and maps are critical for establishing baselines and for detecting changes over time in the health, abundance, and distribution of marine resources, including benthic habitats. In the past, benthic habitat maps were developed by visually delineating and classifying features in aerial or satellite images, an approach that was time consuming and subjective. Over the last decade, advances in spatial modeling techniques, including machine learning and deep learning approaches, have made it easier to standardize the process used to characterize benthic habitats. These approaches also make it easier to quantify the uncertainty and precision associated with the characterization process. Both advances are critical for managers in a changing climate, allowing them to better track habitat changes over time and to better understand the error bars around those changes at broad spatial scales (10s to 1000s of kilometers).

This code base uses boosted regression trees (BRTs) to develop benthic habitat predictions and boosted classification trees (BCTs) to develop benthic habitat maps. We used this machine learning technique because it is flexible, robust, and compares favorably to other machine learning techniques (Elith et al., 2006; Elith et al., 2008; De'ath and Fabricius, 2000; De'ath, 2007). BRTs model complex relationships between organisms and the environment by developing many (hundreds to thousands) simple regression (tree) models. Regression trees (Breiman et al., 1984) relate a response to environmental predictors by repeatedly splitting the data into two increasingly homogeneous groups. These models are built in a stage-wise fashion, where existing trees are left unchanged and the variation left unexplained by them (the residuals) is used to fit the next tree. This stage-wise process is called boosting. A random subset of the data is used to fit the model at each stage. This randomization helps improve model performance (Friedman, 2002; Elith et al., 2008). These simple models are then combined linearly to produce one final combined model (Elith et al., 2008). The fitted values in this combined model are more stable than values from an individual model, improving its overall predictive performance (Friedman, 2002; Elith et al., 2006; Elith et al., 2008).
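The stage-wise fitting described above can be illustrated with a minimal sketch in plain Python. This is not the code in this repository; it is a toy example under stated assumptions: a single predictor, one-split trees (stumps), squared-error loss (the regression case only, not BCTs), and synthetic data. All names and parameter values (`fit_stump`, `fit_brt`, `bag_fraction`, `learning_rate`, the depth-like predictor) are hypothetical and chosen only for illustration.

```python
import random

def fit_stump(xs, residuals):
    """Fit a one-split regression tree: pick the threshold on x that
    minimizes the squared error of the residuals in the two groups."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = xs[order[k - 1]], xs[order[k]]
        if lo == hi:
            continue  # no valid split between identical x values
        left = [residuals[i] for i in order[:k]]
        right = [residuals[i] for i in order[k:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, (lo + hi) / 2, left_mean, right_mean)
    if best is None:  # every x identical: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_brt(xs, ys, n_trees=200, learning_rate=0.1, bag_fraction=0.5, seed=0):
    """Boosting: each new stump is fit to the current residuals on a
    random subset of the data; earlier stumps are never changed."""
    rng = random.Random(seed)
    intercept = sum(ys) / len(ys)
    preds = [intercept] * len(ys)
    n_bag = max(2, int(bag_fraction * len(xs)))
    stumps = []
    for _ in range(n_trees):
        idx = rng.sample(range(len(xs)), n_bag)  # random subset for this stage
        stump = fit_stump([xs[i] for i in idx],
                          [ys[i] - preds[i] for i in idx])
        stumps.append(stump)
        # Add the shrunken contribution of the new stump to the fitted values.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return intercept, learning_rate, stumps

def predict_brt(model, x):
    """The final model is a linear combination of all the stumps."""
    intercept, learning_rate, stumps = model
    return intercept + learning_rate * sum(predict_stump(s, x) for s in stumps)

# Synthetic example (assumption): a response that steps up where a
# depth-like predictor exceeds 5, plus Gaussian noise.
noise = random.Random(1)
xs = [i / 10 for i in range(100)]
ys = [(1.0 if x > 5 else 0.0) + noise.gauss(0, 0.1) for x in xs]

model = fit_brt(xs, ys)
mean_y = sum(ys) / len(ys)
mse_mean = sum((y - mean_y) ** 2 for y in ys) / len(ys)
mse_brt = sum((y - predict_brt(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
print("MSE of mean-only model:", mse_mean)
print("MSE of boosted model:  ", mse_brt)
```

In this sketch the learning rate shrinks each stump's contribution, so many small steps are needed to reach a good fit. Real analyses would typically rely on an established boosting library, use multiple predictors and deeper trees, and evaluate performance on held-out data rather than on the training set.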

Legal Disclaimer

This repository is a software product and is not official communication of the National Oceanic and Atmospheric Administration (NOAA), or the United States Department of Commerce (DOC). All NOAA GitHub project code is provided on an 'as is' basis and the user assumes responsibility for its use. Any claims against the DOC or DOC bureaus stemming from the use of this GitHub project will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation, or favoring by the DOC. The DOC seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by the DOC or the United States Government.

License

Software code created by U.S. Government employees is not subject to copyright in the United States (17 U.S.C. §105). The United States/Department of Commerce reserve all rights to seek and obtain copyright protection in countries other than the United States for Software authored in its entirety by the Department of Commerce. To this end, the Department of Commerce hereby grants to Recipient a royalty-free, nonexclusive license to use, copy, and create derivative works of the Software outside of the United States.

Section 508

The National Ocean Service is committed to making its website accessible to the widest possible audience, including people with disabilities, in accordance with Section 508 of the Rehabilitation Act (29 U.S.C. 794d). Section 508 is a federal law that requires agencies to provide individuals with disabilities equal access to electronic information and data comparable to those who do not have disabilities, unless an undue burden would be imposed on the agency. The Section 508 standards are the technical requirements and criteria that are used to measure conformance within this law. More information on Section 508 and the technical standards can be found at Section508.gov.
