
The publication is a collection of sample code to show how data from SAP and non-SAP systems can be made available for training in ANY hyperscaler machine learning service via several layers of abstraction from data connection to training using our FedML Python libraries.

SAP-samples/datasphere-fedml

FedML

Description


The SAP Federated ML Python libraries (FedML) apply the data federation architecture of SAP Datasphere to source SAP and non-SAP data for machine learning experiments run on ML platforms, removing the need to replicate or move data. The libraries abstract data connection and data load (for all ML platforms), as well as model training (with provision for user-provided training scripts), model deployment, and inferencing (for hyperscaler machine learning platforms), so that end-to-end integration takes only a few lines of code.
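The federation pattern described above can be illustrated without any FedML-specific API. In the minimal sketch below, an in-memory SQLite database stands in for a view exposed by SAP Datasphere, and the names `sales_view`, `load_view`, and `fit_line` are hypothetical and not part of FedML. It only shows the general idea: query the exposed view where it lives, pull just the needed columns into the ML environment, and train there.

```python
import sqlite3


def make_stand_in_source():
    """Build an in-memory database that stands in for an exposed view (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_view (ad_spend REAL, revenue REAL)")
    conn.executemany(
        "INSERT INTO sales_view VALUES (?, ?)",
        [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    )
    conn.commit()
    return conn


def load_view(conn, view, columns):
    """Fetch only the requested columns from the view (hypothetical helper)."""
    # Identifiers cannot be bound as SQL parameters, so validate them first.
    for name in [view, *columns]:
        if not name.isidentifier():
            raise ValueError(f"Unsafe identifier: {name!r}")
    query = f"SELECT {', '.join(columns)} FROM {view}"
    return conn.execute(query).fetchall()


def fit_line(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


conn = make_stand_in_source()
rows = load_view(conn, "sales_view", ["ad_spend", "revenue"])
slope, intercept = fit_line(rows)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

In a real setup the stand-in connection would be replaced by the connection object that the chosen FedML library provides; the training step stays in the ML platform either way.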

What's New

1. The new version of FedML (available as fedml-dsp on PyPI, V1.0.0):

  • Is independent of the machine learning platform and can be used on any of them.
  • Supports NVIDIA RAPIDS™, CUDA, cuDF, and CuPy, and can therefore be used to train models in GPU environments.
  • Supports sourcing data from SAP Datasphere models directly into PySpark and CuPy (for GPU) dataframes.
  • Supports SAP AI Core deployment: models that are trained on any ML platform (and containerized independently) can now be deployed in SAP GenAI Hub's AI Core with a couple of lines of code.
  • Supports writing inference results back to SAP Datasphere.

Solution Architecture

[Image: solution architecture diagram]

2. FedML (original, V2.0) for hyperscaler platforms (AWS, GCP, Azure, and Databricks):

  • Is pip-installable from PyPI for each of the respective hyperscaler platforms.
  • Supports model training and deployment in the hyperscaler environment.
  • Supports deployment to the SAP Business Technology Platform Kyma environment.
  • Supports inferencing with models deployed on a hyperscaler as well as on Kyma.
  • Supports writing inference results back to SAP Datasphere.

Solution Architecture - FedML Hyperscaler libraries

[Image: solution architecture diagram for the FedML hyperscaler libraries]

Requirements

  • An SAP Datasphere tenant with connectivity established to the remote data sources and with views exposed for consumption by FedML.

  • Access to the corresponding machine learning platforms, configured appropriately. See the Configuration section.

Download and Installation

Try out the examples in the samples-notebooks directory of the corresponding library folder.
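Before opening a sample notebook, it can help to confirm which FedML packages are present in the active Python environment. The sketch below uses only the standard library. The name `fedml-dsp` comes from this README; the other distribution names are assumptions based on the platform names listed here and should be checked against PyPI.

```python
from importlib import metadata

# "fedml-dsp" is named in this README; the remaining names are assumptions.
CANDIDATE_PACKAGES = [
    "fedml-dsp",
    "fedml-aws",
    "fedml-gcp",
    "fedml-azure",
    "fedml-databricks",
]


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(packages=CANDIDATE_PACKAGES):
    """Map each candidate distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


for name, version in report().items():
    print(f"{name}: {version or 'not installed'}")
```

Any package reported as not installed can then be installed with pip in the environment where the notebook will run.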

Configuration

  • For prerequisites, configuration, and documentation specific to the platform-independent FedML library, refer here
  • For prerequisites, configuration, and documentation specific to the AWS FedML library, refer here
  • For prerequisites, configuration, and documentation specific to the GCP FedML library, refer here
  • For prerequisites, configuration, and documentation specific to the Azure FedML library, refer here
  • For prerequisites, configuration, and documentation specific to the Databricks FedML library, refer here
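Whichever library is used, the connection details for the SAP Datasphere tenant have to be supplied somewhere, and checking them early avoids confusing failures later in a notebook. The following is a minimal sketch, assuming the details live in a JSON file with the keys `address`, `port`, `user`, `password`, and `schema`. These key names and the helper `load_connection_config` are illustrative assumptions; the library-specific documentation defines the actual format.

```python
import json
import os
import tempfile

# Assumed key names for illustration; check the library documentation for the real ones.
REQUIRED_KEYS = ("address", "port", "user", "password", "schema")


def load_connection_config(path):
    """Load a JSON connection config and fail early if a required entry is missing or empty."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"Missing config entries: {', '.join(missing)}")
    return config


# Demo with placeholder values written to a temporary file.
sample = {
    "address": "example-tenant.invalid",
    "port": "443",
    "user": "DEMO_USER",
    "password": "change-me",
    "schema": "DEMO_SPACE",
}
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "config.json")
    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump(sample, handle)
    loaded = load_connection_config(config_path)

print(sorted(loaded))
```

Keeping credentials in a file outside version control, rather than inline in a notebook, also makes it easier to share the sample notebooks safely.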

Limitations

None

How to obtain support

This project is provided "as-is" with no expectation for major changes or support.
Create an issue in this repository if you find a bug or have questions about the content.
For additional support, ask a question in SAP Community.

Licensing

Copyright (c) 2021 SAP SE or an SAP affiliate company. All rights reserved. This project is licensed under the Apache Software License, version 2.0 except as noted otherwise in the LICENSE file.
