# Comparison Matrix in Markdown

We suggest clicking through to the master spreadsheet in Google Sheets. But if that doesn't work for you, here is a markdown alternative. You may need to use the scrollbar at the bottom in order to scroll right.

| | Dataiku DSS | H2O | Databricks | AWS SageMaker | Azure Machine Learning | Google AI Platform (Vertex) | Kubeflow | MLflow | KNIME | DataRobot | Pachyderm | Seldon |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Product Focus | Data exploration, model building and Excel-style interaction with big data. Visual or code-based pipelines for building and deploying models. | Platform with a core compute engine like Spark. Lots of integrations and add-ons, including AutoML. | Spark and mlflow as open source offerings. SaaS platform with added features and DeltaLake as enterprise. 'Customer-managed' SaaS option on AWS, Azure or Google. | MLOps platform on AWS. | MLOps platform on Azure. | Vertex AI Platform on Google. | Open source MLOps for kubernetes. | Open source, flexible approach to MLOps. | Open core, low-code visual Data Science and Data Analytics platform with MLOps capabilities. | End-to-end Data Science and MLOps platform. Model building for Data Scientists and Citizen Data Scientists (AutoML). MLOps for deploying and monitoring models. | Data Foundation for Machine Learning. Pipelines with automated data versioning, lineage and monitoring. | With an open source core and other popular open source libraries, Seldon focuses on model management, deployment, monitoring and explainability. |
| Recent Activity | Launched managed Dataiku Online in June 2021. AWS marketplace launch also in June 2021. | Hybrid Cloud launched Jan 2021, but most products falling under it launched before that. | Combined platform launched first on AWS, then Azure, and on Google Feb 2021. Feature store May 2021. Model serving June 2020. | Feature store, wrangler, debugger, pipelines and clarify marked 'New' and were announced Dec 2020. | MLflow integration Oct 2020. Managed endpoints May 2021 and in 'Preview'. | Many features new in the May 2021 Vertex launch and in preview, inc feature store, explanations, skew and drift, and metadata tracking. Status gets marked in UI. | Multi-user added June 2020. Integrated UI for serving and Katib April 2021. | Autologging released July 2019. Model registry end of 2019. Model schemas/signatures June 2020. | Plans for integration with h2o Driverless AI were announced Oct 2020. The Integrated Deployment way of deploying workflows was announced April 2020. | Acquisition of Algorithmia in July 2021. MLOps support for deploying to any infra - Sept 2021. Snowflake integration July 2021. | Launched the new Pachyderm Hub 2.0 in September 2021. | Launched 1.0 of enterprise offering (Deploy) Feb 2021. Developing open source MLOps python SDK (Tempo). Soon to graduate open source python-based inference server (MLServer). |
| Pricing - OSS, per-model etc. | Enterprise. Not stated. | Enterprise. Not stated. Open source core platform. | Open Core. Platform can be pay-per-use (DBUs). | Metered. | Metered. | Metered. | Open source. There are also tailored distros, and fully managed in Google Vertex and from Arrikto. | Open source. Also available integrated into the Databricks managed service. | KNIME Analytics Platform is free and open source. KNIME Server requires an annual licence. | Enterprise. Three-year licence with metered pricing plans. | Enterprise Edition is self-deploy and Hub is a managed service. Contact Pachyderm for pricing of these. Community Edition is open source and free. | Open source core + libraries. Enterprise platform priced per model. |
| Data Preparation - Exploration, cleaning, feature engineering | Lots of data sources, application of schemas, highlighting and fixing of outliers and missing values, interactive stats, built-in filtering, transformations and sampling. Assisted labelling plugin. | h2o-3 supports a range of input formats and manipulations. Driverless AI supports even more inputs, has visualization and also automatic and tailored feature engineering. | Notebooks. Spark. Cloud provider integrations. Feature store. Data Catalog. | Ground Truth for labelling, Wrangler for feature engineering, feature store, Studio IDE, notebooks. | Data labelling service. Ingestion pipelines. Azure Synapse for prep/wrangling. | Vertex feature store, data labelling service, managed/unmanaged datasets, BigQuery, notebooks. | Jupyter notebooks. | Projects and entry points provide a way to package data operations. | Inspection of tabular data and generation of plots within the Analytics Platform workbench. Various data manipulation nodes available. Many data sources such as JDBC databases, Databricks and Salesforce. Spark SQL support. | Visual wrangling operations. Automated transformation flows. Notebook integration for exploration. | Main focus is pipelines and lineage rather than exploration. Data transformation within pipelines is recorded and monitored. Could be used with any exploration or wrangling tool. Integration examples available for data labeling and experimentation. | Works well in tandem with data preparation/versioning tools. |
| Model Training - Managed training. Or AutoML. | Flows/Pipelines for training/retraining. Lab for assisted building of models/AutoML. Models can also be custom or Spark. | Training features in Driverless AI are AutoML. Training features in h2o-3 are for pre-defined algorithms. | MLflow, Spark MLlib and AutoML. Distributed training, including using GPUs. | AutoML. Pipelines brings together different training options. Lots of support for different languages and distributed vs not distributed. Spot instance option interesting. Pipelines for SageMaker have less prominence in docs than Pipelines for Vertex - less emphasis on orchestration. | Container-based training in an Environment, or can use a training Pipeline for distributed. Visual editor for assisted pipeline creation, or there's also full AutoML. | Integrated metadata for pipelines, but only in preview. Distributed training for custom training jobs where the framework supports it. GPU-based training for custom training. AutoML fits alongside. | Pipelines as orchestrator and training platform. Training operators for specific frameworks. Katib for hyperparameter tuning (AutoML). | MLflow does tracking during the training process. Provides structure with projects, entry points and multi-step workflows. Can execute locally or on a hosted backend, e.g. kubernetes. Provides tools that put structure around training, rather than a hosted platform like many others do. | Workflows can be used to manipulate data and train models. These can be scheduled to run on KNIME Server to leverage elastic infra. | AutoML for building models with feature detection/exploration and customizable model search. Monitoring for model building jobs. Leaderboard for choosing a model from the AutoML search. Training dashboard for insights into the model training process. | Data Versioning and Lineage for Pipelines, which automates reproducibility for data transformation and model training. Console adds visibility/monitoring. Approach is agnostic to choices of code libraries. Especially well suited to large volumes of unstructured data as well as structured data. | Agnostic about model training and frameworks used. Integrations with MLflow, Kubeflow, Pachyderm etc. |
| Model Deployment - Batch, real-time, streaming | Real-time deploy to API Node. Batch Prediction via Flows. Support for Spark and Spark Streaming. | Predictions can be made in h2o-3 and it can integrate with Spark. Models can be deployed within Driverless. But the centerpiece deployment tool in the suite is MLOps, for its features and integrations. | MLflow serving for real-time. Spark for batch or streaming. | Deployment with pre-built or custom containers. Inference pipelines supported. Unified batch and real-time. | Deployment with pre-built or custom containers as managed endpoints. Real-time or batch. | Deployment with pre-built or custom containers. AutoML similar but easier. Batch support. | KFServing as core component, with Seldon and other integrations. KFServing has batching integration and a streaming example. Preprocess and postprocess inference steps available. | MLflow Models provides structure for packaging and deploying models. Local basic serving is out of the box. Built-in and extensible integrations for specialist hosting, with plugins available. Can do batch straight from projects without a serving step. Batch or streaming also possible with spark integration. | Captured subsets of workflows, which can contain models, can be deployed to KNIME Server. Can be for REST calls or as scheduled jobs. Some nodes support streaming. Workflows containing javascript-enabled interactive nodes can also be deployed with KNIME Server's Web Portal. | Real-time and batch prediction options, including high-volume batch. Post-processing of prediction data. | Can be used with deployment tools to track data at the prediction stage as well as training. Integration examples are available. Can also serve a model API directly using the Service concept. | Deployment as optimised REST/gRPC microservices. Can be deployed using pre-packaged inference servers or custom python components. Support for real-time inference, batch predictions and stream processing. |
| Scaling Predictions - Parallelisation, low-latency/gRPC, A/B tests, multi-model | Model versioning, rollbacks and traffic splitting. HA and load balancing. Export for prediction in other runtimes or offline, inc onnx export. | In MLOps there's champion/challenger and A/B traffic splitting. Different run-time options for latency optimization. Export to other runtimes inc Java and C++. | Versions and stages for models with MLflow. Relatively new feature and in preview. For traffic splitting and inference pipelines there are integrations. | Autoscaling. Multi-model. GPU inference and elastic inference for low-latency. Async for large payloads. Traffic splitting. | GPUs and traffic splitting available. | Traffic splits available. Private endpoints for low-latency. | KFServing has autoscaling through knative, traffic splitting, multi-model serving and GPU integration. | Depends on the chosen deployment option. The functionality of `mlflow serve` is basic, but mlflow is designed for integrations with more features such as sagemaker. | Challenger rollout modes like A/B testing using Model Process Factory templates. Validation can be performed on a model during deployment to KNIME Server. | Challenger rollouts and monitoring to compare the challenger in detail. Can take stored predictions that went to the main model and replay them against the challenger. | Can be used with deployment tools to track data at the prediction stage as well as training. Integration examples are available. Can also serve a model API directly using the Service concept. | Autoscaling, multi-model inference graphs, A/B tests and progressive rollouts. |
| Model Performance and Basic Monitoring - Accuracy, metrics etc. | Real-time scoring. Metrics. | Latency, alerts, custom metrics. | Logs and resource monitoring built-in. | Invocation metrics to CloudWatch. Payload capture/logging. Custom monitoring schedules to check constraints. | Monitoring for latency and hardware resources built-in. | Built-in monitoring of latency etc. | Resource and high-level request metrics integration. Can be added to UI. | Depends on the chosen deployment option, as it is the deployment which is monitored. There are integrations, but the deployment itself is outside of mlflow open source. | Model monitoring component for classification, or can be done more generally using Model Process Factory. For scheduled jobs there's notifications. Status of running workflows can be inspected in the admin UI. | Accuracy and availability monitoring for individual models and as a summary. Specialist monitoring for time series. | Can be used with deployment/monitoring tools to track data at the prediction stage as well as training. Integration examples are available. | Collects metrics by default on your deployed models. Custom metrics can also be defined, and there is support for distributed tracing and payload logging. |
| Data and Insights Monitoring - drift, outliers, explainability | Drift plug-in. | Monitoring at data level as well as drift, anomalies and alerts. For BYOM it relies on a schema to handle the data, but not for native models. | Drift and analysis on predictions is possible and spark streaming can help, but it does require manual setup. | Payload capture/logging. Ground-truth comparison possible for model quality. Skew. Feature attribution (explanations). | Payload logging to blob for analysis. Drift detection. | Some explanations and skew and drift in pre-GA. | Drift, outlier detection and explanations through Seldon Alibi. Also flexible request logging. | Depends on the chosen deployment option, as it is the deployment which is monitored. There are integrations, so it's then not mlflow itself that does the monitoring. | Drift and retraining for classifiers with model monitoring component. Also ways to build monitoring yourself. Extension for explanations. | Prediction row storage for native models. Drift monitoring for individual models and as a summary. Ability to drill into prediction data. Bias and fairness monitoring. | Can be used with deployment/monitoring tools to track data at the prediction stage as well as training. Integration examples are available. | Drift detection, outlier detection, adversarial detection as well as instance and model explainability. Enterprise offering comes with feature distribution visualisations. |
| Model Governance - model registry/catalogue | Models built in a flow saved as versions. Active and inactive versions and a concept of rollbacks. | AppStore is like a catalog of apps built from models and has permissions. Integration at project level between model building and the Hybrid Cloud AppStore is like a registry - called the Shared Model Repository. | mlflow model registry with permissions and deployment history. | Model registry with versions, groups and associations to training metadata. Can enable cross-account deployment. | Per-workspace registry and shareable across workspaces. | Not an explicit concept of registry in docs, but there is a models page in a Vertex project, which is like a registry. | No native model registry. Can log model details (artifacts) with metadata. Listing of running models available. | Model registry which provides model lineage, versioning, stage transitions and annotations. | Versioning and comparison of workflows. Workflows are the key unit rather than models in KNIME, though models can be exported to PMML if you want to deal with them outside KNIME. | Deployment actions restricted by role. Deployment approvals. Deployment notifications and reports. Model registry with generation of compliance docs. | Trained models and their artifacts can be versioned in Pachyderm and hosted with Pachyderm's S3 gateway to be integrated with deployment systems. | Seldon supports a variety of model stores with integration into cloud storage providers. The enterprise offering comes with a model catalog together with supporting metadata. |
| Data Governance - lineage, audit, permissions | Data catalog and GDPR plug-in. | Driverless AI lets you define who can see what data. | Tracking, governance, ACLs. Can use mlflow with Delta Lake integration for deep lineage. | Lineage tracking. Various encryption options. | Data lineage tracking available. Encryption. | Lineage through metadata on Vertex pipelines (pre-GA Oct 2021). Permissions on feature store. | Lineage can be recorded using the metadata SDK. Pachyderm can be used with kubeflow. Enterprise data lineage with Arrikto (not part of kubeflow open source). | Reproducibility by tracking experiments and having the environment of a project. Artifacts can be logged as part of tracking. | Transformations are node operations and a workflow history is kept. It doesn't natively track data lineage, but there is a component to hook into histories. | Lineage for data prep columns and features. Audit trail of user actions. | Automatic data versioning and lineage through a Git-like system to show provenance. | Designed to play well with a wide variety of data governance tools, with examples for tools such as pachyderm and dvc. |
| Continuous Delivery - Auto-retraining, promotions, environments | CI/CD pipelines. | Automatic retraining for Driverless AI models. | Dev-test-prod envs, CI/CD templates, stages. | Payload capture/logging. Templates for CI/CD. | Pipeline, CI and git integration. Payload logging to blob. | Prediction logging and continuous evaluation possible, but not yet well integrated under Vertex. Pipeline is a bit like CI. | Kubeflow Pipelines for workflow orchestration - needs to be invoked from CI. Also flexible request logging via kfserving or seldon. | mlflow projects can be git repos, so can hook CI into git to call an entry point. Promotions can be recorded in the registry. | Triggered retraining for classifiers with model monitoring component. KNIME Server has a REST API so deployments can be invoked from CI. | Automated retraining. Write-back integrations to log predictions and responses to Snowflake, Tableau and BigQuery as well as native storage. | CI integration, which can be set up to retrain. An end-to-end CD example is provided. Spout for streaming data. | Enterprise offering includes gitops capability, allowing model deployments to be automated, versioned and reproduced. Also examples for upgrades and rollbacks, intelligent rollouts with iter8, and alerts based on monitoring. |
| Collaboration Features - "workspaces", permissions | Projects with permissions. Also history and rollbacks. | Configurable user isolation across integrations. Projects and sharing in MLOps. Maps to other hybrid cloud products. | Permissions for notebooks, models and Spark. | Has a projects idea - has to be configured, as it is not used by default. Has templates for using with git and CI/CD. | Workspaces with permissions and security. | Uses the concept of projects and assigning permissions. | Namespaces and sharing permissions between namespaces. | Mlflow projects can be git repos, so there's separation and collaboration there. At server level there's not a concept of permissions in the open source. | Workflows can be parts of workflow groups. Permissions on workflows and groups, with owners and delegation. Deployment permissions. | Platform-wide roles and RBAC. Projects for building models. Access controls with identity provider integration. | Console to facilitate collaboration on DAGs/pipelines and for debugging. Notebooks for collaboration on pipeline creation. Contexts for switching between clusters. | Seldon Core uses Kubernetes namespaces for separation of workspaces. Seldon Deploy also comes with project-based authorisation. |
| Self-install Support - Bare Metal/VMs/containers | Linux or AWS. | k8s for MLOps. The h2o-3 part of the stack can be run locally. | Typical installation is on your cloud account via integration with cloud providers. Allows for levels of control over cloud infra. Open source parts of the platform (e.g. spark, mlflow) can be installed by the user, but that's a subset. | Can train models in k8s with kubeflow. Could do this with other tools too, but AWS are trying to support it. Only training though. | | | Self-install on kubernetes is the default route, with some per-provider distribution customization. | Can run on a local machine, or install the server to run hosted. | Server can run on cloud or on-prem. Analytics Platform is a desktop app. | Can be installed on-prem as a supported option. | Pachyderm Enterprise and Community editions both provide a self-deployed option for local or kubernetes. | Self-install on Kubernetes, with installers provided for enterprise offerings. |
| Cloud/SaaS Support - Self-managed, SaaS etc. | Dataiku Online is a new offering, only for trials. | | Option for 'customer-managed' SaaS as well as SaaS options on AWS, Azure or Google. | SaaS. | SaaS. | SaaS. | Google Vertex has managed kubeflow pipelines, but that is not the same as a managed kubeflow. There are distributions which hook into some managed platform features. | Managed version from Databricks. | Marketplace versions of Server available for Azure and AWS. | Installation on preferred cloud platform, or as a managed service. | Pachyderm Hub provides a managed Cloud version. | Self-managed on own infrastructure or any cloud provider. |
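The "Scaling Predictions" row above mentions traffic splitting and A/B tests in several platforms (SageMaker, Vertex, KFServing, Seldon). As a rough, vendor-neutral sketch of the idea in plain Python (the names `make_router`, `model-a` and `model-b` are made up for illustration, not any platform's API):

```python
import random

def make_router(weights, seed=None):
    """Pick a model version per request according to traffic weights,
    e.g. a 90/10 A/B split between two deployed versions.
    (Illustrative sketch only, not a real serving-platform API.)"""
    rng = random.Random(seed)
    versions = list(weights)
    w = [weights[v] for v in versions]
    return lambda: rng.choices(versions, weights=w, k=1)[0]

# Route 10,000 simulated requests with a 90/10 split.
route = make_router({"model-a": 90, "model-b": 10}, seed=42)
counts = {"model-a": 0, "model-b": 0}
for _ in range(10_000):
    counts[route()] += 1
# counts["model-a"] ends up near 9,000, i.e. roughly 90% of traffic
```

Real platforms implement this at the load-balancer or service-mesh layer (e.g. Knative under KFServing) rather than in application code, but the weighted-random routing principle is the same.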
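Similarly, the "Data and Insights Monitoring" row refers to drift detection in many of the products. A minimal sketch of one common approach, comparing live feature values against the training distribution (pure Python with made-up data; real tools such as Seldon Alibi use more robust statistical tests):

```python
import statistics

def drift_score(train_values, live_values):
    """Standardised difference between training and live feature means.
    A large absolute score suggests the live data has drifted.
    (Toy example; production drift detectors use tests like KS or PSI.)"""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    live_mu = statistics.mean(live_values)
    return abs(live_mu - mu) / sigma

train = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8]
assert drift_score(train, [10.1, 9.9, 10.3]) < 1.0   # similar distribution: no alert
assert drift_score(train, [14.0, 15.0, 14.5]) > 3.0  # shifted distribution: flag drift
```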