design(hld): high level proposal for moveEngine CR #22

Open
wants to merge 1 commit into
base: master
152 changes: 152 additions & 0 deletions keps/0002-kep-moveengine-hld.md
@@ -0,0 +1,152 @@
---
kep-number: 2
title: MoveEngine - High Level Design
authors:
- "@vitta"
owners:
- Uma Mukkara
- Mayank
editor:
creation-date: 2019-05-16
last-updated: 2019-05-19
status: provisional
see-also:
- KEP-None
replaces:
- KEP-None
superseded-by:
- KEP-None
---

# MoveEngine - High Level Design



## Table of Contents

* [Summary](#summary)
* [Motivation](#motivation)
* [Goals](#goals)
* [Non-Goals](#non-goals)
* [Proposal](#proposal)
* [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
* [Risks and Mitigations](#risks-and-mitigations)
* [Design acceptance criteria](#design-acceptance-criteria)
* [Implementation History](#implementation-history)

## Summary

As described in the KubeMove HLD KEP, the KubeMoveEngine CRD defines the placeholders for the workflow of a given application. The user creates the CR from a spec that contains the following details:
- Application to which this CR applies
- Target location, including the namespace. The user can specify multiple target locations, in which case multiple KMovePair CRs are created.
- The Dynamic Data Mobilizer (DDM) details
- Schedule for transferring data at regular intervals
- Telemetry management

The `MoveEngine` controller that watches this CR deploys the KubeMove framework (`MoveEngine` and `DataSync` controllers) at the remote cluster to

By "deploy", do you mean that MVEC will install the kubemove operator on the remote cluster, or will it just create CRs, assuming the operator is pre-deployed?

Contributor Author

KEP-2 is specific to the MoveEngine CR, which is one of the CRs of KubeMove.
The summary content up to the spec was copied from KEP-1 and needs correction. I will fix it and update this PR.

set up the application and transfer the data to the remote cluster. This controller creates the DataSync CR periodically and invokes the `DataSync`

This controller creates the datasync CR periodically

in the source or destination cluster?

controller to initiate the data transfer through DDM. Once DDM completes the data transfer, the `MoveEngine` controller notifies the remote `DataSync` controller by creating a DataSync CR at the remote cluster to verify the data transfer and to restore the data.

by creating dataSync CR at remote cluster - bit confused here. Is it the same DS cr as before?

Once DDM completes the data transfer

Does it mean one "run" of the periodic sync?
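
The periodic sync described in the summary could be driven by checking the schedule on every controller tick. The following is a minimal sketch, assuming the schedule is a standard five-field cron expression such as `*/5 * * * *`; the names `sync_due` and `_field_matches` are hypothetical and not part of KubeMove. It simplifies real cron semantics: all five fields must match, and only `*`, `a`, `a-b`, comma lists and `/step` are understood.

```python
from datetime import datetime

# Allowed value range for each cron field: minute, hour, day of month,
# month, day of week (0 = Sunday).
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def _field_matches(expr: str, value: int, lo: int, hi: int) -> bool:
    """Return True if `value` satisfies one cron field expression."""
    for part in expr.split(","):
        base, _, step = part.partition("/")
        step_n = int(step) if step else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            first, last = base.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(base)
        if start <= value <= end and (value - start) % step_n == 0:
            return True
    return False


def sync_due(sync_period: str, now: datetime) -> bool:
    """Hypothetical helper: is a sync scheduled for the minute of `now`?"""
    fields = sync_period.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")
    # Python counts Monday as 0; cron counts Sunday as 0.
    values = [now.minute, now.hour, now.day, now.month, (now.weekday() + 1) % 7]
    return all(
        _field_matches(expr, value, lo, hi)
        for expr, value, (lo, hi) in zip(fields, values, FIELD_RANGES)
    )
```

A controller loop could call `sync_due(spec_sync_period, datetime.now())` once per minute and create a DataSync CR whenever it returns `True`. This sketch says nothing about when syncing ends, which the proposal leaves open.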


## Motivation

This KEP describes the MoveEngine CR in detail.

### Goals

This KEP describes the MoveEngine CR in detail.

### Non-Goals

## Proposal

The `spec` of the MoveEngine CR consists of the following fields:

This "Move" operator could use human-in-the-loop model for Approve/Deny. It can be similar to CertificateSigningRequest in k8s. https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster/#create-a-certificate-signing-request-object-to-send-to-the-kubernetes-api

In that case, MoveRequest could be the more appropriate name.

```
# remote cluster details
movePair: cluster-2

# namespace to be migrated; empty for default, all for all namespaces
namespace: ns1

# target namespace; empty for the source namespace
remoteNamespace: ns2

# label selectors
selectors:

From our Stash experience, we found that it was better to use 1-1 mapping for MoveEngine CR and k8s workload.

In Stash v1alpha1, we use label selector based approach. It caused a number of problems:

  • When do you resolve the labels? This causes a lot of complexity and confusion for the user.
  • What if different types of apps have the same labels?
  • When keeping track of .status, it is easier to have one workload to deal with instead of dynamic number of workloads.
  • Also, users want to move one app at a time. So, separate CRs are easier.

So, in our Stash v1beta1, we switched to using a ref instead of label selectors: https://github.com/stashed/stash/blob/master/apis/stash/v1beta1/types.go#L46-L88

Later you can introduce a MoveEngineBlueprint type non-namespaced CR so that if users have multiple workloads that need moving they can use the MoveEngineBlueprint as a template.

- app.example.com=test

# sync period (cron expression; quoted because a bare * is not valid YAML)
syncPeriod: "*/5 * * * *"

When does it end syncing?


# moveEngine mode, value can be either Backup or Restore
# Restore: if current engine is meant for restore purpose
# In case of restore, there should be only one moveEngine
# CR, with Restore mode, for given namespace and selectors.
# Backup : if current engine is meant for backup purpose
mode: Restore

# name of the plugin to be invoked for data transfer of volumes;
# this plugin will set up the necessary infrastructure,
# at source and/or remote, to transfer the app and/or data.
pluginProvider: PLUGIN_NAME

# includeResources defines whether volumes or cluster
# resources need to be synced
includeResources: true
status:
  # syncing stage
  stage: Final

  # status of last sync
  lastStatus: Completed

  # last sync time
  lastSync: dd-mm-yyyy hh:mm:ss
  volumes:
  - namespace: ns1
    persistentVolumeClaim: claim1
    lastStatus: Completed
    lastSynced: dd-mm-yyyy hh:mm:ss
    reason: None
    volume: xyz
    remoteVolume: xyz1
  resources:
    kind: secret
    name: app-key
    phase: Created
    status: Completed
    reason: reason for failure
    lastSyncedTime: last synced time
```
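
The defaulting rules in the spec comments (empty `namespace` means default, empty `remoteNamespace` means the source namespace, `mode` is either `Backup` or `Restore`) can be illustrated with a small in-memory model. This is only a sketch: the class `MoveEngineSpec` and its methods are hypothetical, and the validation rules beyond those stated in the comments (such as requiring `key=value` selectors) are assumptions.

```python
from dataclasses import dataclass, field
from typing import List

# The two modes named in the spec comments.
VALID_MODES = {"Backup", "Restore"}


@dataclass
class MoveEngineSpec:
    """Hypothetical mirror of the MoveEngine spec fields."""
    move_pair: str
    mode: str
    namespace: str = ""          # empty means "default"; "all" means every namespace
    remote_namespace: str = ""   # empty means "same as the source namespace"
    selectors: List[str] = field(default_factory=list)
    sync_period: str = ""
    plugin_provider: str = ""
    include_resources: bool = True

    def source_namespace(self) -> str:
        """Apply the 'empty for default' rule."""
        return self.namespace or "default"

    def target_namespace(self) -> str:
        """Apply the 'empty for source namespace' rule."""
        return self.remote_namespace or self.source_namespace()

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the spec looks sane."""
        errors = []
        if not self.move_pair:
            errors.append("movePair must name a remote cluster")
        if self.mode not in VALID_MODES:
            errors.append(f"mode must be Backup or Restore, got {self.mode!r}")
        for selector in self.selectors:
            key, sep, _ = selector.partition("=")
            if not (key and sep):
                # Assumption: selectors use the key=value form shown in the example.
                errors.append(f"selector {selector!r} is not of the form key=value")
        return errors
```

With the example values above (`movePair: cluster-2`, `namespace: ns1`, `remoteNamespace: ns2`, `mode: Restore`), `validate()` returns an empty list and `target_namespace()` returns `ns2`.
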
The controller does the following:
1. Check the network connectivity to the cluster referenced in the movePair CR

2. Discover the resources based on the label selector
TODO: Fill in the details regarding discovery of the resources

3. Deploy the resources on the destination cluster
- Create a clientset for the destination cluster
- Transform the filtered resources. TODO: Fill in more details
- From the source cluster, deploy the transformed resources in the destination cluster using the above clientset

Another approach is to
- Install an object store in the destination cluster
- Send the filtered resources to the object store
- Transform the resources in the object store at the destination
- From the destination cluster, deploy the transformed resources
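
The discovery and transformation steps are still marked TODO in this proposal, so the following is only a sketch of one way they might work, operating on plain dictionaries instead of a real Kubernetes client. The function names `discover` and `transform` are hypothetical. The choice to strip `uid`, `resourceVersion`, `creationTimestamp` and `status` is an assumption, based on the fact that these are populated by the source API server.

```python
import copy
from typing import Dict, List


def discover(resources: List[dict], namespace: str, selectors: List[str]) -> List[dict]:
    """Filter resources by namespace and by key=value label selectors."""
    wanted: Dict[str, str] = {}
    for selector in selectors:
        key, _, value = selector.partition("=")
        wanted[key] = value
    matched = []
    for resource in resources:
        meta = resource.get("metadata", {})
        # "all" selects every namespace, as described in the spec comments.
        if namespace != "all" and meta.get("namespace", "default") != namespace:
            continue
        labels = meta.get("labels", {})
        if all(labels.get(key) == value for key, value in wanted.items()):
            matched.append(resource)
    return matched


def transform(resource: dict, remote_namespace: str) -> dict:
    """Return a copy of the resource prepared for the destination cluster."""
    out = copy.deepcopy(resource)
    meta = out.setdefault("metadata", {})
    if remote_namespace:
        meta["namespace"] = remote_namespace
    # Assumption: server-assigned fields must not be carried across clusters.
    for server_field in ("uid", "resourceVersion", "creationTimestamp"):
        meta.pop(server_field, None)
    out.pop("status", None)
    return out
```

In the first approach the source-side controller would apply the output of `transform` through the destination clientset; in the object-store approach the same transformation would run at the destination instead.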

### Implementation Details/Notes/Constraints

TODO

### Risks and Mitigations


## Design acceptance criteria

This KEP is in the `provisional` state. To move the design to the `accepted` state, it needs at least two sponsors, from two different companies, as design approvers.

## Implementation History

- None