feat(cluster): initial commit for scale-out cluster #2041

rchincha · 2023-11-13T19:40:06Z

What type of PR is this?

Which issue does this PR fix:

What does this PR do / Why do we need it:

If an issue # is not available please add repro steps and logs showing the issue:

Testing done on this change:

Automation added to e2e:

Will this break upgrades or downgrades?

Does this PR introduce any user-facing change?:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

rchincha · 2023-11-13T21:02:37Z

Proposal:

1. A single node has an upper limit on CPU, memory, and storage, so a deployment eventually has to grow beyond one node.
2. Use static routing based on the repository "name".
3. Each node either handles a request locally or must route it to another node.
4. Should routing be HTTP redirect based (3xx) or proxy based?
5. Choose proxy in 4, since it is not clear whether clients handle 3xx; however, proxying adds an extra networking cost.
6. Proxying is still attractive in that we can either expose all nodes via DNS, or hide them behind a stateless ingress that sprays requests across these nodes.
7. The cluster size can grow or shrink, which means a lookup after a resize may not find the data locally.
8. However, since lookups are content-based, we can add the other nodes as an implicit "sync" rule for a particular "name". If the data is not found locally, "sync" will get it from the other members, and along with "retention", the cluster rebalances over time.
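
The name-based routing decision described above could be sketched roughly as follows. This is only an illustration, not the code in this PR: the function names `ownerOf` and `route`, the member addresses, and the choice of FNV-1a hashing modulo the member count are all assumptions. The proposal only says that routing is static and keyed on the repository name.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ownerOf maps a repository name to one member of a fixed member list.
// FNV-1a modulo the member count is an assumed scheme, chosen only because
// it is deterministic and needs no shared state between nodes.
// It panics if members is empty.
func ownerOf(repo string, members []string) string {
	h := fnv.New32a()
	h.Write([]byte(repo)) // Write on a hash.Hash never returns an error
	return members[int(h.Sum32()%uint32(len(members)))]
}

// route reports whether the node "self" should serve the request locally
// and, if not, which member the request should be proxied to.
func route(repo, self string, members []string) (local bool, target string) {
	target = ownerOf(repo, members)
	return target == self, target
}

func main() {
	// Hypothetical member list; every node must be configured with the
	// same list in the same order for the routing to agree.
	members := []string{"zot-0:5000", "zot-1:5000", "zot-2:5000"}
	for _, repo := range []string{"alpine", "library/busybox"} {
		local, target := route(repo, members[0], members)
		fmt.Printf("repo=%s owner=%s local=%v\n", repo, target, local)
	}
}
```

With a plain modulo scheme like this, changing the member count remaps many repository names to a different owner, which is where the implicit "sync" rule and "retention" would come in to move data over time.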

Challenges:

List "all" repositories
Cross-repository queries
So GraphQL has to account for this split state/view.
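
One possible way to approach the "list all repositories" challenge is to ask every member for its local list and merge the results. The sketch below shows only the merge step and assumes the per-member lists have already been fetched somehow; `mergeCatalogs` and the member addresses are hypothetical names, not part of zot.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeCatalogs combines per-member repository lists into one sorted,
// de-duplicated list. Duplicates are expected while the cluster is
// rebalancing, because the same repository may exist on more than one node.
func mergeCatalogs(perMember map[string][]string) []string {
	seen := make(map[string]struct{})
	for _, repos := range perMember {
		for _, r := range repos {
			seen[r] = struct{}{}
		}
	}
	all := make([]string, 0, len(seen))
	for r := range seen {
		all = append(all, r)
	}
	sort.Strings(all)
	return all
}

func main() {
	// Hypothetical local listings, keyed by member address.
	perMember := map[string][]string{
		"zot-0:5000": {"alpine", "busybox"},
		"zot-1:5000": {"busybox", "nginx"},
	}
	fmt.Println(mergeCatalogs(perMember)) // prints [alpine busybox nginx]
}
```

Cross-repository queries would need a similar fan-out, but merging their results depends on the query, so that part is left out here.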

elee1766 · 2023-11-17T00:04:35Z

previously i have worked with seaweedfs, which allows for size-adjustable clusters with discovery similar to how you describe. they use raft for this: https://github.com/seaweedfs/seaweedfs/blob/master/weed/server/raft_server.go

possibly a similar approach can be taken. they also are a sharded hash-addressable blob store.

that said, what is the rationale to implement your own cluster system? i feel like this is something better solved separately, since as a user i would probably feel more comfortable using zot on top of redis/ceph instead of making zot do the replication/sharding.

Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>

adodon2go assigned rchincha Nov 14, 2023

rchincha mentioned this pull request Nov 15, 2023

feat: redis dedup backend #2005

Draft

rchincha mentioned this pull request Mar 26, 2024

clustering: zot scale-out cluster #125

Open

feat(cluster): initial commit for scale-out cluster

d9adfca

Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>

rchincha force-pushed the scale-out branch from d1d910f to d9adfca Compare April 3, 2024 18:47

vrajashkr mentioned this pull request Apr 12, 2024

feat(cluster): Add support for request proxying for scale out #2385

Merged

3 tasks
