Unable to add a duplicate FQDN in provision when Overseer and FL servers specify the same FQDN. #777

Open
YuanTingHsieh opened this issue Aug 17, 2022 Discussed in #710 · 6 comments
@YuanTingHsieh
Collaborator

Discussed in #710

Originally posted by asus-ocis July 6, 2022
Hi all,

I want to deploy NVFlare 2.1 on k8s. The Overseer, server1 and server2 will run on different pods.
We have an HAProxy on k8s, so all three need to use the same FQDN.

My project requirements:

Overseer:
  fqdn: myk8s.com
  port: 7001
server1:
  fqdn: myk8s.com
  learn_port: 70032
  admin_port: 70033
server2:
  fqdn: myk8s.com
  learn_port: 70055
  admin_port: 70052

But I got the following error after running provision:
ValueError: Unable to add a duplicate name myk8s.com into this project.

Check spec.py

raise ValueError(f"Unable to add a duplicate name {p.name} into this project.")

The name field in the participants needs to be unique.
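
For anyone skimming, here is a minimal sketch of why the check fires. The `Participant` and `Project` classes below are hypothetical stand-ins, not NVFlare's actual code; only the error message is taken from the line quoted above. Because the FQDN is used as the participant name, two participants behind the same FQDN collide.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    # Hypothetical stand-in, not NVFlare's real class.
    type: str
    name: str  # in project.yml the name doubles as the FQDN


@dataclass
class Project:
    # Hypothetical stand-in that only models the uniqueness check.
    participants: list = field(default_factory=list)

    def add(self, p: Participant) -> None:
        # Reject any participant whose name is already registered.
        if any(existing.name == p.name for existing in self.participants):
            raise ValueError(f"Unable to add a duplicate name {p.name} into this project.")
        self.participants.append(p)


def try_provision(entries):
    """Add (type, name) pairs in order; return 'ok' or the error text."""
    project = Project()
    try:
        for type_, name in entries:
            project.add(Participant(type_, name))
    except ValueError as err:
        return str(err)
    return "ok"


print(try_provision([("overseer", "myk8s.com"), ("server", "myk8s.com")]))
```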

Is it possible to add FQDN field for Overseer and FL Server in project.yml in future versions? Or have any other suggestion?

@lyscho

lyscho commented Sep 12, 2022

I am running into the same issue, as I want to set up the Overseer and Server on the same FQDN. Is there any reason why this is not enabled (or is it considered bad practice)?

@parkeraddison

Also ran into this. For now one workaround is to set the participant name/fqdn to something fake like "server.hostname" and "overseer.hostname" then add the real IP mapping to the /etc/hosts file on each participant.
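
A rough sketch of the /etc/hosts part of that workaround, in case it helps. The helper name `hosts_entries` and the address `203.0.113.10` are assumptions for illustration (the address is from a documentation-only range); substitute the real IP. The script only prints the lines, since editing /etc/hosts needs root and is best done by hand.

```python
def hosts_entries(real_ip, placeholder_names):
    """Return /etc/hosts lines mapping each placeholder name to real_ip."""
    return [f"{real_ip}\t{name}" for name in placeholder_names]


# 203.0.113.10 is a placeholder standing in for the real IP address.
lines = hosts_entries("203.0.113.10", ["overseer.hostname", "server.hostname"])
print("\n".join(lines))
```

The printed lines would then be appended to /etc/hosts on each participant, so that the placeholder names used in project.yml resolve to the real address.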

@YuanTingHsieh
Collaborator Author

I am running into the same issue, as I want to set up the Overseer and Server on the same FQDN. Is there any reason why this is not enabled (or is it considered bad practice)?

@IsaacYangSLA and @chesterxgchen can you help answer this question? thanks!

@chesterxgchen
Collaborator

@parkeraddison This is considered bad practice. The role of the Overseer is to switch over to the healthy FL server. Suppose there are two FL servers and one dies: if the Overseer is on the same host as that FL server, the Overseer could crash at the same time if the host hardware is no longer working.

@parkeraddison

Suppose there are two FL servers and one dies: if the Overseer is on the same host as that FL server, the Overseer could crash at the same time if the host hardware is no longer working.

I agree it's bad practice to put the Server and Overseer on the same physical host -- but the same FQDN doesn't necessarily imply the same host. E.g. if a load balancer is used in a cloud or k8s deployment, then the same FQDN can forward to different hosts depending on the port.

FWIW, if we provision using the Dashboard then the same FQDN can be used (though it may help to rename the files when you download them to avoid confusion).

This can't be done in a project.yaml because the service's name is its FQDN... it could be worth separating the two fields, as was proposed in the original post.
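
To make the proposal concrete, here is a minimal sketch of what separate fields could look like. This is not how NVFlare works today: the `Service` record and `validate` function are hypothetical, and the server ports are placeholders. The idea is that names stay unique while the FQDN may repeat, as long as each FQDN and port pair is distinct, which matches the load balancer case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    # Hypothetical record: `name` identifies the participant,
    # `fqdn` and `port` say where it is reachable.
    name: str
    fqdn: str
    port: int


def validate(services):
    """Require unique names and unique (fqdn, port) endpoints."""
    names = set()
    endpoints = set()
    for s in services:
        if s.name in names:
            raise ValueError(f"duplicate name {s.name}")
        if (s.fqdn, s.port) in endpoints:
            raise ValueError(f"duplicate endpoint {s.fqdn}:{s.port}")
        names.add(s.name)
        endpoints.add((s.fqdn, s.port))
    return True


services = [
    Service("overseer", "myk8s.com", 7001),
    Service("server1", "myk8s.com", 8002),  # placeholder port
    Service("server2", "myk8s.com", 8102),  # placeholder port
]
print(validate(services))
```

With a check like this, the shared FQDN passes validation, while two services claiming the same FQDN and port would still be rejected.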

@chesterxgchen
Collaborator

chesterxgchen commented Jan 30, 2023 via email
