
“evaluate_fn” test results come back as NaN #3161

Open
CHENxx23 opened this issue Mar 20, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@CHENxx23

Describe the bug

When running server-side (centralized) evaluation, the test results come back as NaN: not in the first round, but from the second round onward. Why is this?

Steps/Code to Reproduce

from collections import OrderedDict

import torch

def get_evaluate_fn(exp, model):
    def evaluate_fn(server_round, parameters, config):
        # Set the aggregated parameters on the model
        params_dict = zip(model.state_dict().keys(), parameters)
        state_dict = OrderedDict({k: torch.Tensor(v) for k, v in params_dict})
        model.load_state_dict(state_dict, strict=True)
        # Run the centralized test
        loss, metrics = exp.test(model, test=1)
        return loss, metrics
    return evaluate_fn

strategy = fl.server.strategy.FedAvg(
    fraction_fit=args.fraction,
    fraction_evaluate=args.fraction,
    min_fit_clients=int(args.num_clients * args.fraction),
    min_evaluate_clients=int(args.num_clients * args.fraction),
    min_available_clients=int(args.num_clients * args.fraction),
    initial_parameters=fl.common.ndarrays_to_parameters(get_parameters(model)),
    evaluate_fn=get_evaluate_fn(exp, model),
    )

def test(self, model, test=0):
    test_data, test_loader = self._get_data(flag='test')
    train_data, train_loader = self._get_data(flag='train')
    test_steps = len(train_loader)
    if test:
        print('loading model')
        self.model = model
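
Not part of the original report: a minimal debugging sketch, assuming the standard Flower evaluate_fn signature in which parameters is a list of NumPy arrays. It checks whether the aggregated parameters already contain NaN before they are loaded into the model; if they do, the problem is upstream of exp.test() (in client training or aggregation). The helper name is hypothetical.

import numpy as np

def check_parameters_for_nan(server_round, parameters):
    # Report which parameter arrays (if any) already contain NaN.
    nan_layers = [i for i, p in enumerate(parameters) if np.isnan(p).any()]
    if nan_layers:
        print(f"round {server_round}: NaN in parameter arrays {nan_layers}")
    return nan_layers

# Called at the top of evaluate_fn, before load_state_dict:
#     check_parameters_for_nan(server_round, parameters)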

Expected Results

The test results should not be NaN.

Actual Results

DEBUG flwr 2024-03-19 23:15:41,788 | server.py:236 | fit_round 1 received 3 results and 0 failures
WARNING flwr 2024-03-19 23:15:44,483 | fedavg.py:242 | No fit_metrics_aggregation_fn provided
test 2785
train 7825
loading model
mse:nan, mae:nan, rse:nan
43it [00:23, 1.81it/s]
INFO flwr 2024-03-19 23:16:41,187 | server.py:125 | fit progress: (1, nan, {'MAE': nan, 'MSE': nan, 'RMSE': nan}, 4701.443124711048)
DEBUG flwr 2024-03-19 23:16:41,188 | server.py:173 | evaluate_round 1: strategy sampled 3 clients (out of 30)
DEBUG flwr 2024-03-19 23:19:05,097 | server.py:187 | evaluate_round 1 received 3 results and 0 failures
WARNING flwr 2024-03-19 23:19:05,098 | fedavg.py:273 | No evaluate_metrics_aggregation_fn provided
INFO flwr 2024-03-19 23:19:05,098 | server.py:153 | FL finished in 4845.353513141978
INFO flwr 2024-03-19 23:19:05,284 | app.py:226 | app_fit: losses_distributed [(1, nan)]
INFO flwr 2024-03-19 23:19:05,285 | app.py:227 | app_fit: metrics_distributed_fit {}
INFO flwr 2024-03-19 23:19:05,285 | app.py:228 | app_fit: metrics_distributed {}
INFO flwr 2024-03-19 23:19:05,285 | app.py:229 | app_fit: losses_centralized [(0, 0.7424132643744003), (1, nan)]
INFO flwr 2024-03-19 23:19:05,285 | app.py:230 | app_fit: metrics_centralized {'MAE': [(0, 0.59392285), (1, nan)], 'MSE': [(0, 0.742413), (1, nan)], 'RMSE': [(0, 0.8616339), (1, nan)]}
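
Not part of the original report: the warnings above ("No fit_metrics_aggregation_fn provided") only indicate that client-reported metrics are being dropped, which is why metrics_distributed_fit and metrics_distributed are empty; they are not themselves the cause of the NaN. A minimal sketch of a weighted-average aggregation function that could be passed to FedAvg to surface per-client metrics each round (the "loss" key is an assumption about what the clients return):

from typing import List, Tuple
from flwr.common import Metrics

def weighted_average(metrics: List[Tuple[int, Metrics]]) -> Metrics:
    # Weight each client's reported loss by its number of examples.
    total_examples = sum(num_examples for num_examples, _ in metrics)
    weighted_loss = sum(num_examples * m["loss"] for num_examples, m in metrics)
    return {"loss": weighted_loss / total_examples}

# strategy = fl.server.strategy.FedAvg(
#     ...,
#     fit_metrics_aggregation_fn=weighted_average,
#     evaluate_metrics_aggregation_fn=weighted_average,
# )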

@CHENxx23 CHENxx23 added the bug Something isn't working label Mar 20, 2024
@jafermarq
Contributor

Hi @CHENxx23, I think this might be related to the issue I refer to in my answer to your other question here: #3164
