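import torch
import torch.nn as nn
from torchvision import models
from torchsummary import summary  # assumed source of summary(); torchsummary is named later in the thread

class Net(nn.Module):
    def __init__(self, num_classes: int) -> None:
        super(Net, self).__init__()
        self.model = models.resnet18()
        # Freeze the backbone; the layers replaced below are trainable again.
        for param in self.model.parameters():
            param.requires_grad = False
        # Accept single-channel (grayscale) input instead of RGB.
        self.model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
        num_ftrs = self.model.fc.in_features
        self.model.fc = nn.Linear(num_ftrs, num_classes)
        summary(self.model, input_size=(1, 28, 28))  # <<== THIS LINE

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.model(x)
        return x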
If I add summary(self.model, input_size=(1, 28, 28)) at the end of the __init__() method, everything works. But when I remove it, I get an error at the line input_param = input_param[0]:
IndexError: index 0 is out of bounds for dimension 0 with size 0
The error is raised in evaluate_fn of server.py:
params_dict = zip(model.state_dict().keys(), parameters)
state_dict = OrderedDict({k: torch.Tensor(v) for k, v in params_dict})
model.load_state_dict(state_dict, strict=True) # <= At this line I'm getting error
When I remove the line summary(self.model, input_size=(1, 28, 28)), I get the following error:
[2024-04-08 09:43:34,760][flwr][INFO] - Initializing global parameters
[2024-04-08 09:43:34,761][flwr][INFO] - Requesting initial parameters from one random client
[2024-04-08 09:43:37,337][flwr][INFO] - Received initial parameters from one random client
[2024-04-08 09:43:37,338][flwr][INFO] - Evaluating initial parameters
[2024-04-08 09:43:37,644][flwr][ERROR] - index 0 is out of bounds for dimension 0 with size 0
[2024-04-08 09:43:37,646][flwr][ERROR] - Traceback (most recent call last):
File "/root/miniconda3/envs/flower_env/lib/python3.9/site-packages/flwr/simulation/app.py", line 308, in start_simulation
hist = run_fl(
File "/root/miniconda3/envs/flower_env/lib/python3.9/site-packages/flwr/server/app.py", line 225, in run_fl
hist = server.fit(num_rounds=config.num_rounds, timeout=config.round_timeout)
File "/root/miniconda3/envs/flower_env/lib/python3.9/site-packages/flwr/server/server.py", line 92, in fit
res = self.strategy.evaluate(0, parameters=self.parameters)
File "/root/miniconda3/envs/flower_env/lib/python3.9/site-packages/flwr/server/strategy/fedavg.py", line 165, in evaluate
eval_res = self.evaluate_fn(server_round, parameters_ndarrays, {})
File "/root/development/machine-learning-project/server.py", line 42, in evaluate_fn
model.load_state_dict(state_dict, strict=True)
File "/root/miniconda3/envs/flower_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1657, in load_state_dict
load(self, state_dict)
File "/root/miniconda3/envs/flower_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1645, in load
load(child, child_state_dict, child_prefix)
File "/root/miniconda3/envs/flower_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1645, in load
load(child, child_state_dict, child_prefix)
File "/root/miniconda3/envs/flower_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1639, in load
module._load_from_state_dict(
File "/root/miniconda3/envs/flower_env/lib/python3.9/site-packages/torch/nn/modules/batchnorm.py", line 110, in _load_from_state_dict
super(_NormBase, self)._load_from_state_dict(
File "/root/miniconda3/envs/flower_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1562, in _load_from_state_dict
input_param = input_param[0]
IndexError: index 0 is out of bounds for dimension 0 with size 0
[2024-04-08 09:43:37,648][flwr][ERROR] - Your simulation crashed :(. This could be because of several reasons. The most common are:
> Sometimes, issues in the simulation code itself can cause crashes. It's always a good idea to double-check your code for any potential bugs or inconsistencies that might be contributing to the problem. For example:
- You might be using a class attribute in your clients that hasn't been defined.
- There could be an incorrect method call to a 3rd party library (e.g., PyTorch).
- The return types of methods in your clients/strategies might be incorrect.
> Your system couldn't fit a single VirtualClient: try lowering `client_resources`.
> All the actors in your pool crashed. This could be because:
- You clients hit an out-of-memory (OOM) error and actors couldn't recover from it. Try launching your simulation with more generous `client_resources` setting (i.e. it seems {'num_cpus': 1, 'num_gpus': 0.0} is not enough for your run). Use fewer concurrent actors.
- You were running a multi-node simulation and all worker nodes disconnected. The head node might still be alive but cannot accommodate any actor with resources: {'num_cpus': 1, 'num_gpus': 0.0}.
Take a look at the Flower simulation examples for guidance <https://flower.dev/docs/framework/how-to-run-simulations.html>.
Hi @EzyHow, have you added that summary(self.model, input_size=(1, 28, 28)) somewhere else? Maybe also in the evaluation in server.py? I wonder if torchsummary is adding something to the state_dict...
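One way to test that hypothesis without torchsummary is to compare the state_dict before and after a single forward pass. The sketch below is only an illustration: it uses a tiny stand-in model rather than resnet18, and it assumes that summary() does little more than push one random batch through the model while it is in training mode.

```python
import torch
from torch import nn

# Small stand-in for resnet18: any model with a BatchNorm layer will do.
model = nn.Sequential(nn.Conv2d(1, 4, kernel_size=3), nn.BatchNorm2d(4))
model.train()  # freshly built modules are in training mode anyway

keys_before = list(model.state_dict().keys())
tracked_before = int(model.state_dict()["1.num_batches_tracked"])

# Assumption: this mimics what summary() does internally (one forward pass
# on a random batch of size 2).
with torch.no_grad():
    model(torch.rand(2, 1, 28, 28))

keys_after = list(model.state_dict().keys())
tracked_after = int(model.state_dict()["1.num_batches_tracked"])

print("same keys:", keys_before == keys_after)
print("num_batches_tracked:", tracked_before, "->", tracked_after)
```

If the keys are identical and only the BatchNorm counter changes, the forward pass is not adding entries to the state_dict; it is changing the value of a 0-dimensional buffer, which may be what the conversion in evaluate_fn is sensitive to.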
I encountered the same issue and found a solution. I noticed the ndarrays_to_model function in src/model_utils.py. The relevant code is:
def ndarrays_to_model(model: torch.nn.ModuleList, params: List[np.ndarray]):
    """Set model weights from a list of NumPy ndarrays."""
    params_dict = zip(model.state_dict().keys(), params)
    state_dict = OrderedDict({k: torch.from_numpy(np.copy(v)) for k, v in params_dict})
    model.load_state_dict(state_dict, strict=True)
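A plausible reading of the traceback is that the failure comes from a BatchNorm buffer: num_batches_tracked is a 0-dimensional tensor, and the conversion back from NumPy has to preserve that shape for load_state_dict to accept it. The sketch below round-trips a small BatchNorm model through NumPy arrays using torch.from_numpy, as in the function above. The helper model_to_ndarrays and the stand-in model are hypothetical and only there to make the example self-contained.

```python
from collections import OrderedDict
from typing import List

import numpy as np
import torch
from torch import nn


def model_to_ndarrays(model: nn.Module) -> List[np.ndarray]:
    """Export every state_dict entry as a NumPy array (hypothetical helper)."""
    return [v.detach().cpu().numpy() for v in model.state_dict().values()]


def ndarrays_to_model(model: nn.Module, params: List[np.ndarray]) -> None:
    """Load NumPy arrays into the model, keeping each array's shape and dtype."""
    keys = model.state_dict().keys()
    state_dict = OrderedDict(
        (k, torch.from_numpy(np.copy(v))) for k, v in zip(keys, params)
    )
    model.load_state_dict(state_dict, strict=True)


# Stand-in models; resnet18 contains many BatchNorm2d layers of the same kind.
source = nn.Sequential(nn.Conv2d(1, 4, kernel_size=3), nn.BatchNorm2d(4))
target = nn.Sequential(nn.Conv2d(1, 4, kernel_size=3), nn.BatchNorm2d(4))

params = model_to_ndarrays(source)

# The BatchNorm counter is the only 0-dimensional entry in this model.
scalar_shapes = [p.shape for p in params if p.ndim == 0]
print("0-d entries:", scalar_shapes)

ndarrays_to_model(target, params)
print("loaded shape:", target.state_dict()["1.num_batches_tracked"].shape)
```

torch.from_numpy keeps the 0-dimensional shape and the integer dtype of the counter, so the strict load goes through even on a freshly initialised model that has never seen a forward pass.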
Describe the bug
I was trying an example project from Flower Simulation (Flower Simulation Step by Step Pytorch - Part II). Everything went very well until I tried to change the model to resnet18, as shown in the Net class above.
If I add summary(self.model, input_size=(1, 28, 28)) at the end of the __init__() method, everything works. But when I remove it, I get an error at the line input_param = input_param[0]:
IndexError: index 0 is out of bounds for dimension 0 with size 0
The error is raised in evaluate_fn of server.py.
Steps/Code to Reproduce
Clone the repository from Flower Simulation Step by Step Pytorch Part-II and follow the instructions to set up the environment.
Then change the model to resnet18 in the model.py file, as shown in the Net class above.
The following is the list of packages installed in the conda environment:
requirement.txt file
Expected Results
The following is the output when it runs successfully (with the line summary(self.model, input_size=(1, 28, 28)) added):
{'history': History (loss, distributed):
    round 1: 6.738090056180954
    round 2: 3.8934330970048903
History (loss, centralized):
    round 0: 366.1482033729553
    round 1: 97.4027541577816
    round 2: 52.76616382226348
History (metrics, centralized):
{'accuracy': [(0, 0.1086), (1, 0.8021), (2, 0.8959)]}
Actual Results
When I remove the line summary(self.model, input_size=(1, 28, 28)), I get the error shown in the traceback above.