
How to extract/save weights after training? #13

Open
minertom opened this issue Dec 6, 2020 · 2 comments

Comments

minertom commented Dec 6, 2020

OK, here I am displaying my utter ignorance again. I did find a post on Towards Data Science entitled "Everything You Need to Know About Saving Weights in PyTorch".

https://towardsdatascience.com/everything-you-need-to-know-about-saving-weights-in-pytorch-572651f3f8de

Now I am stuck. Having saved the weights in the example project, I am aware that the file is not in a human-readable format.

So my question now becomes: is there a way to take this file of weights, which is in .pth format, and convert it to NumPy, which would be human-readable? I would like to be able to do further manipulation of the weights in NumPy.

Thank you for your patience
Tom

dvgodoy (Owner) commented Dec 6, 2020

Hi Tom,

Saved models are meant for resuming training or for deployment, so they are stored in binary format; they are not intended to be read by humans.
The save method transforms the state dictionary into its binary representation before writing it to disk. If you want to do anything else, whether converting it to NumPy arrays or to human-readable text, you can go over the dictionary itself.
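To make the binary save/load workflow concrete, here is a minimal sketch (the file name model.pth is just illustrative):

```python
import torch
import torch.nn as nn

# A toy model, same shape as the one discussed below
model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))

# Binary save/load: the usual way to resume training or deploy
torch.save(model.state_dict(), "model.pth")
state_dict = torch.load("model.pth")
model.load_state_dict(state_dict)

# The state dict itself is just an ordered mapping from parameter
# names to tensors, so you can iterate over it directly
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```

The printed names ('0.weight', '0.bias', ...) come from each layer's position in the Sequential container, which is why they match the dictionary shown below.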

For example, let's say you have a simple sequential model:
model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))

If you check its state dictionary, it goes as expected:

OrderedDict([('0.weight', tensor([[ 0.0485,  0.3305],
                                  [ 0.6338,  0.4103],
                                  ...
                                  [ 0.3358, -0.3827],
                                  [-0.4230,  0.2328]])),
             ('0.bias', tensor([ 0.2907,  0.3352,  0.1105, -0.6123,  0.2566,
                                -0.4548,  0.4116,  0.4219, -0.4997,  0.0397])),
             ('2.weight', tensor([[-0.2709,  0.0192,  0.0961, -0.0101, -0.3044,
                                    0.2777,  0.0432,  0.0935, -0.2234, -0.0936]])),
             ('2.bias', tensor([-0.2365]))])

You can also get the state dictionary of any given layer if you wish: model[2].state_dict() will return only the weights and bias of the last layer.

They are all tensors, but you can make them all NumPy arrays:
state_dict = model.state_dict()
dict_numpy = {k: v.cpu().numpy() for k, v in state_dict.items()}

Or if you want to have them in plain text, you can use JSON:
import json
text_state = json.dumps({k: v.tolist() for k, v in state_dict.items()})
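Going the other way is also possible. A minimal, self-contained sketch of round-tripping the JSON text back into tensors (it redefines the toy model so it runs on its own):

```python
import json

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))
state_dict = model.state_dict()

# Dump every tensor as nested lists of plain Python floats
text_state = json.dumps({k: v.tolist() for k, v in state_dict.items()})

# Round-trip: rebuild tensors from the JSON text and load them back
restored = {k: torch.tensor(v) for k, v in json.loads(text_state).items()}
model.load_state_dict(restored)
```

Since float32 values survive the trip through Python floats and JSON exactly, the restored tensors match the originals.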

Does it help?

In Chapter 5 (which I will publish in a few days), I will introduce a method to visualize the filters (weights) of convolutional layers, and I will also introduce hooks, which you can use to capture the outputs produced by each layer. I think you'll like the next Chapter :-) I'll be looking forward to your feedback on it.
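Chapter 5's approach isn't reproduced here, but as a preview, capturing layer outputs with forward hooks can be sketched like this (the names activations and make_hook are just illustrative):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))

# Capture each layer's output in a dict, keyed by layer index
activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

handles = [layer.register_forward_hook(make_hook(i))
           for i, layer in enumerate(model)]

_ = model(torch.randn(4, 2))  # one forward pass fills the dict

for h in handles:  # always remove hooks when done
    h.remove()
```

After the forward pass, activations[0] holds the first Linear layer's output, activations[1] the ReLU's, and activations[2] the final layer's.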

Best,
Daniel

minertom (Author) commented Dec 7, 2020 via email
