fix the bug for eval function while variable_update=parameter_server|distributed_replicated #47

Open

wants to merge 1 commit into master
Conversation

pan463194277

Firstly, the _eval function currently doesn't support the modes 'variable_update=parameter_server' and 'variable_update=distributed_replicated', and there are mistakes when using the 'replicated' mode to restore parameters from a checkpoint file created by training with 'variable_update=parameter_server|distributed_replicated', so I changed the 'target' to fix it.
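
For context, a minimal sketch of what the 'target' means here under TF 1.x distributed execution (the cluster spec, ports, and job names below are hypothetical and not taken from this PR): the eval session has to connect to the worker's gRPC target rather than the default in-process target, otherwise it cannot reach variables placed on the parameter servers.

```python
import tensorflow as tf

# Hypothetical cluster layout for illustration only.
cluster = tf.train.ClusterSpec({
    "worker": ["localhost:2222"],
    "ps": ["localhost:2223"],
})
server = tf.train.Server(cluster, job_name="worker", task_index=0)

# target="" would create an in-process session that cannot see the
# parameter servers; server.target ("grpc://...") can.
sess = tf.Session(target=server.target)
```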

Secondly, when variable_update='distributed_replicated', the result of the eval function does not look correct. I found that tf.global_variables() contained no parameters when restoring the checkpoint, and even during training tf.global_variables() only contained 190+ parameters (copies of the local variables, trainable variables only), without 'batchnorm/gamma', 'batchnorm/moving_mean' and 'batchnorm/moving_variance', so I changed the code to store/restore parameters from/to tf.local_variables() and it worked.
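
As a rough illustration of that second point (a sketch only, assuming TF 1.x and a hypothetical checkpoint path, not the exact code in this PR): building the saver over tf.local_variables() picks up the per-tower copies, including the batch-norm statistics, that tf.global_variables() misses in this mode.

```python
import tensorflow as tf

# In distributed_replicated mode the per-tower copies (including
# batchnorm/gamma, batchnorm/moving_mean and batchnorm/moving_variance)
# live in the LOCAL_VARIABLES collection, so a saver built over
# tf.global_variables() silently skips them.
saver = tf.train.Saver(tf.local_variables())  # instead of tf.global_variables()

with tf.Session() as sess:
    saver.restore(sess, "/tmp/checkpoints/model.ckpt-1000")  # hypothetical path
```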

tfboyd requested a review from reedwm on August 18, 2017
tfboyd (Member) commented Aug 18, 2017

@reedwm Can you take a look? I think you are dealing with this internally. I will merge internal to external to get a better version out here and would like to clean up these PRs first. Thank you.

freedomtan pushed a commit to freedomtan/benchmarks that referenced this pull request Apr 18, 2018
Merge internal changes into public repository (change 181251654)