
Under synchronous training, how to solve the problem that large batches lead to worse generalization #644

Open
teki1981 opened this issue Jan 18, 2023 · 1 comment


@teki1981

This template is for miscellaneous issues not covered by the other issue categories.

For questions on how to work with TensorFlow, or support for problems that are not verified bugs in TensorFlow, please go to StackOverflow.

If you are reporting a vulnerability, please use the dedicated reporting process.

For high-level discussions about TensorFlow, please post to discuss@tensorflow.org. For questions about the development or internal workings of TensorFlow, or if you would like to know how to contribute, please post to developers@tensorflow.org.

@StevenShi-23
Contributor

StevenShi-23 commented Apr 17, 2023

Empirically speaking, large-batch training does usually lead to worse generalization because it tends to converge to sharp local minima (ref: Keskar et al., "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima", https://openreview.net/forum?id=H1oyRlYgg). You may wish to use layer-wise adaptive large-batch optimizers such as LAMB or LARS to alleviate this problem.
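
For concreteness, here is a minimal sketch of swapping LAMB into a Keras training loop. It assumes TensorFlow 2.x with the tensorflow-addons package installed, which ships a LAMB implementation as `tfa.optimizers.LAMB`; the model, hyperparameters, and batch size below are illustrative placeholders, not settings endorsed by this thread.

```python
# Minimal sketch: large-batch training with LAMB via TensorFlow Addons.
# Assumes: pip install tensorflow tensorflow-addons
import tensorflow as tf
import tensorflow_addons as tfa

# Illustrative model; any Keras model works the same way.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10),
])

# LAMB rescales each layer's update by a trust ratio based on the layer's
# weight norm and update norm, which is what allows stable training when
# the global batch size (and hence the learning rate) is scaled up.
optimizer = tfa.optimizers.LAMB(
    learning_rate=1e-3,              # illustrative; tune for your batch size
    weight_decay_rate=0.01,
    exclude_from_weight_decay=["bias"],
)

model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# With a large global batch, e.g.:
# model.fit(x_train, y_train, batch_size=8192, epochs=10)
```

In practice, LARS/LAMB is usually paired with a learning-rate warmup schedule at large batch sizes, since the trust-ratio scaling alone does not eliminate instability in the first few epochs.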
