OpenCL and TensorRT Bugfixes

@lightvector lightvector released this 22 Jan 14:26
· 448 commits to master since this release

This release is not the latest release; see the newer release v1.12.4!

New Neural Net Architecture Support (release series v1.12.x)

As with prior releases in the v1.12.x series, this release of KataGo supports a recently added neural net architecture! See the release notes for v1.12.0 for details. The new neural net, "b18c384nbt", is also attached to this release for convenience. For general analysis use it should be similar in quality to recent 60-block models, but run significantly faster due to being a smaller net.

What's Changed in v1.12.3

This specific release, v1.12.3, fixes a few additional bugs in KataGo:

  • Fixes a performance regression for some GPUs on TensorRT that was introduced in v1.12.x (thanks @hyln9 !) (#741)
  • Mitigates a long-standing performance bug on OpenCL: on GPUs that use dynamic boost or dynamic clock speeds, the GPU tuner could not obtain accurate timings because the clock speed varied during tuning. Most notably, on a few users' machines this caused the tuner to fail to select FP16 tensor cores even when the GPU supported them and they would have given much better performance. The fix adds some additional computation on the GPU during tuning so that it is less likely to reduce its clock speed. Most users will not see an improvement, but a few may see a large one. (#743)
  • Fixes an issue where, depending on settings, in GTP or analysis KataGo might fail to treat two consecutive passes as ending the game within its search tree.
  • Fixes an issue in the PyTorch training code that prevented models from being easily trained on variable tensor sizes (i.e., max board sizes) in the data.
  • The contribute command on OpenCL will now also pretune for the new b18c384nbt architecture, the same way it pretunes for all other models.
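
The OpenCL tuner fix above amounts to keeping the device busy so that measured kernel times reflect a steady clock speed. As a rough illustration of the idea (not KataGo's actual C++/OpenCL tuner; the function names and parameters here are invented), the pattern is: run untimed warm-up work first, then take the best of several timed runs:

```python
import time

def busy_work(n=200_000):
    # Stand-in for the extra computation given to the device;
    # here it is just CPU arithmetic so the sketch is runnable anywhere.
    acc = 0
    for i in range(n):
        acc += i * i
    return acc

def time_kernel(kernel, warmup_iters=3, timed_iters=5):
    """Run untimed warm-up iterations so the device reaches a steady
    clock speed, then time the kernel several times and return the
    minimum, which is least affected by clock-speed noise."""
    for _ in range(warmup_iters):
        kernel()
    best = float("inf")
    for _ in range(timed_iters):
        t0 = time.perf_counter()
        kernel()
        best = min(best, time.perf_counter() - t0)
    return best
```

Without the warm-up, a GPU that has downclocked while idle can make a fast configuration (such as FP16 tensor cores) look slower than it really is, which is exactly the mis-selection the fix mitigates.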
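
For context on the two-consecutive-passes bugfix: under standard Go rules, a game ends when both players pass in succession, and the search tree must apply the same rule to positions it explores. A minimal sketch of that check (illustrative only, not KataGo's internal representation):

```python
PASS = "pass"

def is_game_over(move_history):
    """A game ends when the two most recent moves are both passes."""
    return (
        len(move_history) >= 2
        and move_history[-1] == PASS
        and move_history[-2] == PASS
    )
```

The bug was that, under some settings, positions reached inside the search tree were not treated as terminal by this rule even though the game had ended.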
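
Regarding training on variable tensor sizes: the usual way to batch positions from boards of different sizes is to zero-pad each input plane up to the maximum board size and carry a mask marking which points are on the board. A pure-Python sketch of that padding step (KataGo's actual training code uses PyTorch tensors; this is only an assumption-laden illustration of the general technique):

```python
def pad_to_max(board, max_size):
    """Zero-pad a smaller board plane to max_size x max_size and return
    the padded plane plus a mask that is 1.0 on real board points."""
    h, w = len(board), len(board[0])
    padded = [[0] * max_size for _ in range(max_size)]
    mask = [[0.0] * max_size for _ in range(max_size)]
    for r in range(h):
        for c in range(w):
            padded[r][c] = board[r][c]
            mask[r][c] = 1.0
    return padded, mask
```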