Releases: lightvector/KataGo

Bigger board sizes

25 Aug 23:07
Pre-release

This is an irregular side-release, just for fun, with precompiled executables supporting board sizes up to 29x29. However, these executables will use more RAM and may be a little slower even when used to play 19x19 or smaller boards, so for best performance one should still prefer the normal release.

The actual latest release (mostly bugfixes) is here: https://github.com/lightvector/KataGo/releases/tag/v1.6.1
And the latest major release with many release notes is here: https://github.com/lightvector/KataGo/releases/tag/v1.6.0
And the latest and strongest neural nets are still those from here: https://github.com/lightvector/KataGo/releases/tag/v1.4.5

Eigen (CPU) version and Other Improvements

23 Aug 19:42

If you're a new user, don't forget to check out this section for getting started and basic usage!
The latest and strongest neural nets are still those from the former release: https://github.com/lightvector/KataGo/releases/tag/v1.4.5

KataGo has now improved its Eigen implementation, making for a reasonably well-optimized pure-CPU version! It will of course still be much slower than with a good GPU, but particularly with smaller nets (20 or 15 blocks) it should often get 5 to 20 playouts per second. All of the versions below are available as pre-compiled executables in this release.

Versions available

  • OpenCL - Use this if you have a modern GPU.
    This continues to be the general GPU version of KataGo and should work on a variety of GPUs, although GPUs more than a few years old may not work, and AMD and minor vendors often have driver issues in their OpenCL implementations.

  • CUDA - Test this if you have a top-end NVIDIA GPU, are willing to do some more technical setup work, and care about getting every last bit of performance.
    Requires an NVIDIA GPU and requires installing CUDA 10.2 (not CUDA 11 yet) and CUDNN from NVIDIA. For most users there is little reason to use this version; often the OpenCL version will be faster even on NVIDIA's own GPUs! The CUDA version may be a little faster on some very top-end GPUs that have FP16 tensor cores, but even then not always, so you should benchmark to see the difference in practice on your specific hardware (see the sketch after this list).

  • Eigen AVX2 - Use this if you don't have a GPU or your GPU is too old to work, but you have an Intel or AMD CPU from the last several years.
    This is a pure-CPU version of KataGo compiled to use AVX2 and FMA operations, which will roughly double its speed compared to not using them. However, it will completely fail to run on older or weaker CPUs that don't support these instructions.

  • Eigen - Use this if you don't have a GPU or your GPU is too old to work, and your CPU turns out not to support AVX2 or FMA.
    This is the pure CPU version of KataGo, with no special instructions, which should hopefully run just about anywhere.
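
Since relative speed varies a lot by hardware, the simplest way to choose among these versions is to run KataGo's built-in benchmark command with each candidate and compare the numbers it reports. Below is a minimal sketch of doing that from Python; the executable, neural net, and config paths are placeholders for whatever you actually downloaded.

```python
import subprocess

# Hypothetical paths -- substitute your own executables, neural net, and config.
EXECUTABLES = ["./katago-opencl/katago", "./katago-cuda/katago"]
MODEL = "g170e-b20c256x2-s5303129600-d1228401921.bin.gz"
CONFIG = "default_gtp.cfg"

# Run KataGo's built-in benchmark for each executable and let it print its own
# report; compare the visits-per-second numbers it reports on your hardware.
for exe in EXECUTABLES:
    print(f"=== Benchmarking {exe} ===")
    subprocess.run([exe, "benchmark", "-model", MODEL, "-config", CONFIG], check=True)
```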

Major New Stuff This Release:

Performance

  • Massive optimizations for the Eigen implementation thanks to kaorahi, making it now usable.
  • Reduced OpenCL code overhead, which may make it able to run on a small number of older GPUs where it couldn't before.
  • Worked around major OpenCL issue with NVIDIA GPUs that prevented it from using more than one GPU effectively. Now it should scale on a multi-GPU machine, whereas previously it didn't at all.

For Analysis Tool Devs

  • Implemented allow and avoid options for both the json analysis engine and for GTP lz-analyze and kata-analyze; their precise semantics are documented at these links. These options allow restricting the search to only specific moves or specific regions of the board. I'm not entirely sure whether they match Leela Zero's semantics, since I could not find any precise specification for them beyond the raw source code and scattered descriptions in GitHub issues.

  • Added a pvVisits option for both the json analysis engine and for GTP lz-analyze and kata-analyze. This option causes KataGo to also report the number of visits for every move in each reported principal variation. These values might be useful for estimating, or for informing users about, the reliability of the moves as you get deeper into a variation (see the sketch after this list).

  • Improved some of the logging options available for the analysis engine. The new options are in https://github.com/lightvector/KataGo/blob/master/cpp/configs/analysis_example.cfg, commented out with their default values.
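
As a concrete illustration of the allow/avoid and pvVisits items above, here is a minimal sketch of sending one query to the json analysis engine from Python. The executable, model, and config paths are placeholders, and the exact field names used below (allowMoves, untilDepth, includePVVisits) are my reading of the linked documentation, so double-check them there.

```python
import json
import subprocess

# Placeholder paths -- substitute your own executable, neural net, and config.
proc = subprocess.Popen(
    ["./katago", "analysis", "-model", "model.bin.gz", "-config", "analysis_example.cfg"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

# The analysis engine takes one JSON query per line on stdin and returns one
# JSON response per line on stdout.
query = {
    "id": "example",
    "moves": [["B", "D4"], ["W", "Q16"]],
    "rules": "tromp-taylor",
    "komi": 7.5,
    "boardXSize": 19,
    "boardYSize": 19,
    "analyzeTurns": [2],
    "maxVisits": 200,
    # Restrict the search for the player to move to just these two moves
    # (field names assumed from the docs; verify against them).
    "allowMoves": [{"player": "B", "moves": ["C3", "C17"], "untilDepth": 1}],
    # Also report visit counts for every move along each principal variation.
    "includePVVisits": True,
}
proc.stdin.write(json.dumps(query) + "\n")
proc.stdin.flush()
print(proc.stdout.readline().strip())  # one JSON response per line
```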

Other

  • Fixed some interface-related bugs and made a variety of changes in the build config to produce more friendly messages and hints in CMakeGUI.

OpenCL FP16 Tensor Core Support

02 Aug 21:02

If you're a new user, don't forget to check out this section for getting started and basic usage!
The latest and strongest neural nets are still those from the former release: https://github.com/lightvector/KataGo/releases/tag/v1.4.5

Changes in this release:

OpenCL FP16 Tensor Cores

New in this release is support for FP16 tensor core GPUs in OpenCL, roughly doubling performance. Theoretically, non-tensor core GPUs that gain significant improvements via FP16 storage or compute may also see a benefit under this release. If you are upgrading from an earlier version of KataGo, the OpenCL tuner will need to re-run to re-tune itself.

The OpenCL FP16 implementation is still a little slower than the CUDA implementation on an FP16 tensor core GPU, so if you've gone through the hassle of installing CUDA and getting it to work on such a GPU, there is no reason to switch to OpenCL. But for users who can get OpenCL working but not CUDA+CUDNN, the gap should now be much smaller than before. Further optimization may be possible in the future; any GPU code experts are of course welcome to comment. :)

Other user-facing changes

  • New GTP extension command: set_position, which allows a GTP controller to directly set an arbitrary position on the board, rather than hacking it via a series of "play" commands that might accidentally communicate an absurd move history. See the documentation for KataGo GTP extensions here as usual (a usage sketch follows this list).
  • By default, if absolutely no limits or time settings are specified for KataGo, and the GUI or tournament controller running it does not specify a time control either, KataGo will choose a small default of several seconds rather than treating time as unbounded.
  • Added a minor bit of logic for handling mirror Go. Nothing particularly robust or special, won't solve extreme cases, but hopefully fun.
  • Minor adjustments for detecting handicap stones for the purpose of computing PDA and/or when to resign.
  • Benchmark auto-tuning for the number of threads is a little more efficient.
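
For the set_position extension mentioned above, here is a rough sketch of driving it over GTP from Python. The paths are placeholders, and the argument format shown (alternating color/vertex pairs) is my reading of the GTP extensions documentation, so verify it there.

```python
import subprocess

# Placeholder paths -- substitute your own executable, neural net, and config.
proc = subprocess.Popen(
    ["./katago", "gtp", "-model", "model.bin.gz", "-config", "default_gtp.cfg"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

def gtp(command):
    """Send one GTP command and collect the response, which ends with a blank line."""
    proc.stdin.write(command + "\n")
    proc.stdin.flush()
    lines = []
    while True:
        line = proc.stdout.readline()
        if line.strip() == "":
            break
        lines.append(line.rstrip())
    return "\n".join(lines)

print(gtp("boardsize 19"))
# Set a whole-board position directly, rather than replaying a move history;
# alternating color/vertex pairs are assumed here -- check the extensions doc.
print(gtp("set_position B D4 W Q16 B C16"))
print(gtp("genmove W"))
```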

Self-play

  • Hash-like game ID is now written to selfplay-generated SGFs.
  • Fixed a very rare bug in self-play game forking and initialization that could cause incorrect resolution of move legality, as well as apparent neural net hash collisions, upon the transition to the cleanup phase under Japanese-like territory scoring rules.

Internal

  • Symmetries are now computed on the CPU rather than the GPU, simplifying GPU code a little.
  • A few internal performance optimizations and cleanups, partly thanks to some contributors.

Pure CPU implementation

Also as of this release, there is a pure-CPU implementation, which can be compiled by passing -DUSE_BACKEND=EIGEN to cmake. There are no precompiled executables for it right now because the implementation is very basic and the performance is extremely poor - even worse than one would expect from a CPU. So practically speaking, it's not ready for use. However, it's a start, hopefully, and contributors who want to help optimize it are welcome. :)

Final June 2020 Neural Nets, Minor Bugfixes

21 Jun 17:08

If you're a new user, don't forget to check out this section for getting started and basic usage!

KataGo's third major run is complete! Almost certainly we could keep going further and continue improving with no end in sight, but for now due to the cost of continuing the run, this seems like a good point to stop. In the future, there is a chance that KataGo will launch a crowdsourced community-distributed run to continue further with more research and improvement. But regardless, I hope you enjoy the run so far and these final networks.

This is both a release of the final networks, as well as an update to KataGo's code for a few minor bugfixes.

New/Final Neural Networks (for now!)

These are the final neural networks for the June 2020 ("g170") run, obtained after training for close to 2 weeks at reduced learning rates. This resulted in a huge strength boost, somewhere from 200 to 250 Elo for both the 30 and 40 block networks, and around 100 Elo for the 20 block network.

These gains were measured by play within a pool of older KataGo networks - it's unknown what proportion of them transfers to opponents other than KataGo itself, since gains due to learning rate drops (presumably just reducing noise and improving overall accuracy) might be qualitatively different from gains that come from learning new shapes and moves over time. But hopefully much of it does.

  • g170-b30c320x2-s4824661760-d1229536699 ("g170 30 block d1229M") - Final 30 block network!
  • g170-b40c256x2-s5095420928-d1229425124 ("g170 40 block d1229M") - Final 40 block network!
  • g170e-b20c256x2-s5303129600-d1228401921 ("g170e 20 block d1228M") - Final 20 block network!

Additionally, posted here is an extremely fat and heavy neural net, 40 blocks with 384 channels instead of 256 channels, which has never been tested (scroll to the bottom, download and unzip the file to find the .bin.gz file).

It is probably quite slow to run and likely weaker given equal compute time. But it would be very interesting to try and see how its per-playout strength compares, as well as its one-playout strength (pure raw policy) in case anyone wants to test it out!

Which Network Should I Use?

  • For weaker or mid-range GPUs, try the final 20-block network.
  • For top-tier GPUs and/or for the highest-quality analysis, if you're going to use many thousands of playouts and long thinking times, try the final 40-block network, which is more costly to run but should be the strongest and best.
  • If you care a lot about theoretical purity - no outside data, bot learns strictly on its own - use the 20 or 40 block nets from this release, which are pure in this way and still much stronger than Leela Zero, but also not quite as strong as these final nets here.
  • If you want some nets that are much faster to run, and each with their own interesting style of play due to their unique stages of learning, try any of the "b10c128" or "b15c192" Extended Training Nets here which are 10 block and 15 block networks from earlier in the run that are much weaker but still pro-level-and-beyond.
  • And if you want to see how a super ultra large/slow network performs that nobody has tested until now, try the fat 40-block 384 channel network mentioned a little up above.

Bugfixes this Release

  • Fixed a bug in analysis_example.cfg where nnMaxBatchSize was duplicated, and added a safeguard in KataGo to fail if fed any config with duplicate parameters in the future, instead of silently using one of them and ignoring the other.
    • If you have a config with a buggy duplicate parameter, you may find KataGo failing when switching to this release - please just remove the duplicate, and if the two values for that parameter were inconsistent/conflicting, set it to whichever value it should be.
  • Split up one of the OpenCL kernels into a few pieces to make compiling it faster, and also made a minor tweak, so that on most systems the OpenCL tuner will take a little less time.
  • katago match will now size the neural net according to the largest board size involved in the match by default, instead of always 19. This should make it faster to run test games on small boards.

Various bugfixes, CUDA 10.2

14 Jun 08:01

If you're a new user, don't forget to check out this section for getting started and basic usage!

This is a release that fixes a variety of issues and upgrades some things. See also the releases page for later releases with the latest neural nets.

This was originally released as version 1.4.3, except with a major bug for OpenCL due to a typo in the logic implementing the new multi-device handling. This version 1.4.4 should hopefully have that bug fixed.

Enjoy!

Changes

Minor UI changes and features

  • For GTP and other commands, you can now specify an empty string for logFile to disable logging. As with almost all other config parameters, it may also be specified on the command line with -override-config.
  • For GTP and other commands, you can now specify homeDataDir=<DIR> to override the directory where KataGo will cache some data (currently, the OpenCL tuner files on the OpenCL version). As with almost all other config parameters, it may also be specified on the command line with -override-config (see the sketch after this list).
  • The benchmark command now runs in -tune mode by default.
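
As a small illustration of the overrides mentioned above, here is a sketch of launching GTP from Python with one parameter overridden on the command line. The paths and directory are placeholders, and the exact override syntax is best double-checked against the docs.

```python
import subprocess

# Placeholder paths -- substitute your own executable, neural net, and config.
# -override-config changes a config parameter without editing the file; here it
# relocates KataGo's cached data (e.g. the OpenCL tuner files). Setting logFile to
# an empty string the same way would disable logging, per the notes above.
proc = subprocess.Popen(
    ["./katago", "gtp", "-model", "model.bin.gz", "-config", "default_gtp.cfg",
     "-override-config", "homeDataDir=/tmp/katago-cache"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)
proc.stdin.write("version\nquit\n")
proc.stdin.flush()
print(proc.stdout.read().strip())
```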

Precompiled executables

  • The precompiled CUDA executables are now compiled against CUDA 10.2 instead of 10.1.
  • The precompiled linux executables are compiled against libzip5 instead of libzip4.

Bugfixes

  • Fixed a problem that was preventing KataGo from running on multiple OpenCL GPUs or other devices if those devices were of two distinct vendors/platforms at the same time.
  • Fixed a bug where the cputime GTP command cleared on every game rather than accumulating time persistently.
  • Fixed a bug in analysis engine where reportAnalysisWinratesAs was ignored in many cases.
  • Fixed a bug where in some ways of piping input to KataGo, closing the pipe or closing stdin would be incorrectly handled and/or duplicate the final command.
  • Fixed a bug in selfplay SGF starting position processing where board size in the SGF was mistakenly ignored.
  • Fixed a typo in selfplay scripts that caused bad/confusing behavior when an incorrect number of arguments is provided.

Other cleanups

  • Various code cleanups
  • Clarified that models included directly in the repo source itself are tiny testing neural nets. See some of the releases for actual neural nets, or here for all the latest nets.

Various bugfixes, CUDA 10.2 (edit: buggy)

14 Jun 02:12

Edit (2020-06-14): The initial posting of this release was broken for OpenCL due to a typo in the logic implementing the new multi-device handling. New executables with what is hopefully a quick fix were reuploaded, and this was rereleased as version 1.4.4.

Experimental Neural Nets

06 Jun 02:46
Pre-release

This is not intended to be a release of the main KataGo program, but rather just an update to the neural nets. The latest release of the code and executables can be found here.

New Nets

Uploaded here are some new experimentally-trained neural nets! These neural nets have been trained using some amount of human and other external data (varying from 5% to 10%) as initial starting positions or "hint" positions for self-play games (in which a potential unexpected good move is guaranteed to be noised and searched more deeply). So all data is still generated by self-play rather than taken from outside, but the positions (and rare "hints") in such games may come from well outside the normal distribution of positions that the net would see if self-playing entire games from the empty board.

Additionally, they have been trained to hopefully understand Mi Yuting's flying dagger joseki significantly better - although understanding may still of course be imperfect due to the immense complexity of the joseki.

These nets are not necessarily stronger than the nets bundled with the v1.4.0 release, which were the final and strongest non-external-data-biased nets.

As measured by pure self-play, the new nets may even be slightly weaker in some cases than the previous nets, perhaps due to having to "spend effort" to learn more kinds of shapes that didn't often come up in matches only against itself. However, one hopes that they on average may handle some new kinds of positions better and/or generalize against other opponents better. But that remains to be tested!

Many more nets than just the three attached here on GitHub have been uploaded.
They can be found at the usual KataGo g170 download site. These are intermediate versions sampled from between the v1.4.0-bundled final nets and the ones attached here, and they are described in the readme, in case you are interested in testing across intermediate versions and seeing how the introduction of successive kinds of external data has progressively affected the policy and evaluation of specific positions.

Enjoy!

Major PDA/pondering bugfixes, fixed pondering limits

12 May 23:16

If you're upgrading from a version before v1.4.0, please see the v1.4.0 releases page for a variety of notes about the changes since v1.3.x that are new that you might care about, as well as the latest and strongest neural nets!

If you're a new user, don't forget to check out this section for getting started and basic usage!

This is a quick release to fix a few bugs, including one pretty major bug for certain configurations.

Changes

  • Fixed a bug introduced earlier in v1.4.x where in handicap games, or in even games where playoutDoublingAdvantage was set to a nonzero value, if pondering was also enabled at the same time, the search could sometimes contain values with different signs, greatly damaging the quality and strength of the search.

  • Fixed a bug introduced earlier in v1.4.x where analysis with nonzero playoutDoublingAdvantage would use the wrong sign.

  • For maxTimePondering, maxPlayoutsPondering, maxVisitsPondering, KataGo will now assume no limit (unbounded) if these values are not specified, instead of defaulting to using maxTime, maxPlayouts, maxVisits, respectively.

Minor bugfixes, logging to directory

10 May 20:27

This release is outdated, see releases page for more recent versions with further bugfixes. But also, if you're upgrading from a version before v1.4.0, please see the v1.4.0 releases page for a variety of notes about the changes since v1.3.x that are new that you might care about, as well as the latest and strongest neural nets!

This is a quick release to fix a minor bug and slightly improve log management.

Changes

  • Fixed a bug where if playoutDoublingAdvantage was manually specified in the config as a nonzero value at the same time pondering was enabled, then tree reuse would not occur, or be ineffective.
  • The GTP and analysis engines can now log dated files to a directory instead of just a single file, via logDir=DIRECTORY instead of logFile=FILE.log. The default and example configs provided with this release have been updated to do so.

New Neural Nets, Optional Wider Analysis, Bugfixes

09 May 21:53

This release is outdated, see releases page for more recent versions with important bugfixes. But also, if you're upgrading from a version before v1.4.0, see below for a variety of notes about the changes since 1.3.x! Also, as of early May 2020 the latest and strongest neural nets are still the ones here.

If you're a new user, don't forget to check out this section for getting started and basic usage!

This time, we have both new nets and new code!

New Neural Nets!

These are the new strongest neural nets of each size so far. Interestingly, the 40 block net seems to have pulled well ahead in how much it improved this time (about 70 Elo), while the 30 block net did not make nearly the same improvement (about 25 Elo). Perhaps the 40 block net got lucky in its gradient descent and stumbled into learning something useful that the other net didn't. The 20 block net gained maybe around 15 Elo. All of these differences have an uncertainty window of +/- 15 Elo or so (95% confidence), and of course they are based on a large varied pool of internal KataGo nets, so might vary a bit against very different opponents.

The strongest net to use for weak to midrange hardware is likely going to be the 20 block net. With the large gain of the 40 block net though, the 40 block net might be stronger for strong hardware and/or long time controls.

KataGo's main run is close to wrapping up so these will likely be the last "semi-zero" neural nets released, that is, nets trained purely with no outside data. A few more nets will be released after these as KataGo finishes the end of this run with some experimentation with ways of using outside data.

  • g170-b30c320x2-s3530176512-d968463914 - The latest and final semi-zero 30-block net.
  • g170-b40c256x2-s3708042240-d967973220 - The latest and final semi-zero 40-block net.
  • g170e-b20c256x2-s4384473088-d968438914 - The latest and final semi-zero 20-block net (continuing extended training on games from the bigger nets).

New Feature and Changes this Release:

  • New experimental config option to help analysis: analysisWideRootNoise

    • Set to a small value like 0.04 to make KataGo broaden its search at the root during analysis (such as in Sabaki or Lizzie) and evaluate more moves, to make it easier to see KataGo's initial impressions of more moves, although at the cost of needing some more time before the top moves get searched as deeply.
    • Or set to a large value like 1 to make KataGo search and evaluate almost every move on the board a bunch.
    • You can also change this value at runtime in the GTP console via kata-set-param analysisWideRootNoise VALUE (see the sketch after this list).
    • Only affects analysis, does NOT affect play (e.g. genmove).
  • KataGo will now tolerate model files that have been renamed to just ".gz" rather than one of ".bin.gz" or ".txt.gz".

  • Implemented cputime and gomill-cpu_time GTP commands, documented here, which should enable some automated match/tournament scripts to now compare and report the time taken by the bot if you are running KataGo in tests against other bots or other versions of itself.

  • EDIT (2020-05-12) (accidentally omitted in the initial release notes): Reworked the way KataGo configures playoutDoublingAdvantage and dynamicPlayoutDoublingAdvantage, and slightly improved how it computes the initial lead in handicap games.

    • You can now simply comment out all playoutDoublingAdvantage-related values and KataGo will choose a sensible default, which is to play evenly in even games, and to play aggressively when giving handicap stones, and safely when receiving handicap stones.
    • The default config has been updated accordingly, and you can also read the new config to see how to configure these values going forward if you prefer a non-default behavior.
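
To make the wide-root-noise option above concrete, here is a small sketch of driving it through the GTP console from Python. The paths are placeholders, and note that lz-analyze streams output until another command arrives, so the sketch only peeks at a few lines.

```python
import subprocess

# Placeholder paths -- substitute your own executable, neural net, and config.
proc = subprocess.Popen(
    ["./katago", "gtp", "-model", "model.bin.gz", "-config", "default_gtp.cfg"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

# Broaden the root search during analysis at runtime, then start analyzing.
# (analysisWideRootNoise could equally be set in the config file or passed via
# -override-config on the command line.)
proc.stdin.write("kata-set-param analysisWideRootNoise 0.04\n")
proc.stdin.write("lz-analyze B 50\n")  # report analysis every 50 centiseconds
proc.stdin.flush()

# lz-analyze streams info lines until the next command; just peek at a few.
for _ in range(5):
    print(proc.stdout.readline().rstrip())
```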

Bugfixes

  • Added workaround logic to correctly handle rules-based score adjustment in handicap games (e.g. +1 point per handicap stone in Chinese rules) when handicap is placed in a non-GTP-compliant way, via consecutive black moves and white passes. This behavior can still be disabled via assumeMultipleStartingBlackMovesAreHandicap = false.

  • Fixed bug where adjusting the system clock or time zone might interfere with the amount of time KataGo searches for, on some systems.

For Devs

  • Added a script to better support synchronous training, documented here.

  • Added various new options and flags for the JSON analysis engine, including root info, raw policy, and the ability to override almost any search-related config parameter at runtime. Analysis engine now also defaults to finishing all tasks before quitting when stdin is closed instead of dropping them, although a command line flag can override this.

  • Reorganized the selfplay-related configs into a subdirectory within cpp/configs, along with some internal changes and cleanups to selfplay config parameters and logic. The example configs have been updated, you can diff them to see the relevant changes.

  • num_games_total, which used to behave buggily and unreliably, has now been entirely removed from the selfplay config file and replaced with a command line argument, -max-games-total, so as to be much more easily changeable by a script.

Enjoy!