
Releases: lightvector/KataGo

New Neural Net Architecture!

08 Jan 04:43

This is not the latest release; see the newer release v1.12.4!

For TensorRT users, this release contains a bug: the new net may compute incorrect values. Additionally, the OpenCL version may tune poorly and experience some errors or performance issues. Upgrading to the newest release is recommended.

If you're a new user, don't forget to check out this section for getting started and basic usage! If you don't know which version to choose (OpenCL, CUDA, TensorRT, Eigen, Eigen AVX2), read this: https://github.com/lightvector/KataGo#opencl-vs-cuda-vs-tensorrt-vs-eigen

Also, KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there!

As before, attached here are "bs29" versions of KataGo. These are just for fun, and don't support distributed training but DO support board sizes up to 29x29. They may also be slower and will use much more memory, even when only playing on 19x19, so you should use them only when you really want to try large boards.

The Linux executables were compiled on an old Ubuntu 18.04 machine. As with older releases, they might not work on every system, and it may be more reliable to build KataGo from source yourself, which fortunately is usually not so hard on Linux (https://github.com/lightvector/KataGo/blob/master/Compiling.md).

New Neural Net Architecture

This version of KataGo adds support for a new and improved neural net architecture!

The new neural nets use a new nested residual bottleneck structure, along with other major improvements in training. They train faster than KataGo's old nets and learn more effectively. Attached to this release is a one-off net b18c384nbt-uec.bin.gz that was trained for a tournament in 2022. It should be similar in strength to the 60-block nets on http://katagotraining.org/ but will run much faster on many machines: on some, between 40-block and 60-block speed; on others, as fast as or faster than a 40-block net.

The training code has been entirely rewritten to use PyTorch instead of TensorFlow. The training scripts and self-play scripts should already be updated to account for the new implementation, but feel free to open an issue if something was overlooked.

Many thanks to "lionfenfen", "YiXiaoTian", and "Medwin" for contributing ideas, discussion, and testing to improve the training, and to "inbae" for the initial work that catalyzed the new PyTorch implementation. Many thanks also to those on the Discord server who helped with testing.

Once enough contributors have switched to this release, the new architecture will also be integrated into KataGo's main public run, where hopefully it can drive future improvement. If you are a contributor to http://katagotraining.org/, please upgrade if you can. Thanks again to everyone!

Other Notable Changes

Analysis Engine (doc)

  • Added "terminate_all" command. #727
  • Analysis engine now echoes errors and warnings for bad queries to stderr by default, which can optionally be disabled. #728
  • A few additional values are now reported, including "weight", and two new parameters, rawStWrError and rawStScoreError, that measure the neural net's estimation of its own uncertainty about a position (see the sketch after this list).
  • Fixed minor weighting oddity in calculation of pruned root values.
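
If you drive the analysis engine from another program, the new values appear in the JSON responses. Below is a minimal Python sketch, assuming katago is on your PATH and using placeholder model/config paths; the exact shape of the terminate_all query and where the uncertainty fields appear should be checked against the analysis engine doc:

```python
import json
import subprocess

# Placeholder paths; point these at your own model and analysis config.
proc = subprocess.Popen(
    ["katago", "analysis", "-model", "model.bin.gz", "-config", "analysis.cfg"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

query = {
    "id": "q1",
    "moves": [["B", "Q16"], ["W", "D4"]],
    "rules": "tromp-taylor",
    "komi": 7.5,
    "boardXSize": 19,
    "boardYSize": 19,
    "analyzeTurns": [2],
}
proc.stdin.write(json.dumps(query) + "\n")
proc.stdin.flush()

response = json.loads(proc.stdout.readline())
root = response["rootInfo"]
# New in this release: the net's estimate of its own uncertainty
# (.get in case these fields live elsewhere in the response).
print(root.get("weight"), root.get("rawStWrError"), root.get("rawStScoreError"))

# Also new: cancel everything still queued or running.
proc.stdin.write(json.dumps({"id": "t1", "action": "terminate_all"}) + "\n")
proc.stdin.flush()
```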

Self-play training and Contribution

  • Self-play data generation and contribution to the public run now calculate and record slightly less-noisy values for auxiliary value training.
  • Slightly better automated error checking was added for contributing to the public run.
  • Added some parameters to support more flexible komi initialization in selfplay. These will likely also be used for KataGo's public run soon after enough people upgrade.
  • Fixed bug in turn number computation for SGFs that specify board positions when generating sgfposes/hintposes for self-play training.

Algorithm and performance improvements

  • Improved LCB implementation to be smoother and to handle lower visit counts, giving a mild strength improvement at low visits (a generic sketch of the idea follows after this list).
  • Fixed possible bug with extreme komi inputs to the neural net.
  • Improved OpenCL performance tuning logic.
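
As a generic illustration of the LCB idea mentioned above (not KataGo's exact formula or constants): move selection discounts value averages that rest on few visits, so a noisy high average does not immediately win.

```python
import math

# Generic lower-confidence-bound move selection sketch. KataGo's actual
# implementation differs; this only shows why low-visit averages get discounted.
def lcb_score(mean_value, value_stdev, visits, z=1.645):
    if visits <= 0:
        return float("-inf")
    # Fewer visits -> wider confidence interval -> the lower bound drops more.
    return mean_value - z * value_stdev / math.sqrt(visits)

# move -> (mean value, stdev of value samples, visits)
moves = {"Q16": (0.52, 0.10, 800), "R4": (0.55, 0.12, 10)}
best = max(moves, key=lambda m: lcb_score(*moves[m]))
print(best)  # Q16: R4's higher average is not trusted at only 10 visits
```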

Other

  • Cleaned up and clarified gtp_example.cfg. #714
  • Fixed several more bugs in the handling of non-ascii file paths.
  • Cleaned up and added much greater flexibility to KataGo's config system and logging. #630
  • Fixed bug where KataGo would not handle some SGF placements correctly if those placements edited and replaced many different stones on the board in the middle of a game, in cases where all the edits together were legal but subsets of them might not be.

Graph Search and Other Improvements

20 Mar 20:42

If you're a new user, don't forget to check out this section for getting started and basic usage! If you don't know which version to choose (OpenCL, CUDA, TensorRT, Eigen, Eigen AVX2), read this: https://github.com/lightvector/KataGo#opencl-vs-cuda-vs-tensorrt-vs-eigen

Also, KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there!

As before, attached here are "bs29" versions of KataGo. These are just for fun, and don't support distributed training but DO support board sizes up to 29x29. They may also be slower and will use much more memory, even when only playing on 19x19, so you should use them only when you really want to try large boards.

Changes This Release

Search Improvements

Graph Search

KataGo has a new, stronger MCTS implementation that operates on a graph rather than a tree! Different move sequences that lead to the same position are recombined and searched only once instead of separately, with careful handling of ko and superko situations to ensure provable partial guarantees on the correctness of the recombining. The amount of improvement given by graph search appears to be highly variable, depending heavily on the particular hardware, number of threads, thinking time per move or number of playouts, and balance of CPU/GPU speed, ranging from no improvement at all on some configurations to nearly 100 Elo on others.
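
As a rough conceptual sketch of the difference (illustration only; KataGo's real implementation is in C++ and includes the ko/superko handling described above):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)  # move -> child position hash

# Transposition table: position hash -> search node.
table: dict = {}

def get_node(pos_hash):
    # Two different move sequences that reach the same position get the SAME
    # node, so their search effort is pooled instead of duplicated. In a plain
    # tree search, each sequence would build and evaluate its own separate copy.
    if pos_hash not in table:
        table[pos_hash] = Node()
    return table[pos_hash]
```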

Better Parameters

A few of KataGo's default search parameters have been slightly tuned and improved. Additionally, KataGo implements a new FPU (first-play-urgency) method based on the heuristic: "if existing moves have turned out worse than expected, prefer exploring entirely new moves more; if existing moves have turned out better than expected, prefer verifying those moves and explore entirely new moves less" (a toy sketch follows below). All of these changes together might be worth somewhere from 15 to 40 Elo. Thanks to sbbdms, fuhaoda, Wing, and many others for helping test these parameters.
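
As a toy sketch of that heuristic (my own illustration; KataGo's actual FPU formula and constants differ):

```python
def first_play_urgency(expected_value, observed_value, base_reduction=0.2, scale=0.5):
    # surprise < 0: searched moves did worse than expected, so raise the
    # starting value assigned to unexplored moves (explore new moves more).
    # surprise > 0: searched moves did better than expected, so lower it
    # (spend effort verifying the known-good moves instead).
    surprise = observed_value - expected_value
    return observed_value - base_reduction - scale * surprise
```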

Interface and User Improvements

  • The katago contribute command for contributing to the public distributed run at https://katagotraining.org/ now supports pausing and resuming!

    • While KataGo is running, if you type pause and hit enter, KataGo will stop its CPU and GPU usage (this may take a few seconds to a minute) but will continue to remember its current state (note: KataGo will keep using RAM and GPU memory to do so; pausing only halts computation).
    • Use resume to resume computation.
    • You can also give the commands quit and forcequit, which correspond to pressing Ctrl-C once or twice: KataGo will exit after finishing its current contribution games, or exit as soon as possible, discarding unfinished games. (A scripted example of these commands follows after this list.)
  • On Windows, KataGo should now handle non-ascii file paths and directories. Hopefully for real this time.
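
If you run contribute under another program rather than typing at a terminal, the same commands can be sent over stdin. A minimal sketch, with placeholder paths:

```python
import subprocess

# Placeholder config path; use your own contribute settings.
proc = subprocess.Popen(
    ["katago", "contribute", "-config", "contribute.cfg"],
    stdin=subprocess.PIPE, text=True)

def send(command):
    proc.stdin.write(command + "\n")
    proc.stdin.flush()

send("pause")   # halt CPU/GPU work; state stays in RAM and GPU memory
send("resume")  # continue from the remembered state
send("quit")    # finish current games, then exit
```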

Changes for Developers

  • As a result of graph search, the pvVisits array indicating the number of visits at each point of a PV may no longer be monotone, since a position with few visits may transpose to a position with many visits. A new pvEdgeVisits output is now available that distinguishes the count of visits that reach a given position from the count of visits that make a particular move, since a given position may now be reached by more than one move along the graph (see the illustration after this list).
  • The kata-analyze command in GTP now can report the predicted ownership for each individual move (see movesOwnership in the documentation).
  • There is a new GTP extension kata-benchmark NVISITS which will run a simple benchmark from within GTP.
  • Fixed a bug in KataGo book hashing that might theoretically cause incorrect book transpositions, and greatly reduced the disk space requirements for KataGo's book file. Both the bugfix and the reduction apply only to new books generated with the newest version.
  • Added checkbook command to test the integrity of a book file.
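
To make the pvVisits/pvEdgeVisits distinction concrete, here is an invented example of what the two arrays might look like for one move's PV (numbers are illustrative only):

```python
# pvVisits counts visits to each POSITION along the PV; pvEdgeVisits counts
# visits through each MOVE (edge). A position reached via transpositions can
# hold more visits than the edge leading to it, so pvVisits need not decrease.
info = {
    "pv":           ["Q16", "D4", "Q4"],
    "pvVisits":     [900,   120,  450],  # Q4's position gained visits via transpositions
    "pvEdgeVisits": [900,   120,  110],  # only 110 visits actually played D4 -> Q4 here
}
for move, pos_v, edge_v in zip(info["pv"], info["pvVisits"], info["pvEdgeVisits"]):
    print(f"{move}: position visits {pos_v}, edge visits {edge_v}")
```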

Other Improvements and Bugfixes

  • Added a link to a simple/elegant KataGo-based GUI, Ogatak.
  • Added option to contribute to perform rating games only, and documented a few more minor options.
  • GTP printsgf command now records into the SGF an intelligent final score rather than a naive all-stones-are-alive score.
  • Fixed bug where the avoid moves option in GTP and the analysis engine might include avoided moves anyway if the search was performed for the player that would not normally move next.
  • Fixed bug in the computation that considered whether to suppress a pass in some cases for rules compatibility.
  • Significantly optimized ownership calculation speed for long/deep searches.
  • Selfplay now records a few additional stats about policy and search surprise and entropy into the .npz files.
  • Python neural net model training now tracks export cycle across restarts.
  • Major internal refactors and cleanups of KataGo's search code.
  • Various other documentation improvements.

TensorRT Backend, Many Minor Improvements

24 Oct 19:55

If you're a new user, don't forget to check out this section for getting started and basic usage! If you don't know which version to choose (OpenCL, CUDA, TensorRT, Eigen, Eigen AVX2), read this: https://github.com/lightvector/KataGo#opencl-vs-cuda-vs-tensorrt-vs-eigen

Also, KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there!

As before, attached here are "bs29" versions of KataGo. These are just for fun, and don't support distributed training but DO support board sizes up to 29x29. They may also be slower and will use much more memory, even when only playing on 19x19, so you should use them only when you really want to try large boards.

New TensorRT Backend

There is a new TensorRT backend ("trt8.2") in this release, thanks to some excellent work by hyln9! On strong NVIDIA GPUs, this backend can often be 1.5x the speed of any other backend. It is NOT universally faster, however; sometimes the CUDA backend can still be faster than the TensorRT backend. The two backends may also prefer different numbers of threads - try running the benchmark to see. TensorRT also tends to take noticeably longer to start up.

Using TensorRT requires an NVIDIA GPU with CUDA 11.1+, CUDNN 8.2+, and TensorRT 8.2 (the precompiled executables in this release use CUDA 11.1 for Linux and CUDA 11.2 for Windows), which you can download and install manually from NVIDIA: https://developer.nvidia.com/tensorrt, https://developer.nvidia.com/cuda-toolkit, https://developer.nvidia.com/cudnn.

If you want an easier out-of-the-box setup and/or are using other GPUs, then OpenCL is still recommended as the easiest to get working.

Minor Features and Improvements

  • KataGo antimirror logic for GTP is slightly improved.
  • Analysis engine and kata-analyze now support reporting the standard deviation of ownership across the search ("ownershipStdev").
  • Added minor options for random high-temperature policy initialization to katago match command.
  • Very slight cross-backend performance improvement - most configurations by default will now avoid multi-board-size GPU masking code if only one board size is used. (The analysis engine is the one major exception: you must specify requireMaxBoardSize, maxBoardXSizeForNNBuffer, and maxBoardYSizeForNNBuffer in the config, and then must not query for other board sizes. See the sketch after this list.)
  • Added the code used to generate the books at https://katagobooks.org/, runnable by ./katago genbook with example config at https://github.com/lightvector/KataGo/blob/master/cpp/configs/book/genbook7jp.cfg. You can generate your own books if you like, although be prepared to dive into the source code if you want to know exactly what particular parameters do.
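
For the analysis engine exception above, the settings live in the config file. A small sketch that appends them for a fixed 19x19 setup (parameter names from the note above; after this, the engine must not be queried with any other board size):

```python
# Append single-board-size settings to an analysis engine config.
settings = {
    "requireMaxBoardSize": "true",
    "maxBoardXSizeForNNBuffer": "19",
    "maxBoardYSizeForNNBuffer": "19",
}
with open("analysis.cfg", "a") as f:
    for key, value in settings.items():
        f.write(f"{key} = {value}\n")
```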

Bugfixes

  • KataGo should now (hopefully) handle non-ascii file paths on Windows.
  • GTP/Analysis "avoid" option now correctly applies when there is only 1 playout and moves are based on the raw policy.
  • GTP/Analysis "avoid" option now correctly interacts with root symmetry pruning.
  • Fixed various bugs with the GTP command loadsgf.
  • Fixed minor issue reporting analysis values for terminal positions.
  • Fixed issue where during multithreading analysis would report zero-visit moves with weird stats.
  • Fixed a minor possible race if multiple katago distributed training contributes are started at once on the same machine.
  • More reliably tolerate and retry corrupted downloads in the contribute command for online distributed training.
  • Benchmark now respects defaultBoardSize in config.
  • Fixed issue in cmake build setup with mingw in Windows.
  • Fixed issue with swa_model namespace when loading a preexisting model for train.py for model training.

Analysis engine bugfixes

30 Jun 03:26

If you're a new user, don't forget to check out this section for getting started and basic usage! If you don't know which version to choose (OpenCL, CUDA, Eigen, Eigen AVX2), read this: https://github.com/lightvector/KataGo#opencl-vs-cuda-vs-eigen

This is a quick bugfix release. See the notes for the previous release for info about major recent improvements, including significant strength improvements, new features you can specify to configure KataGo's behavior and make it play a wider variety of moves, and various performance enhancements.

Also, KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there!

As before, attached here are "bs29" versions of KataGo. These are just for fun, and don't support distributed training but DO support board sizes up to 29x29. They may also be slower and will use much more memory, even when only playing on 19x19, so you should use them only when you really want to try large boards.

Changes In This Release

  • Fixed bug where analysis engine would crash if a query was terminated before the query began analyzing.
  • Analysis engine will now output ownership to an accuracy of 10^-6 and all other values to an accuracy of 7-8 decimal places past the most significant digit. Hopefully this is more than enough precision for all practical purposes, while noticeably reducing the response message size.
  • Parameter overrides in the analysis engine for entirely unknown parameters will now warn and still perform the query, ignoring that parameter, instead of producing an error.
  • Got rid of a harmless race in the contribute command for KataGo distributed training that could produce slightly more confusing output or error messages.

Better Search, Threads, Analysis Improvements, and More

28 Jun 04:04

If you're a new user, don't forget to check out this section for getting started and basic usage!

KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there!

If you don't know which version to choose (OpenCL, CUDA, Eigen, Eigen AVX2), read this: https://github.com/lightvector/KataGo#opencl-vs-cuda-vs-eigen

Also attached here are "bs29" versions of KataGo. These are just for fun, and don't support distributed training but DO support board sizes up to 29x29. KataGo's neural nets will probably still be very strong on large boards, but as usual they are not trained for these sizes, so there are no complete guarantees. The "bs29" versions may also be slower and will use much more memory, even when only playing on 19x19, so you should use them only when you really want to try large boards.

Major Changes and Improvements and New Features

  • Major improvements in KataGo's search algorithm. KataGo might be somewhere around 75 Elo stronger than v1.8.2 with the same latest neural nets. Half of this improvement might also apply to older/smaller nets, although this is untested. Thanks to @sbbdms and @fuhaoda for extensive help testing these parameters.

  • Major improvements in multithreaded search performance on stronger GPUs, or with multiple GPUs, by implementing close-to-lockless MCTS search. (merging the long-open "highthreads" code branch). In the extreme case, performance on multiple A100s might be more than doubled.

  • New option avoidRepeatedPatternUtility to make KataGo prefer to avoid playing a joseki that it has already played in the same game in a different corner. See config for more details.

  • New set of options where you can specify your own SGF files to encourage KataGo to avoid (or play) various lines, or to avoid repeating itself and play a greater variety of moves across a series of games. See config for details.

  • It's surprisingly hard to find a nice tool for summarizing win/loss results for SGFs and computing Elos given a set of test matches between different bots/configs/players. KataGo now has a small python3 library and script that does it. Run it like python summarize_sgfs.py /path/to/directory/of/sgfs, or with --help for more info.

  • KataGo now leverages symmetry for searching the first few moves of the game if the position is symmetric, and will open as black in the upper right corner by default. Thanks to @fuhaoda for helping implement this.

  • KataGo will now search a much wider variety of moves during analysis by default. (analysisWideRootNoise).

  • For OpenCL users: somewhat improved the reliability of the OpenCL tuning to find good configurations and not pick bad ones as often, on some GPUs. KataGo v1.9.0 by default will continue to use the same tuning as from earlier versions, but if you want to re-run the tuner on v1.9.0, at any time you can run or rerun it like ./katago.exe tuner -model path/to/the/neuralnetfile.bin.gz -config path/to/your_gtp_config.cfg. Thanks to @EZonGH for reporting and testing.

Dev-facing Changes (Analysis Engine)

  • Added new clear_cache command to analysis engine.

  • Analysis engine now also reports the current player to move in the root info.

  • Analysis engine and GTP kata-analyze are now updated to report isSymmetryOf when a move's stats are symmetric copies of another move's stats, for when KataGo's new symmetry handling, rootSymmetryPruning, is enabled. Basically, KataGo only searches one of each set of symmetrically equivalent moves, and the stats for the others are set to be copies of the original (with appropriately rotated PVs). A parsing sketch follows after this list.
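
A small parsing sketch for consumers of kata-analyze output (the info-line layout here is simplified; see the GTP extensions documentation for the full field set):

```python
# Skim kata-analyze "info" blocks and skip symmetric copies.
sample = ("info move Q16 visits 500 winrate 0.4701 order 0 pv Q16 D4 "
          "info move Q4 visits 500 winrate 0.4701 isSymmetryOf Q16 order 1 pv Q4 D16")

def parse_infos(line):
    infos = []
    for chunk in line.split("info ")[1:]:
        tokens = chunk.split()
        fields = {}
        i = 0
        while i < len(tokens) - 1 and tokens[i] != "pv":
            fields[tokens[i]] = tokens[i + 1]
            i += 2
        infos.append(fields)
    return infos

searched = [d["move"] for d in parse_infos(sample) if "isSymmetryOf" not in d]
print(searched)  # ['Q16'] -- Q4's stats are a rotated/mirrored copy, not a new search
```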

Bugfixes and Optimizations

  • Mostly mitigated the problem where if there are too many search threads some of those threads are forced to search poor moves (since all the good moves are already taken by other threads), whose poor values then bias the MCTS average values enough to cause KataGo to miss tactics or play very weird moves.

  • Reduced the lag between moves when using large numbers of playouts (hundreds of thousands or millions) by deallocating the search tree, or processing it for tree reuse, across multiple threads.

  • Fixed several rare memory access and threading bugs, including one that could rarely cause KataGo to crash outright.

  • Improved the quality of rootInfo for the analysis engine (thanks to @sanderland).

  • Some small internal performance optimizations in the board code (thanks to @fuhaoda).

  • Fixed some warnings and CMake issues for compiling with clang (thanks to @TFiFiE).

Distributed Training - Selfplay Diversity Fixes

19 Apr 03:52

If you're a new user, don't forget to check out this section for getting started and basic usage!

KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there!

If you don't know which version to choose (OpenCL, CUDA, Eigen, Eigen AVX2), read this: https://github.com/lightvector/KataGo#opencl-vs-cuda-vs-eigen

What's New This Version

If you are a user who helps with distributed training on https://katagotraining.org/, it would be great if you could update to this version, which is also the new tip of the stable branch, as soon as it is convenient! And let me know in the issues if you encounter any new problems with it. Thanks!

This is a minor release mainly of interest to contributors to KataGo's distributed run, or to users who run KataGo self-play training on their own GPUs. This release doesn't make any changes to the KataGo engine itself, but it does fix an issue that was believed to be limiting the diversity of KataGo's self-play games. Switching to this version should, over the long term of training, improve KataGo's learning, particularly on small boards. It will also enable a few further parameter changes in the future, once most people have upgraded, which should further improve opening diversity.

And still coming, hopefully not too long after this, should be a release with some strength and performance improvements for general users. :)

Changes

  • Separately sample komi for initializing the board versus actually playing the game.
  • Reworked and refactored komi initialization to make komi randomization more consistent; it now also applies to some cases missed before (e.g. cleanup training).
  • Polished up and improved many aspects of the logic for a few commands of interest to devs who run specialized training, such as dataminesgfs.

Distributed Client Bugfixes, Friendly Passing, Engine Bugfixes

14 Mar 20:54

If you're a new user, don't forget to check out this section for getting started and basic usage!

KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there! And development is still active! Coming down the pipe in the future, although not in time for this release, there are some search improvements being worked on, which should provide a major strength boost.

If you don't know which version to choose (OpenCL, CUDA, Eigen, Eigen AVX2), read this: https://github.com/lightvector/KataGo#opencl-vs-cuda-vs-eigen

Distributed Client Improvements (for contributing to the public run at katagotraining.org)

  • Improved handling of errors and network connection issues, including fixing at least two bugs that could cause endless hangs or deadlocks.

  • Distributed client now will attempt to stop gracefully on the first interrupt (e.g. ctrl-c), finishing its current remaining games. A second interrupt will force a stop more immediately.

  • Distributed client now supports https_proxy environment variable for users using a proxy to connect to the internet.

  • The OpenCL version of the client will now tune all necessary model sizes up-front upon first startup, which should be more accurate and less disruptive than tuning them in the middle of contribution when a new model of that size is downloaded. This may take a little while, so please be patient.

Engine Improvements

  • Friendly passing - when KataGo is set to named rules other than tromp-taylor, if the opponent has just passed, it will more often pass in response without attempting to do much cleanup of dead stones, so long as there are no gainful moves left.

    • When NOT specifying rules by name (like japanese or aga), but rather by individual options (koRule, scoringRule,...), the new individual option controlling this behavior is friendlyPassOk = true or friendlyPassOk = false (default).
  • KataGo now supports Fischer time controls, along with new GTP extensions kata-time_settings and kata-list_time_settings documented here with hopefully rigorous-enough semantics to support future time control additions and to be implementable by any other engine that wants to mimic the same spec. (A usage sketch follows after this list.)

  • GTP can now change the number of search threads at runtime, via kata-set-param, documented here.

  • GTP final-score and related commands should also now behave in accordance with friendly passing, reporting the Tromp-Taylor score after two passes in tromp-taylor rules, and the estimated human-friendly score after dead stone removal in other rules.

  • Fixed GTP bug where certain commands relating to scoring or ending the game (e.g. final_status_list) might silently alter settings like playoutDoublingAdvantage for that run of GTP.

  • Various minor improved logging and error messages in multiple top-level commands and utilities.

  • Fixed issue where using a large number of threads could sometimes make GTP final score estimation inaccurate, by capping the number of threads.
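
As a usage sketch for the new time controls (the argument order shown is my reading of the kata-time_settings spec: main time in seconds, then the per-move increment):

```python
import subprocess

# Placeholder paths; assumes a GTP-speaking katago on your PATH.
proc = subprocess.Popen(
    ["katago", "gtp", "-model", "model.bin.gz", "-config", "gtp.cfg"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

def gtp(command):
    proc.stdin.write(command + "\n")
    proc.stdin.flush()
    lines = []
    for line in proc.stdout:
        if line == "\n":  # GTP responses are terminated by a blank line
            break
        lines.append(line.rstrip())
    return lines

print(gtp("kata-list_time_settings"))            # enumerate supported systems
print(gtp("kata-time_settings fischer 300 10"))  # 300s main time, +10s per move
```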

Dev-facing Improvements and Internal Changes

  • The analysis engine now reports two new fields thisHash and symHash for the root info, which are Zobrist hashes of the situation being analyzed that can be used to identify or distinguish positions and their symmetrical partners.

  • Fixed a bug in the vartime auxiliary training target where the loss wasn't being weighted correctly, causing major biases in the neural net's learning of this target.

  • Several internal cleanups and refactors of some dev tools for sampling and mining SGFs and running test positions, added some command-line arguments for filtering SGFs based on some criteria.

  • Added some logic to handle parsing of the oddly-formatted komi and ranks for Fox server SGFs.

  • Updated to a newer version of a json-parsing library to fix an issue where, on specific versions of MSVC, the older json library would cause compile errors.

New Nets, Stronger Search, Distributed Client, Many Bugfixes

14 Jan 05:24

If you're a new user, don't forget to check out this section for getting started and basic usage!

KataGo has started a new distributed run at https://katagotraining.org/, and this release newly supports the latest and strongest neural nets from there! Also if you wish to contribute, the run will be open for full public contributions soon, but for now, you can already try the new nets. For nets from older runs, see https://d3dndmfyhecmj0.cloudfront.net/index.html.

If you don't know which version to choose (OpenCL, CUDA, Eigen, Eigen AVX2), read this: https://github.com/lightvector/KataGo#opencl-vs-cuda-vs-eigen

Major Engine Changes and Fixes

  • Now supports the new neural nets at https://katagotraining.org/, which have an altered format and some new output heads that might be used to improve the search logic in future versions.

  • A lot of internal changes, hopefully including all the critical ones needed to support public contributions when the distributed run opens shortly, as well as many bugfixes and stronger search logic.

  • New subtree value bias correction method has been added to the search, which should be worth somewhere between 20 and 50 Elo for mid-thousands of playouts.

  • Fixed a bug in LCB move selection that prevented LCB from acting on the top-policy move. The fix is worth perhaps around 10 Elo.

  • Time control logic has been greatly overhauled and reimplemented. Most of its features are not enabled by default due to uncertainty about the best parameters; they may be set to reasonable defaults after more testing in the future. (Anyone interested in running tests or collaborating on further logic tweaks would be welcome!)

  • Bugfix to Japanese-like rules that should allow for more accurate handling of double-ko-death situations. New nets will also need to gradually adjust to these rules, which may take some more time with the ongoing new run.

  • Root symmetry sampling now samples without replacement instead of with replacement, and is capped at 8, the total number of possible symmetries, instead of 16.

Minor Engine Changes and Fixes

  • Removed old no-longer-useful search parameter fpuUseParentAverage.

  • Built-in katago match tool's komiAuto feature now uses 100 visits per test instead of 20 by default to find a fair komi.

  • Built-in katago match tool now has some logic to avoid premature resignation, to be consistent with GTP.

  • Fixed a segfault that could happen during config generation in katago genconfig command.

  • Fixed bug where analysis engine could sometimes report the rootInfo with the wrong side's perspective.

  • Fixed bug where priorities outside [-2^31, 2^31-1] would not work properly in the analysis engine.

  • Fixed GTP command kata-raw-nn to also report the policy for passing.

Self-play and Training Changes

  • Neural net model version 10 is now the default version, which adds a few new training targets and rebalances all of the weights of the loss function. Training and loss function statistics may not be directly comparable to those of earlier versions.

  • Going forward, newly-created neural nets with the KataGo python scripts will default to using a 3x3 conv instead of a 5x5 conv for the first layer. This may result in newly-trained neural nets being very slightly weaker and lower-capacity, and very slightly faster, than old nets. This also greatly reduces memory usage on bigger nets with OpenCL. Existing nets will be unaffected (even if v1.8 is used to train them).

  • Fixed bug where hintposes were not adjusted for the initial turn number of the position.

  • Some SGF startposes file handling is improved to allow deeper-branching files to be handled without running out of stack space.

  • Fixed bug where a stale root nn policy might suppress a hintpos from taking effect. Hintposes will also do more full searches instead of cheap searches in the few moves after the hint.

  • Improved logging of debug output from self-play training, improved SGF file comments for selfplay games, various internal cleanups.

  • Training script now has option to lock the ratio of train steps vs data samples.

  • Easier usage of initial weights for training - train script will look for any tensorflow checkpoints and meta files within a directory named "initial_weights" that is a subdirectory of that specific net's training directory.

  • Deleted some of the old unused model code.

  • Just for fun, added some pytorch genboard scripts that train a neural net to generate plausible board positions given some existing stones on that board.

CUDA 11, Analysis Engine Features, Prepare for Distributed

09 Nov 05:14

If you're a new user, don't forget to check out this section for getting started and basic usage!
The latest and strongest neural nets are still those from the former release: https://github.com/lightvector/KataGo/releases/tag/v1.4.5
If you don't know which version to choose (OpenCL, CUDA, Eigen, Eigen AVX2), read this: https://github.com/lightvector/KataGo#opencl-vs-cuda-vs-eigen

This release contains a variety of minor bugfixes and minor feature additions. It also incorporates a large number of internal changes to prepare for and support a distributed training run (yay), although distributed training support has deliberately not been enabled yet for the precompiled executables this release.

General Improvements and Features

  • Supports CUDA 11.1 now, which makes it possible to use the KataGo CUDA backend instead of only OpenCL with NVIDIA RTX 30** GPUs. Beware, though, that on other GPUs CUDA 11.1 might not actually be faster than 10.2 - in one test on a V100 cloud machine, CUDA 11.1 seemed to be slower than CUDA 10.2. Changes to OpenCL speed and to CUDA speed on RTX 30** are also unknown and seem to vary - some users have reported exciting results, some fairly disappointing ones.

  • Added new gtp config option "ignoreGTPAndForceKomi" that will force a particular komi regardless of whether the GTP controller tries to specify a different one. KataGo is also now slightly smarter about guessing the default komi based on other rules when absolutely nothing tells KataGo what it should be.

  • KataGo no longer requires boost libraries in order to be compiled.

  • OpenCL backend optimized to now require less GPU memory.

  • Benchmark command should now be more efficient about choosing search ranges for threads.

Analysis Engine

There are several improvements to the json analysis engine.

  • Can now report the predicted ownership map for each individual move.

  • Can now report results from an ongoing query, making it possible to do the same things you would with kata-analyze or lz-analyze.

  • Can now cancel or terminate queries before they finish.

  • Can now specify differing per-turn priorities in a single query.

  • Supports priorities outside the range +/- 2^31, making it easier to base priorities on timestamps, externally-determined large id numbers, or very, very long-running processes. (An example follows after this list.)
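
For example, a single query analyzing several turns with distinct large priorities (field names are my reading of the analysis engine doc; the values here are millisecond timestamps):

```python
query = {
    "id": "game42",
    "moves": [["B", "Q16"], ["W", "D4"], ["B", "Q4"]],
    "rules": "japanese",
    "komi": 6.5,
    "boardXSize": 19,
    "boardYSize": 19,
    "analyzeTurns": [1, 2, 3],
    # One priority per entry in analyzeTurns; values well beyond +/- 2^31
    # (such as millisecond timestamps) are now accepted.
    "priorities": [1604899200001, 1604899200002, 1604899200003],
}
```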

Bugfixes

  • Fixes a coding error that would make it sometimes impossible for KataGo to select the optimal move near the end of a game with button Go rules. (Button Go is a ruleset that KataGo supports that has the rules-simplicity and elegance of area scoring, but with the sharper and fairer scoring granularity of territory scoring).

  • Fix minor parsing bug on some uses of -override-config

  • Fixed some bugs on how the benchmark command behaved with threads for the Eigen backend.

Other Changes

  • The shuffle script for selfplay training, which long ago dropped support for shuffling training and validation data separately, now also uses a filepath that simply shuffles all data together.

  • A large number of internal refactors and changes have been made to support acting as a client for distributed training. The cmake option BUILD_DISTRIBUTED=1 will make KataGo compile with support for distributed training, although the official distributed run has not quite started yet.

Eigen Memory Bugfixes

25 Aug 22:54

If you're a new user, don't forget to check out this section for getting started and basic usage!
The latest and strongest neural nets are still those from the former release: https://github.com/lightvector/KataGo/releases/tag/v1.4.5

A lot of recent and interesting release notes can still be found at this prior release https://github.com/lightvector/KataGo/releases/tag/v1.6.0, including basic information about the new KataGo CPU backend (eigen vs eigenavx2), and a couple of significant new features for analysis tool developers!

Changes

This is a followup release that fixes some issues with the new Eigen (CPU) implementation in 1.6.0:

  • Fixed two issues that caused Eigen implementation to use massively more memory than it needed, particularly when run with many threads (which could exhaust all RAM on some systems).
  • Better default settings for the number of threads to use in Eigen, which are now overridable by a new separate config parameter if needed (but users should just stick to the defaults anyway).

And some minor other changes:

  • For the analysis engine, the number of positions to search in parallel is now controlled by numAnalysisThreads in the config instead of a command line argument (but the command line argument still works, for backwards compatibility).
  • The analysis engine config now allows specifying numSearchThreadsPerAnalysisThread as an alias for numSearchThreads. This is not new behavior; it is just an alias whose name hopefully conveys the effect of the parameter better.