-
Notifications
You must be signed in to change notification settings - Fork 23
/
Copy pathchangelog.txt
477 lines (341 loc) · 16.4 KB
/
changelog.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
v0.19.1 (2018-12-10)
~~~~~~~
(no changes relative to v0.19.1-rc2)
v0.19.1-rc2 (2018-12-07)
~~~~~~~~~~~
* Temperature and FPU related params. (#568)
* Rework Cpuct related params. (#567)
v0.19.1-rc1 (2018-12-06)
~~~~~~~~~~~
* Updated cpuct formula from alphazero paper. (#563)
* remove UpdateFromUciOptions() from EnsureReady() (#558)
* revert IsSearchActive() and better fix for one of #500 crashes (#555)
v0.19.0 (2018-11-19)
~~~~~~~
* remove Wait() from EngineController::Stop() (#522)
v0.19.0-rc5 (2018-11-17)
~~~~~~~~~~~
* OpenCL: replace thread_local with a resource pool. (#516)
* optional wtime and btime (#515)
* Make convolve1 work with workgroup size of 128 (#514)
* adjust average depth calculation for multivisits (#510)
v0.19.0-rc4 (2018-11-12)
~~~~~~~~~~~
* Microseconds have 6 digits, not 3! (#505)
* use bestmove_is_sent_ for Search::IsSearchActive() (#502)
v0.19.0-rc3 (2018-11-07)
~~~~~~~~~~~
* Fix OpenCL tuner always loading the first saved tuning (#491)
* Do not show warning when ComputeBlocking() takes too much time. (#494)
* Output microseconds in log rather than milliseconds. (#495)
* Add benchmark features (#483)
* Fix EncodePositionForNN test failure (#490)
v0.19.0-rc2 (2018-11-03)
~~~~~~~~~~~
* Version v0.19.0-rc1 reported it's version as v0.19.0-dev
Therefore v0.19.0-rc2 is released with this issue fixed.
v0.19.0-rc1 (2018-11-03)
~~~~~~~~~~~
* Search algorithm changes
When visiting terminal nodes and collisions, instead of counting that as one
visit, estimate how many subsequent visits will also go to the same node, and
do a batch update.
That should slightly improve nps near terminal nodes and in multithread
configurations. Command line parameters that control that:
--max-collision-events – number of collision events allowed per batch.
Default is 32. This parameter is roughly equivalent to
--allowed-node-collisions in v0.18.
--max-collision-visits – total number of estimated collisions per NN batch.
Default is 9999.
* Time management
Multiple changes have been done to make Leela track used time more precisely
(particularly, the moment when to start timer is now much closer to the moment
GUIs start timer).
For smart pruning, Leela's timer only starts when the first batch comes from
NN eval. That should help against instamoves, especially on non-even GPUs.
Also Leela stops the search quicker now when it sees that time is up (it could
continue the search for hundreds of milliseconds after that, which caused time
trouble if opponent moves very fast).
Those changes should help a lot in ultra-bullet configurations.
* Better logging
Much more information is outputted now to the log file. That will allow us to
easier diagnose problems if they occur. To have debug file written, add a
command line option:
--logfile=/path/to/logfile
(or short option "-l /path/to/logfile", or corresponding UCI option "LogFile")
It's recommended to always have logging on, to make it easier to report bugs
when it happens.
* Configuration parameters change
Large part of parameter handling has been reworked. As the result:
All UCI parameters have been changed to have more "classical" look.
E.g. was "Network weights file path", became "WeightsFile".
Much more detailed help is shown than before when you run
./lc0 --help
Some flags have been renamed, e.g.
--futile-move-aversion
is renamed back to
--smart-pruning-factor.
After setting a parameter (using command line parameter or uci setoption
command), uci command "uci" shows updated result. That way you can check the
current option values.
Some command-line and UCI options are hidden now. Use --show-hidden command
line parameter to unhide them. E.g.
./lc0 --show-hidden --help
Also, in selfplay mode the per player configuration format has been changed
(although probably noone knew that anyway):
Was: ./lc0 selfplay player1: --movetime=14
Became: ./lc0 selfplay --player1.movetime=14
* Other
"go depth X" uci command now causes search to stop when depth information in
uci info line reaches X. Not that it makes much sense for it to work this way,
but at least it's better than noting.
Network file size can now be larger than 64MB.
There is now an experimental flag --ramlimit-mb. The engine tries to estimate
how much memory it uses and stops search when tree size (plus cache size)
reaches RAM limit. The estimation is very rough. We'll see how it performs and
improve estimation later.
In situations when search cannot be stopped (`go infinite` or ponder),
`bestmove` is not automatically outputted. Instead, search stops progress and
outputs warning.
Benchmark mode has been implemented. Run run, use the following command line:
./lc0 benchmark
This feature is pretty basic in the current version, but will be expanded later.
As Leela plays much weaker in positions without history, it now is able to
synthesize it and do not blunder in custom FEN positions. There is a
--history-fill flag for it. Setting it to "no" disables the feature, setting
to "fen_only" (default) enables it for all positions except chess start
position, and setting it to "always" enables it even for startpos.
Instead of output current win estimation as centipawn score approximation,
Leela can how show it's raw score. A flag that controls that is --score-type.
Possible values:
- centipawn (default) – approximate the win rate in centipawns, like Leela
always did.
- win_percentage – value from 0 to 100.0 which represents expected score in
percents.
- Q – the same, but scales from -100.0 to 100.0 rather than from 0 to 100.0
v0.18.1 (2018-10-02)
~~~~~~~
* Fix for falling into threefold repetition in a winning endgame tablebase position.
v0.18.0 (2018-09-30)
~~~~~~~
* No changes from rc2 except the version.
v0.18.0-rc2 (2018-09-26)
~~~~~~~~~~~
* Severe bug fixed: Race condition when out-of-order-eval was enabled (and it
was enabled by default)
* Windows 32-bit builds are now possible (CPU only for now)
v0.18.0-rc1 (2018-09-24)
~~~~~~~~~~~
KNOWN BUG!
* We have credible reports that in some rare cases Lc0 crashes!
However, we were not able to reproduce it reliably. If you see the crash,
please report to devs! What seems to increase crash probability:
- Very short move time (milliseconds)
- Proximity to a checkmate (happens 1-3 moves before the checkmate)
New features:
* Endgame tablebases support! Both WDL and DTZ now.
* Added MultiPv support.
Time management changes:
* Introduced --immediate-time-use flag. Yes, yet another time management
flag. Posible values are between 0.0 and 1.0. Setting it closer to
1.0 makes Leela use time saved from futile search aversion earlier.
* Some time management parameters were changed:
- Slowmover is 1.0 now (was 2.4)
- Immediate-time-use is 0.6 now (didn't exist before, so was 0.0)
* Fixed a bug, because of which futile search aversion tolerance was incorrectly
applied, which resulted in instamoves.
* Now search stops immediately when it runs out of budgeted time.
Should help against timeouts, especially on slow backends (e.g. BLAS).
* Move overhead now is a fixed time, doesn't depend on number of remaining
moves.
Other:
* Out of order eval is on by default. That brings slight nps improvement.
* Default FPU reduction is 1.2 now (was 0.9)
* Cudnn backend now has max_batch parameter.
(can be set for example like this --backend-opts=max_batch=100).
This is needed for lower end GPUs that didn't have enough VRAM for a buffer
of size 1024. Make sure that this setting is not lower than --minibatch-size.
* Small memory usage optimizations.
* Engine name in UCI response is shorter now. Fritz chess UI should be able
to work with Leela now
* Added flag --temp-visit-offset, will allow to offset temperature during
training.
* Command line and UCI parameter values are now checked for validity.
* You can now build for older processors that don't support the popcnt
instruction by passing -Dpopcnt=false to meson when building.
* 32-bit build is possible now. CPU only and we were only able to build it
in Linux for now, including Raspberry Pi.
* Threading issue which caused crash in heavily multithreaded environment
with slow backends was fixed.
v0.17.0 (2018-08-27)
~~~~~~~
No changes from rc2 except the version.
v0.17.0-rc2 (2018-08-21)
~~~~~~~~~~~
* Fixed a bug, that rule50 value was located in wrong place in a training data.
* OpenCL uses much less VRAM now.
* Default OpenCL batch size is 16 now (was 1).
* Default time management related configuration was tweaked:
--futile-move-aversion is 1.33 now (was 1.47)
--slowmover is 2.4 now (was 2.6)
v0.17.0-rc1 (2018-08-19)
~~~~~~~~~~~
New visible features:
* Implemented ponder support.
* Tablebases are supported now (only WDL probe for now).
Command line parameter is
--syzygy-paths=/path/to/syzygy/
* Old smart pruning flag is gone. Instead there is
--futile-search-aversion flag.
--futile-search-aversion=0 is equivalent to old --no-smart-pruning.
--futile-search-aversion=1 is equivalent to old --smart-pruning.
Now default is 1.47, which means that engine will sometimes decide to
stop search earlier even when there is theoretical chance (but not very
probable) that best move decision could be changed if allowed to think more.
* Lc0 now supports configuration files. Options can be listed there instead of
command line flags / uci params.
Config should be named lc0.config and located in the same directory as lc0.
Should list one command line option per line, with '--' in the beginning
being optional, for example:
syzygy-paths=/path/to/syzygy/
* In uci info, "depth" is now average depth rather than full depth
(which was 4 all the time).
Also, depth values do not include reused tree, only nodes visited during the
current search session.
* --sticky-checkmates experimental flag (default off), supposed to find shorter
checkmate sequences.
* More features in backend "check".
Performance optimizations:
* Release windows executables are built with "whole program optimization".
* Added --out-of-order-eval flag (default is off).
Switching it on makes cached/terminal nodes higher priority, which increases
nps.
* OpenCL backend now supports batches (up to 5x speedup!)
* Performance optimizations for BLAS backend.
* Total visited policy (for FPU reduction) is now cached.
* Values of priors (P) are stored now as 16-bit float rather than 32-bit float,
that saves considerable amount of RAM.
Bugfixes:
* Fixed en passant detection bug which caused the position after pawn moving by
two squares not counted towards threefold repetition even if en passant was
not possible.
* Fixed the bug which caused --cache-history-length for values 2..7 work the
same as --cache-history-length=1.
This is fixed, but default is temporarily changed to --cache-history-length=1
during play. (For training games, it's 7)
Removed features:
* Backpropagation beta / backpropagation gamma parameters have been removed.
Other changes:
* Release lc0-windows-cuda.zip package now contains NVdia CUDA and cuDNN .dlls.
v0.16.0 (2018-07-20)
~~~~~~~
* Fully switched to official releases! No more https://crem.xyz/lc0/
* Fixed a bug when pv display and smart pruning didn't sometimes work properly
after tree reuse.
* Format of protobuf network files was changed.
* Autodiscovery of protobuf based network files works now.
lc0-win-20180715-cuda92-cudnn714
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Support of new format of network files (needed for lc0 launch on main
training server)
* Fixed hang/poor performance in the beginning of search when there are many
threads. (Happened on linux only though).
* Memory footprint is reduced a bit. (~-60 bytes per node)
lc0-win-20180711-cuda92-cudnn714
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Edge-node separation introduced a bug that smart pruning didn't work. That's
fixed.
* Changed options parsing so that --backend-opts=cudnn-fp16 is now possible.
* Performance fixes (mostly for slowness introduced by edge-node separation).
lc0-win-20180708-cuda92-cudnn714
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Mutex contention has been reduced (but locking mutex more rarely).
Helps a lot with many threads running. Especially recommended to check with
multi-GPU configuration.
* Memory usage reduced at least 2x (probably more).
* cudnn backend crashed on large batches (>800) that's fixed.
There is still a limit of batch size 1024 though.
* (not in cudnn build, but for completeness)
Fixed NN computation with BLAS backend, it had up to 5% error before that.
* Default time budgeting params have been changed again! (not by mach this time)
--slowmover=1.95
--time-curve-peak=26.2
--time-curve-left-width=82
--time-curve-right-width=74
lc0-win-20180701-cuda92-cudnn714
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* fp16-based computation for very modern NVidia GPUs!
May reduce precision a bit, but should be compensated by nps boost.
Enable with --backend=cudnn-fp16 flag
* V is now not stored in nodes (a bit less RAM used while thinking)
* (not in cudnn build, but listing for completeness) blas batching support.
lc0-win-20180629-cuda92-cudnn714
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Default time budgeting parameters have been changed (again!):
--slowmover=1.93
--time-curve-peak=26
--time-curve-left-width=67
--time-curve-right-width=76
* When generating training games, the engine could confuse client by sending
corrupted output. That's fixed.
lc0-win-20180624-cuda92-cudnn714
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Default time budgeting parameters have been changed:
--slowmover=2.13 (was 1.8)
--time-curve-peak=22.0 (was 41.0)
--time-curve-left-width=450.0 (was 1000.0)
--time-curve-right-width=30.0 (was 39.5)
* During training game generation, the engine is able to send resign statistics.
lc0-win-20180622-cuda92-cudnn714
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Time budged allocation has been changed, it allocates more time to early
stages of the game.
Graphs are here: https://github.com/LeelaChessZero/lc0/pull/59
Slowmover value has so be recalibrated, and default value was changed from 2.2 to 1.8.
* Fixed a race condition in cache prefetch code. Realistically it hardly every
occured before though.
lc0-win-20180619-cuda92-cudnn714-00
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Fix a bug instroduced in version 20180609 which caused the engine to miss checkmates sometimes.
lc0-win-20180614-cuda92-cudnn714
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* "go searchmoves" uci command is now supported
* It's possible now to disable tree reuse in training games
* Few improvements for random backend
* Lc0 now shows version in uci response
* Analyzer mode has been removed
* extra-virtual-loss has been removed
* Implemented resign (for training games)
lc0-win-20180609-cuda92-cudnn714-01
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* In addition to --backpropagate-gamma, there is also --backpropagate-beta!
Default is 1.0.
lc0-win-20180609-cuda92-cudnn714-00
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Visible changes:
* Experimental changes from 20180604 are now default.
* Memory footprint is reduced by 8 bytes per visible node (+ ~240 bytes in
invisible nodes per visible)
* Introduced --backpropagate-gamma flag.
Default is 1.0. There are rumours that reducing it to 0.75 improves play.
* Extra-virtual-loss parameter has been removed.
* Quotes in backend-opts parameter were not parsed properly, that's fixed.
lc0-win-20180604-cuda92-cudnn714-experimental
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Visible changes:
* Experimental default settings:
cPUCT: 3.4
FPU reduction: 0.9
policy Softmax: 2.2
* Fix memory leak when GUI doesn't ever issue `isready` uci command.
lc0-win-20180602-cuda92-cudnn714-00
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Visible changes:
* cPUCT is now 3.1 by default instead of 1.2 (or what it was before)
* Fixed Batch normalization epsilon in tensorflow backend (but noone uses tensorflow anyway)
* Periodically (every 5 seconds) output "uci info" even if bestmove/depth doesn't change.
* Memory management is redone so that node release happens after "bestmove" and "isready", rather than after "position" uci command.
That garbage collection could take tens of milliseconds and chess GUI already started timer at that point.
Memory management is always fragile, so fresh crashes and memory leaks are possible.
Invisible changes:
* Store castlings again as e1g1 and not e1h1. Fixes a bug that tree was not reused after castling.