Update README.md
We take the leftmost variation of the game tree in SGF format and translate it into PGN.

## Tokenizer Design

A tokenizer is designed specifically for the game of Go.
Since the board is 19 × 19, we use an uppercase letter to encode the x position and a lowercase letter to encode the y position.
We use letters rather than digits to make clear that one token, not two, represents one position, avoiding unnecessary learning to map two tokens onto one position.
We also use a special token '>' to denote a move made by the winner of the game.
While [7][8] do not indicate the winner until the result appended at the end, we argue that without marking the winner, a language model cannot know which moves are the winner's during decoding at inference time, due to its autoregressive nature.
'>' is also the symbol used to prompt GoFormer for a move during decoding.
'X' represents a pass.
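The encoding above can be sketched as follows. The exact letter ranges ('A'–'S', 'a'–'s') and zero-based indexing are assumptions for illustration, not confirmed by this README:

```python
# Minimal sketch of the coordinate tokenization described above.
# Assumption: x = 0..18 maps to 'A'..'S' and y = 0..18 to 'a'..'s';
# the actual tokenizer's letter ranges may differ.

X_LETTERS = [chr(ord('A') + i) for i in range(19)]  # column letters
Y_LETTERS = [chr(ord('a') + i) for i in range(19)]  # row letters

PASS_TOKEN = 'X'      # a pass move
WINNER_MARK = '>'     # prefixes the winner's moves

def encode_move(x=None, y=None, is_pass=False, by_winner=False):
    """Encode one move: optional '>' marker plus a single position token."""
    move = PASS_TOKEN if is_pass else X_LETTERS[x] + Y_LETTERS[y]
    return (WINNER_MARK if by_winner else '') + move

def decode_position(token):
    """Map a two-letter position token back to (x, y) indices."""
    return X_LETTERS.index(token[0]), Y_LETTERS.index(token[1])

print(encode_move(3, 3))                    # 'Dd'
print(encode_move(18, 18, by_winner=True))  # '>Ss'
print(encode_move(is_pass=True))            # 'X'
```

Under this assumed mapping, the 19 column letters stop at 'S', so the pass token 'X' cannot collide with the first letter of any position token.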

## Model Input and Output

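The README elsewhere notes that, to exclude illegal moves, GoFormer is asked to suggest K moves ranked by probability. A minimal sketch of that filtering step, with `model_topk_moves` and `is_legal` as hypothetical helpers (not functions from this repo):

```python
# Sketch: take the model's top-K candidate moves (best first) and play the
# first legal one. Both helpers below are hypothetical stand-ins.

def select_move(model_topk_moves, is_legal, k=10):
    """Return the highest-probability legal move among the top-k candidates."""
    for move in model_topk_moves(k):   # candidates, ranked by probability
        if is_legal(move):
            return move
    return 'X'                         # no legal candidate found: pass

# Toy usage with stub helpers; 'Dd' is treated as illegal here.
candidates = ['Dd', 'Qq', 'Aa']
move = select_move(lambda k: candidates[:k], lambda m: m != 'Dd')
print(move)  # 'Qq'
```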
This model achieves an eval_loss of 0.419 at step 7,600 (approximately 10.90 epochs).
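Assuming the reported eval_loss is a per-token cross-entropy in nats (the README does not say, but this is the usual convention for an `eval_loss` metric), it corresponds to a perplexity of roughly 1.5 over the move vocabulary:

```python
import math

# Per-token cross-entropy in nats maps to perplexity via exp().
# Assumption: eval_loss is token-level cross-entropy, not something rescaled.
eval_loss = 0.419
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 1.52
```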

## Future Work

- Collate more Go data, particularly self-play data; the existing dataset is tiny compared with the corpora used to train modern language models.

# Reference

[1] Silver, D., Huang, A., Maddison, C. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).