kenhktsui committed · Commit 296d216 · verified · Parent: 6464c2a

Update README.md

Files changed (1): README.md (+3 -1)
README.md CHANGED
@@ -26,11 +26,13 @@ Can GoFormer perform reasonably well just by next move (token) prediction, witho
 We take the leftmost variation of the game tree in SGF format and translate it into PGN.
 
 ## Tokenizer Design
+A tokenizer is designed specifically for the game of Go.
 Since it is a 19 x 19 game, we use an uppercase letter to encode the x position and a lowercase letter to encode the y position.
 We use letters instead of numbers to make it clear that one token, not two, represents one position, sparing the model the unnecessary work of learning to map two tokens onto one position.
 We also use a special token '>' to denote a move made by the winner of the game.
 While [7][8] do not indicate the winner until the result is appended at the end, we argue that without marking the winner, a language model cannot know which moves are the winner's during decoding, due to its autoregressive nature.
 '>' is the symbol that prompts GoFormer for a move during decoding.
+'X' represents a pass.
 
 ## Model Input and Output
 
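To make the tokenizer design in this hunk concrete, here is a minimal sketch of the encoding. The mapping of 'A'..'S' to x = 0..18 and 'a'..'s' to y = 0..18, and the helper names, are assumptions for illustration, not the repository's actual tokenizer code.

```python
# Minimal sketch of the move encoding described above. The mapping of
# 'A'..'S' to x = 0..18 and 'a'..'s' to y = 0..18 is an assumption for
# illustration; the repository's tokenizer may order coordinates differently.

def encode_move(x: int, y: int, winner_move: bool = False) -> str:
    """Encode one board position as a letter pair, e.g. (3, 15) -> 'Dp'."""
    if not (0 <= x < 19 and 0 <= y < 19):
        raise ValueError("coordinates must lie on the 19 x 19 board")
    pair = chr(ord("A") + x) + chr(ord("a") + y)
    # '>' is the special token marking (and prompting for) the winner's move.
    return (">" if winner_move else "") + pair


def encode_pass(winner_move: bool = False) -> str:
    """'X' represents a pass."""
    return (">" if winner_move else "") + "X"


print(encode_move(3, 15, winner_move=True))  # >Dp
print(encode_pass())                         # X
```

Encoding each coordinate as a single letter keeps one token per coordinate, which is the stated motivation for using letters rather than digits.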
 
@@ -51,7 +53,7 @@ To exclude illegal moves, we ask GoFormer to suggest K moves, ranked by probability
 This model achieves an eval_loss of 0.419 at step 7,600 (approximately 10.90 epochs).
 
 ## Future Work
-- Collate more Go data
+- Collate more Go data, particularly self-play data; the existing dataset is small compared to the corpora used to train modern language models.
 
 # Reference
 [1] Silver, D., Huang, A., Maddison, C. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).
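The context line of this hunk mentions excluding illegal moves by asking GoFormer for K candidate moves ranked by probability. Below is a hedged sketch of such a decode-time filter; the `is_legal` callback is hypothetical, and each candidate move is assumed to be a single token, which may not match the actual move encoding.

```python
# Hedged sketch of decode-time legality filtering: take the K most probable
# next tokens and play the first one that is legal on the current board.
# `is_legal` is a hypothetical callback (a Go rules library would supply it),
# and each candidate move is assumed to be a single token for simplicity.
import torch


def suggest_move(model, tokenizer, game_text: str, is_legal, k: int = 10) -> str:
    inputs = tokenizer(game_text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # distribution over the next token
    top = torch.topk(logits, k)                 # K candidates, ranked by probability
    for token_id in top.indices.tolist():
        move = tokenizer.decode([token_id]).strip()
        if is_legal(move):                      # skip moves the rules forbid
            return move
    raise RuntimeError(f"no legal move among the top-{k} candidates")

# Usage, assuming a causal LM and tokenizer loaded via transformers' Auto* classes:
# move = suggest_move(model, tokenizer, game_text, is_legal=board.is_legal, k=10)
```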
 
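For the data-pipeline step mentioned in the first hunk (taking the leftmost variation of the SGF game tree), here is a rough sketch using the third-party sgfmill library. Treating sgfmill's main sequence as the leftmost variation is an assumption about the actual conversion script, and the PGN-style rendering step is omitted.

```python
# Rough sketch of extracting the leftmost variation from an SGF game tree,
# using the third-party sgfmill library (pip install sgfmill). sgfmill's
# "main sequence" follows the first child at every branch point, which is
# the leftmost line of the game tree.
from sgfmill import sgf


def leftmost_variation(sgf_text: str):
    game = sgf.Sgf_game.from_string(sgf_text)
    moves = []
    for node in game.get_main_sequence():
        colour, point = node.get_move()    # colour is 'b'/'w', or None for the root
        if colour is not None:
            moves.append((colour, point))  # point is (row, col), or None for a pass
    return moves


# A tiny game with one branch; only B[pd], W[dp], B[qp] (the leftmost line) survive.
sample = "(;FF[4]SZ[19];B[pd];W[dp](;B[qp])(;B[dd]))"
print(leftmost_variation(sample))
```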