Update README.md
README.md
CHANGED
@@ -59,16 +59,16 @@ This model achieves an eval_loss of 0.419 at step 7,600 (approximately 10.90 epo
 - Collate more Go data, particularly self-play data. It is quite clear that the existing dataset is trivially small compared to those used for modern language models.
 
 # Reference
-[1] Silver, D., Huang, A., Maddison, C. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).
-[2] D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, et al., “Mastering chess and shogi by self-play with a general reinforcement learning algorithm,” arXiv preprint arXiv:1712.01815, 2017.
-[3] Coulom, R. Efficient selectivity and backup operators in Monte-Carlo tree search. In 5th International Conference on Computer and Games, 72–83 (2006).
-[4] Kocsis, L. & Szepesvári, C. Bandit based Monte-Carlo planning. In 15th European Conference on Machine Learning, 282–293 (2006).
-[5] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, pages 6000–6010, 2017.
-[6] Radford, Alec and Karthik Narasimhan. “Improving Language Understanding by Generative Pre-Training.” (2018).
-[7] D. Noever, M. Ciolino, and J. Kalin. The Chess Transformer: Mastering Play using Generative Language Models, Sept. 2020.
-[8] Zhang, Edwin et al. “Transcendence: Generative Models Can Outperform The Experts That Train Them.” (2024).
-[9] Ciolino, Matthew et al. “The Go Transformer: Natural Language Modeling for Game Play.” 2020 Third International Conference on Artificial Intelligence for Industries (AI4I) (2020): 23-26.
-[10] Radford, Alec et al. “Language Models are Unsupervised Multitask Learners.” (2019).
+[1] Silver, D., Huang, A., Maddison, C. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).
+[2] D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, et al., “Mastering chess and shogi by self-play with a general reinforcement learning algorithm,” arXiv preprint arXiv:1712.01815, 2017.
+[3] Coulom, R. Efficient selectivity and backup operators in Monte-Carlo tree search. In 5th International Conference on Computer and Games, 72–83 (2006).
+[4] Kocsis, L. & Szepesvári, C. Bandit based Monte-Carlo planning. In 15th European Conference on Machine Learning, 282–293 (2006).
+[5] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, pages 6000–6010, 2017.
+[6] Radford, Alec and Karthik Narasimhan. “Improving Language Understanding by Generative Pre-Training.” (2018).
+[7] D. Noever, M. Ciolino, and J. Kalin. The Chess Transformer: Mastering Play using Generative Language Models, Sept. 2020.
+[8] Zhang, Edwin et al. “Transcendence: Generative Models Can Outperform The Experts That Train Them.” (2024).
+[9] Ciolino, Matthew et al. “The Go Transformer: Natural Language Modeling for Game Play.” 2020 Third International Conference on Artificial Intelligence for Industries (AI4I) (2020): 23-26.
+[10] Radford, Alec et al. “Language Models are Unsupervised Multitask Learners.” (2019).
 
 ## Citation
 