Commit 45f64d8 (verified) by BriLLM · Parent: 40f9f99

Upload README.md with huggingface_hub

Files changed (1): README.md (+30 / -3)
 
# BriLLM: Brain-inspired Large Language Model

## Overview
This work introduces the first brain-inspired large language model (BriLLM). BriLLM is a generative language model that is neither a Transformer nor a GPT, and it is not an input-output-controlled model in the traditional machine learning sense. It is built on the Signal Fully-connected flowing (SiFu) mechanism defined over a directed graph that serves as the neural network, so every node in the graph of the whole model is interpretable, whereas traditional machine learning models offer only limited interpretability at their input and output ends.
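
To make the SiFu description above concrete, the following is a minimal illustrative sketch rather than the released implementation: every vocabulary token is a node in a directed graph, every ordered pair of nodes is an edge with its own learnable parameters, and the next token is the node whose incoming edge carries the most signal energy. The edge parameterization, the GeLU activation, and the norm-based energy measure are assumptions made only for illustration.

```python
import torch
import torch.nn.functional as F

d_model = 32                      # signal width, matching the Implementation Details below
vocab = ["天", "气", "很", "好"]  # toy vocabulary; the released model uses 4,000 characters

# One learnable weight matrix and bias per directed edge (u -> v).
edges = {
    (u, v): {"W": torch.randn(d_model, d_model) * 0.02, "b": torch.zeros(d_model)}
    for u in vocab for v in vocab
}

def edge_energy(signal: torch.Tensor, u: str, v: str) -> torch.Tensor:
    """Propagate the signal across edge u -> v and reduce it to a scalar energy."""
    e = edges[(u, v)]
    return F.gelu(signal @ e["W"] + e["b"]).norm()

# The signal flows out of the current node; the most energetic edge decides the next node.
signal = torch.randn(d_model)
current = "天"
next_token = max(vocab, key=lambda v: edge_energy(signal, current, v).item())
print(next_token)
```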

> To train a sample in BriLLM, we build an individual, ordinary neural network for that sample and train it with regular backpropagation. The network consists of two parts: the front part connects all input nodes (i.e., tokens), and the rear part connects all possible paths in order. Finally, a softmax layer collects the energy tensors of all paths so that the correct path is indicated by a 0-1 ground-truth vector. We adopt a cross-entropy loss for training.
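
A minimal sketch of what one such per-sample training step could look like, simplified to a single prediction step: every candidate path leaving the current node is scored, a softmax is taken over the paths' energies, and cross-entropy against the index of the correct path provides the gradient. The edge parameterization and the norm-based energy are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

d_model, vocab_size = 32, 4000   # settings from the Implementation Details section

# Hypothetical parameters for the edges leaving the current node:
# one (d x d) weight matrix and one bias per candidate next token.
W = torch.randn(vocab_size, d_model, d_model, requires_grad=True)
b = torch.zeros(vocab_size, d_model, requires_grad=True)

def path_energies(signal: torch.Tensor) -> torch.Tensor:
    """Propagate the signal along every candidate edge and reduce each to a scalar energy."""
    out = F.gelu(torch.einsum("d,vde->ve", signal, W) + b)  # (vocab_size, d_model)
    return out.norm(dim=-1)                                 # assumed scalar energy per path

signal = torch.randn(d_model)   # signal arriving from the front (input) part of the network
target = torch.tensor([42])     # index of the correct path, i.e. the 0-1 ground-truth vector

# Softmax over all paths' energies + cross-entropy loss, trained by regular backpropagation.
loss = F.cross_entropy(path_energies(signal).unsqueeze(0), target)
loss.backward()
print(loss.item())
```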

## Dataset
We use a subset of the Chinese Wikipedia, which contains over 100M Chinese characters. Long sentences are cut into shorter sentences with a maximum length of 16.
We select a vocabulary of 4,000 tokens consisting of the most frequently used Chinese characters.
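
A small sketch of the preprocessing described above, under the assumption that texts are handled at the character level and that long sentences are split into consecutive chunks rather than simply cut off; the sample sentence is only a placeholder.

```python
from collections import Counter

MAX_LEN, VOCAB_SIZE = 16, 4000   # limits stated in the Dataset section

def preprocess(sentences):
    """Split long character sequences and build a frequency-ranked vocabulary."""
    # Break every sentence into pieces of at most MAX_LEN characters.
    chunks = [s[i:i + MAX_LEN] for s in sentences for i in range(0, len(s), MAX_LEN)]
    # Keep the VOCAB_SIZE most frequent characters as the vocabulary.
    counts = Counter(ch for chunk in chunks for ch in chunk)
    vocab = [ch for ch, _ in counts.most_common(VOCAB_SIZE)]
    return chunks, vocab

chunks, vocab = preprocess(["今天天气很好,我们一起去公园散步,路上遇到了老朋友。"])
print(len(chunks), len(vocab))
```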

## Implementation Details
BriLLM is implemented in PyTorch.
It uses sinusoidal positional encoding, GeLU as the activation function, cross-entropy loss for next-token prediction, and an embedding size of $d_{model} = 32$.
We use the AdamW optimizer with $\beta_1 = 0.9$, $\beta_2 = 0.999$, and $\epsilon = 10^{-8}$.
The model size is about $512 + 4000 \times 4000 \times (32 \times 32 + 32) \approx 16\mathrm{B}$ parameters.
We trained our models on one machine with 8 NVIDIA A800 GPUs for 1.5k steps.

![](./figs/fig4.png)
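
The size estimate above can be checked directly; the additional $512$ term is carried over from the text as written (its exact origin is not spelled out in this excerpt).

```python
# Reproduce the parameter-count estimate from the Implementation Details section.
d_model, vocab_size = 32, 4000
per_edge = d_model * d_model + d_model            # one weight matrix plus one bias per edge
total = 512 + vocab_size * vocab_size * per_edge  # 512 extra parameters, as stated above
print(per_edge, total)                            # 1056 and 16,896,000,512, i.e. roughly 16.9B
```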

## Complexity
The computational complexity is $O(n \cdot v \cdot d^2)$, where $n$ is the sequence length, $v$ is the vocabulary size, and $d$ is the representation dimension.
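
As an order-of-magnitude illustration only, plugging in the values used elsewhere in this README ($n = 16$, $v = 4000$, $d = 32$) gives

$$n \cdot v \cdot d^2 = 16 \times 4000 \times 32^2 = 65{,}536{,}000 \approx 6.6 \times 10^7.$$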

## Case Study
![](./figs/fig5.png)


## Comparison of LLM and BriLLM
![](./figs/fig6.png)

## Installation
```bash
pip install torch
```

## Checkpoint
[BriLLM0.5](https://huggingface.co/BriLLM/BriLLM0.5)
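
One possible way to fetch the checkpoint files locally (an extra step, with `huggingface_hub` installed separately via `pip install huggingface_hub`) is:

```python
# Download the released BriLLM0.5 checkpoint from the Hugging Face Hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="BriLLM/BriLLM0.5")
print(local_dir)  # local path containing the downloaded checkpoint files
```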

## Inference
```python
import json