Proof-reading
README.md CHANGED
@@ -78,8 +78,8 @@ def calculate_log_probabilities(model: PreTrainedModel, tokenizer: Tokenizer, in
 ```
 
 Explanation:
-- we drop the logits for the last token, because
-- we compute the softmax over the last dimension (vocab size), to obtain probability distribution over all tokens
+- we drop the logits for the last token, because they correspond to the probability of the next token (we have no use for it because we are not generating text)
+- we compute the softmax over the last dimension (vocab size), to obtain the probability distribution over all tokens
 - we drop the first token because it is a start-of-sequence token
 - `log_probs[0, range(log_probs.shape[1]), tokens]` indexes into log_probs such as to extract
   - at position 0 (probability distribution for the first token after the start-of-sequence token) - the logprob value corresponding to the actual first token
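For reference, a minimal, self-contained sketch of the indexing the hunk describes. The function name, signature, and the assumption that the tokenizer prepends a start-of-sequence token are illustrative here, not taken verbatim from the repository:

```python
import torch
from transformers import PreTrainedModel, PreTrainedTokenizer


def token_log_probabilities(model: PreTrainedModel, tokenizer: PreTrainedTokenizer, text: str) -> list[float]:
    """Return one log-probability per input token (excluding the start-of-sequence token).

    Illustrative sketch; assumes the tokenizer prepends a start-of-sequence token.
    """
    input_ids = tokenizer(text, return_tensors="pt").input_ids  # shape: (1, seq_len)
    with torch.no_grad():
        logits = model(input_ids).logits                        # shape: (1, seq_len, vocab_size)
    # Drop the logits for the last position: they predict the token *after* the
    # input, which we are not generating.
    logits = logits[:, :-1, :]
    # Softmax (in log space) over the vocabulary dimension gives a probability
    # distribution over all tokens at every position.
    log_probs = torch.log_softmax(logits, dim=-1)
    # Drop the first token: it is the start-of-sequence token, for which there
    # is no preceding distribution.
    tokens = input_ids[0, 1:]
    # At position i, pick the log-probability assigned to the token that
    # actually occurs at position i + 1 of the input.
    return log_probs[0, range(log_probs.shape[1]), tokens].tolist()
```

Using `torch.log_softmax` instead of `softmax` followed by `log` is numerically more stable and yields the same log-probability values the explanation refers to.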