update
README.md
CHANGED
@@ -46,8 +46,6 @@ output = merged_model.generate(**model_input, max_new_tokens=1, do_sample=False
 print(tokenizer.decode(output.sequences[0], skip_special_tokens=True))
 ```
 
-By default, transformers will load the model in full precision. Therefore you might be interested to further reduce down the memory requirements to run the model through the optimizations we offer in HF ecosystem:
-
 ## Cite
 ```
 @article{wang2024my,