AjayP13 committed (verified)
Commit e24a23f · Parent(s): 4d0857f

Update README.md
Files changed (1): README.md (+4 -5)
README.md CHANGED

```diff
@@ -30,18 +30,17 @@ This is an "Abstract to Tweet" model that crafts a tweet summarizing a research
 ```python3
 from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline
 
-tokenizer = AutoTokenizer.from_pretrained('datadreamer-dev/abstracts_to_tweet_model', revision=None) # Load tokenizer
-model = AutoModelForSeq2SeqLM.from_pretrained('datadreamer-dev/abstracts_to_tweet_model', revision=None) # Load model
-pipe = pipeline('text2text-generation', model=model, tokenizer=tokenizer, pad_token_id=tokenizer.pad_token_id)
+# Load model
+pipe = pipeline('text2text-generation', 'datadreamer-dev/abstracts_to_tweet_model')
 
-# For example, run the model on the abstract of the LoRA paper (https://arxiv.org/abs/2106.09685)
+# Generate a tweet from the abstract of the LoRA paper
 abstract = "An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example -- deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times. LoRA performs on-par or better than fine-tuning in model quality on RoBERTa, DeBERTa, GPT-2, and GPT-3, despite having fewer trainable parameters, a higher training throughput, and, unlike adapters, no additional inference latency. We also provide an empirical investigation into rank-deficiency in language model adaptation, which sheds light on the efficacy of LoRA. We release a package that facilitates the integration of LoRA with PyTorch models and provide our implementations and model checkpoints for RoBERTa, DeBERTa, and GPT-2 at this https URL."
 generated_tweet = pipe(abstract, max_length=512)['generated_text']
 
 # Print the generated tweet
 print(generated_tweet)
 
-# This will print:
+# Output:
 # "Exciting news in #NLP! We've developed Low-Rank Adaptation, or LoRA, to reduce the number of
 # trainable parameters for downstream tasks. It reduces model weights by 10,000 times and GPU
 # memory by 3 times. #AI #MachineLearning"
```
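
Note that the `text2text-generation` pipeline in `transformers` returns a list of dictionaries (one per generated sequence), so both versions of the snippet above likely need to index the first result before reading `'generated_text'`. A minimal sketch of the corrected call, using the model id from the README:

```python3
from transformers import pipeline

# Load the model through the high-level pipeline API
pipe = pipeline('text2text-generation', 'datadreamer-dev/abstracts_to_tweet_model')

# Placeholder input; substitute a real paper abstract
abstract = "..."

# The pipeline returns a list of dicts, so take the first result
# before reading the 'generated_text' field
generated_tweet = pipe(abstract, max_length=512)[0]['generated_text']
print(generated_tweet)
```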