Commit 30846b9
1 Parent(s): 9c3f3fb
Update README.md
README.md CHANGED
@@ -185,7 +185,7 @@ parameters:
   encoder_no_repeat_ngram_size: 3
   num_beams: 4
 model-index:
-- name:
+- name: Shobhank-iiitdwd/long-t5-tglobal-base-16384-book-summary
   results:
   - task:
       type: summarization
@@ -499,7 +499,7 @@ from transformers import pipeline
 
 summarizer = pipeline(
     "summarization",
-    "
+    "Shobhank-iiitdwd/long-t5-tglobal-base-16384-book-summary",
     device=0 if torch.cuda.is_available() else -1,
 )
 long_text = "Here is a lot of text I don't want to read. Replace me"
@@ -508,37 +508,6 @@ result = summarizer(long_text)
 print(result[0]["summary_text"])
 ```
 
-Pass [other parameters related to beam search textgen](https://huggingface.co/blog/how-to-generate) when calling `summarizer` to get even higher quality results.
-
-## Intended uses & limitations
-
-- The current checkpoint is fairly well converged but will be updated if further improvements can be made.
-- Compare performance to [LED-base](https://huggingface.co/pszemraj/led-base-book-summary) trained on the same dataset (API gen parameters are the same).
-- while this model seems to improve upon factual consistency, **do not take summaries to be foolproof and check things that seem odd**.
-
-## Training and evaluation data
-
-`kmfoda/booksum` dataset on HuggingFace - read [the original paper here](https://arxiv.org/abs/2105.08209). Summaries longer than 1024 LongT5 tokens were filtered out to prevent the model from learning to generate "partial" summaries.
-
-
-
-### How to run inference over a very long (30k+ tokens) document in batches?
-
-See `summarize.py` in [the code for my hf space Document Summarization](https://huggingface.co/spaces/pszemraj/document-summarization/blob/main/summarize.py) :)
-
-You can also use the same code to split a document into batches of 4096, etc., and run over those with the model. This is useful in situations where CUDA memory is limited.
-
-### How to fine-tune further?
-
-See [train with a script](https://huggingface.co/docs/transformers/run_scripts) and [the summarization scripts](https://github.com/huggingface/transformers/tree/main/examples/pytorch/summarization).
-
-This model was originally tuned on Google Colab with a heavily modified variant of the [longformer training notebook](https://github.com/patrickvonplaten/notebooks/blob/master/Fine_tune_Longformer_Encoder_Decoder_(LED)_for_Summarization_on_pubmed.ipynb), key enabler being deepspeed. You can try this as an alternate route to fine-tuning the model without using the command line.
-
-* * *
-
-## Training procedure
-
-
 ### Training hyperparameters
 
 _NOTE: early checkpoints of this model were trained on a "smaller" subsection of the dataset as it was filtered for summaries of **1024 characters**. This was subsequently caught and adjusted to **1024 tokens** and then trained further for 10+ epochs._
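For reference, the usage snippet touched by the second and third hunks, assembled into one self-contained example. The model id is taken from the added line 502; the `torch` import is an assumption inferred from the `torch.cuda.is_available()` call shown in the README context.

```python
import torch
from transformers import pipeline

# Model id comes from the added line 502; everything else mirrors the README snippet.
summarizer = pipeline(
    "summarization",
    "Shobhank-iiitdwd/long-t5-tglobal-base-16384-book-summary",
    device=0 if torch.cuda.is_available() else -1,  # GPU 0 if available, else CPU
)

long_text = "Here is a lot of text I don't want to read. Replace me"
result = summarizer(long_text)
print(result[0]["summary_text"])
```

On a machine without a GPU the pipeline falls back to CPU (`device=-1`).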
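The removed line 511 recommended passing beam-search generation parameters when calling `summarizer`. A sketch of what that could look like, reusing `summarizer` and `long_text` from the example above; `num_beams=4` and `encoder_no_repeat_ngram_size=3` mirror the model-card metadata in the first hunk, while the remaining keyword arguments are illustrative assumptions.

```python
# Generation kwargs are forwarded to model.generate() by the pipeline call.
# num_beams and encoder_no_repeat_ngram_size mirror the model-card metadata;
# no_repeat_ngram_size, early_stopping, and max_length are illustrative values.
result = summarizer(
    long_text,
    num_beams=4,
    encoder_no_repeat_ngram_size=3,
    no_repeat_ngram_size=3,
    early_stopping=True,
    max_length=512,
)
print(result[0]["summary_text"])
```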
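The removed section "How to run inference over a very long (30k+ tokens) document in batches?" points to the `summarize.py` in the linked Space and suggests splitting a document into batches of roughly 4096 tokens. A minimal sketch of that chunking idea, again reusing `summarizer` from above; the helper name and chunking logic are illustrative, not the Space's implementation.

```python
from transformers import AutoTokenizer

# Tokenizer for the same checkpoint, used only to split the document on token
# boundaries into ~4096-token chunks, as the removed section suggests.
tokenizer = AutoTokenizer.from_pretrained(
    "Shobhank-iiitdwd/long-t5-tglobal-base-16384-book-summary"
)

def summarize_in_chunks(text, chunk_tokens=4096):
    """Split `text` into token chunks and summarize each chunk independently."""
    ids = tokenizer(text, truncation=False)["input_ids"]
    chunks = [ids[i : i + chunk_tokens] for i in range(0, len(ids), chunk_tokens)]
    return [
        summarizer(tokenizer.decode(chunk, skip_special_tokens=True))[0]["summary_text"]
        for chunk in chunks
    ]

# chunk_summaries = summarize_in_chunks(very_long_document)
```

Chunking keeps peak CUDA memory bounded by the chunk size rather than the full document length, which is the limitation the removed section calls out.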
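The removed "How to fine-tune further?" section points to the Hugging Face training scripts and a modified Longformer/DeepSpeed notebook. As an assumption-laden outline only (not the README's procedure), further tuning via the generic `Seq2SeqTrainer` API might look roughly like this; the `kmfoda/booksum` column names and all hyperparameter values are placeholders to verify before use.

```python
# Outline only: generic Seq2SeqTrainer route instead of the Colab/DeepSpeed
# notebook mentioned in the removed section. Column names and hyperparameters
# are placeholders; check the dataset card before running.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_id = "Shobhank-iiitdwd/long-t5-tglobal-base-16384-book-summary"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

raw = load_dataset("kmfoda/booksum")

def preprocess(batch):
    # "chapter" and "summary_text" are assumed source/summary column names.
    enc = tokenizer(batch["chapter"], max_length=16384, truncation=True)  # 16384 = model input length
    enc["labels"] = tokenizer(
        text_target=batch["summary_text"], max_length=1024, truncation=True  # 1024-token summary cap
    )["input_ids"]
    return enc

tokenized = raw.map(preprocess, batched=True, remove_columns=raw["train"].column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="long-t5-booksum-ft",
        per_device_train_batch_size=1,   # long inputs: keep the per-device batch tiny
        gradient_accumulation_steps=16,
        learning_rate=4e-5,
        num_train_epochs=1,
    ),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```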