Qwen2.5 Collection Qwen2.5 language models: pretrained and instruction-tuned models in 7 sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B). • 46 items • Updated 16 days ago • 560
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models Paper • 2501.13629 • Published Jan 23 • 44
McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised Sentence Similarity • Updated Apr 30, 2024 • 12k • 48
Robust Multi-bit Text Watermark with LLM-based Paraphrasers Paper • 2412.03123 • Published Dec 4, 2024 • 6