Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2305.19370

Foundation AI Papers

Curated List of Must-Reads on LLM reasoning at Temus AI team

Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

Paper • 2310.04406 • Published Oct 6, 2023 • 9
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 105
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization

Paper • 2402.09320 • Published Feb 14, 2024 • 6
Self-Discover: Large Language Models Self-Compose Reasoning Structures

Paper • 2402.03620 • Published Feb 6, 2024 • 115

LLM architecture

The Impact of Depth and Width on Transformer Language Model Generalization

Paper • 2310.19956 • Published Oct 30, 2023 • 10
Retentive Network: A Successor to Transformer for Large Language Models

Paper • 2307.08621 • Published Jul 17, 2023 • 170
RWKV: Reinventing RNNs for the Transformer Era

Paper • 2305.13048 • Published May 22, 2023 • 17
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 55

Blockwise Parallel Transformer for Long Context Large Models

Paper • 2305.19370 • Published May 30, 2023 • 3
Blockwise Self-Attention for Long Document Understanding

Paper • 1911.02972 • Published Nov 7, 2019 • 1
Blockwise Compression of Transformer-based Models without Retraining

Paper • 2304.01483 • Published Apr 4, 2023 • 1

TRAMS: Training-free Memory Selection for Long-range Language Modeling

Paper • 2310.15494 • Published Oct 24, 2023 • 2
A Long Way to Go: Investigating Length Correlations in RLHF

Paper • 2310.03716 • Published Oct 5, 2023 • 10
YaRN: Efficient Context Window Extension of Large Language Models

Paper • 2309.00071 • Published Aug 31, 2023 • 68
Giraffe: Adventures in Expanding Context Lengths in LLMs

Paper • 2308.10882 • Published Aug 21, 2023 • 1

Blockwise Parallel Transformer for Long Context Large Models

Paper • 2305.19370 • Published May 30, 2023 • 3

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs