MIKHAIL BURTSEV

mbur

AI & ML interests

None yet

Recent Activity

upvoted a paper 7 days ago

Limitations of Normalization in Attention Mechanism

commented on a paper 7 days ago

Limitations of Normalization in Attention Mechanism

upvoted 2 papers 7 days ago

Limitations of Normalization in Attention Mechanism

Paper • 2508.17821 • Published 8 days ago • 5

Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling

Paper • 2508.16745 • Published 11 days ago • 22

upvoted an article 3 months ago

You could have designed state of the art positional encoding

Article • Published Nov 25, 2024 • 352

upvoted 3 papers 7 months ago

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity

Paper • 2502.13063 • Published Feb 18 • 73

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 418

SRMT: Shared Memory for Multi-agent Lifelong Pathfinding

Paper • 2501.13200 • Published Jan 22 • 69

upvoted 4 papers about 1 year ago

Associative Recurrent Memory Transformer

Paper • 2407.04841 • Published Jul 5, 2024 • 37

AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents

Paper • 2407.04363 • Published Jul 5, 2024 • 34

Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task

Paper • 2406.14213 • Published Jun 20, 2024 • 21

BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack

Paper • 2406.10149 • Published Jun 14, 2024 • 53

upvoted a collection over 1 year ago

DNA language models

9 items • Updated Apr 17, 2024 • 7

upvoted a paper over 1 year ago

In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss

Paper • 2402.10790 • Published Feb 16, 2024 • 43