Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts Paper • 2503.05447 • Published Mar 2025 • 7
Liger: Linearizing Large Language Models to Gated Recurrent Structures Paper • 2503.01496 • Published Mar 2025 • 15
MoM: Linear Sequence Modeling with Mixture-of-Memories Paper • 2502.13685 • Published Feb 2025 • 33
LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid Paper • 2502.07563 • Published Feb 2025 • 24
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Paper • 2501.12895 • Published Jan 22, 2025 • 57