Hugo Pitorro

twigs

https://pitorro.de

AI & ML interests

None yet

Organizations

None yet

liked a model 5 days ago

Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B

31B • Updated 16 days ago • 14.2k • 146

liked a dataset 4 months ago

princeton-nlp/prolong-data-64K

Updated Oct 5, 2024 • 11.9k • 20

upvoted a paper 6 months ago

Should We Still Pretrain Encoders with Masked Language Modeling?

Paper • 2507.00994 • Published Jul 1 • 80

liked a dataset 6 months ago

EleutherAI/proof-pile-2

Updated Oct 25, 2023 • 5.07k • 209

updated a dataset 7 months ago

twigs/openmathinstruct2_chat_50k

Viewer • Updated May 29 • 500k • 7

published a dataset 7 months ago

twigs/openmathinstruct2_chat_50k

Viewer • Updated May 29 • 500k • 7

liked 2 datasets 7 months ago

tokyotech-llm/swallow-math

Viewer • Updated May 10 • 4.33M • 1.83k • 38

tokyotech-llm/swallow-code

Viewer • Updated Jul 4 • 129M • 2.83k • 59

liked a dataset 8 months ago

PrimeIntellect/SYNTHETIC-1-SFT-Data

Viewer • Updated Feb 21 • 894k • 326 • 34

upvoted a paper 10 months ago

RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 153

authored a paper 10 months ago

How Effective are State Space Models for Machine Translation?

Paper • 2407.05489 • Published Jul 7, 2024

commented a paper 10 months ago

LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models

Paper • 2502.15612 • Published Feb 21 • 4

authored a paper 10 months ago

LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models

Paper • 2502.15612 • Published Feb 21 • 4

upvoted a paper 10 months ago

LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models

Paper • 2502.15612 • Published Feb 21 • 4

upvoted 2 papers 11 months ago

OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

Paper • 2502.01061 • Published Feb 3 • 222

Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published Jan 31 • 24

liked a model 11 months ago

mistralai/Mistral-Small-24B-Instruct-2501

24B • Updated Jul 28 • 1.04M • 949

liked a dataset 11 months ago

deepmind/narrativeqa

Viewer • Updated Mar 6, 2024 • 28.7k • 11.1k • 60

liked 2 models about 1 year ago

deepseek-ai/DeepSeek-V3-Base

685B • Updated Mar 27 • 4.5k • 1.68k

nvidia/mamba2-8b-3t-4k

Text Generation • Updated Jun 13, 2024 • 21
