Zhejiang University

community

https://www.zju.edu.cn/english/

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

stevenbucaille updated a Space 5 days ago

zju-community/efficientloftr

stevenbucaille published a Space 5 days ago

zju-community/efficientloftr

stevenbucaille new activity 7 days ago

zju-community/matchanything_eloftr:Can’t reproduce the same results as on the web interface with code

merve

posted an update about 11 hours ago

Post

178

large AI labs have dropped so many open models last week 🔥 don't miss out on them

→ Apple released on-device vision LMs apple/fastvlm-68ac97b9cd5cacefdd04872e & apple/mobileclip2-68ac947dcb035c54bcd20c47
→ OpenGVLab released InternVL3.5, 32 new vision LMs with one based on gpt-oss! (OS) OpenGVLab/internvl35-68ac87bd52ebe953485927fb
→ MSFT released a killer small TTS model (OS) microsoft/VibeVoice-1.5B

find more here: https://huggingface.co/collections/merve/august-29-releases-68b5a3754cfb8abf59e2b486

stevenbucaille

updated a Space 5 days ago

Efficientloftr Demo

Demo Space for EfficientLoFTR architecture in Transformers

stevenbucaille

published a Space 5 days ago

Efficientloftr Demo

Demo Space for EfficientLoFTR architecture in Transformers

merve

posted an update 7 days ago

Post

5782

first vision language model built off openai/gpt-oss-20b just dropped! 🔥

InternVL3.5 comes with 32 models 🤯 pre-trained, fine-tuned, aligned in various sizes OpenGVLab/internvl35-68ac87bd52ebe953485927fb
comes with gpt-oss or Qwen3 for LLM part ⤵️

1 reply

stevenbucaille

in zju-community/matchanything_eloftr 7 days ago

Can’t reproduce the same results as on the web interface with code

#4 opened 9 days ago by

gfdadfas

stevenbucaille

in zju-community/matchanything_eloftr 12 days ago

Bad Output

#3 opened 12 days ago by

jpbalarini

stevenbucaille

updated a model 12 days ago

zju-community/matchanything_eloftr

0.0B • Updated 12 days ago • 1.38k • 69

stevenbucaille

in zju-community/efficientloftr 12 days ago

Pretrained model

#3 opened 13 days ago by

Stefanvdp

stevenbucaille

in zju-community/matchanything_eloftr 12 days ago

ROMA safetensors

#1 opened 15 days ago by

alexdzm

qubvel-hf

updated a model 15 days ago

zju-community/efficientloftr

0.0B • Updated 15 days ago • 2.74k • 18

qubvel-hf

in zju-community/matchanything_eloftr 15 days ago

Demo does not run

#2 opened 15 days ago by

alexdzm

qubvel-hf

updated a model 15 days ago

zju-community/matchanything_eloftr

0.0B • Updated 12 days ago • 1.38k • 69

stevenbucaille

in zju-community/efficientloftr 19 days ago

Unable to recreate example.

#2 opened 21 days ago by

Boguc123

stevenbucaille

updated a model 19 days ago

zju-community/efficientloftr

0.0B • Updated 15 days ago • 2.74k • 18

stevenbucaille

published a model 19 days ago

zju-community/matchanything_eloftr

0.0B • Updated 12 days ago • 1.38k • 69

merve

posted an update 26 days ago

Post

3229

GPT-4.1-mini level model right in your iPhone 🤯

openbmb/MiniCPM-V-4 is only 4B while surpassing GPT-4.1-mini in vision benchmarks 🔥

allows commercial use as well!

merve

posted an update 28 days ago

Post

1118

we're all sleeping on this OCR model rednote-hilab/dots.ocr 🔥

dots.ocr is a new 3B model with sota performance, support for 100 languages & allowing commercial use! 🤯

single e2e model to extract images, convert tables, formulas, and more into markdown 📝
try it MohamedRashad/Dots-OCR

merve

posted an update 28 days ago

Post

654

massive releases and tons of FLUX.1 Krea LoRAs this past week!
here's some of the picks, find more models in collection 🫡 merve/releases-august-2-6890c14248203522b7d0267f

LLMs 💬
> Tencent dropped tencent/Hunyuan-7B-Instruct
> Qwen released Qwen/Qwen3-Coder-30B-A3B-Instruct, 30B MoE with 3B active params for coding (OS)

vision/multimodal
> RedNote released rednote-hilab/dots.ocr - 3B OCR model (OS)
> Cohere released CohereLabs/command-a-vision-07-2025 - 112B (dense!) VLM for 6 languages
> StepFun-AI shipped stepfun-ai/step3 - 321B MoE VLM (OS)
> Skywork shipped Skywork/Skywork-UniPic-1.5B - new any-to-any model (image+text → image+text) (OS)

merve

posted an update about 1 month ago

Post

2231

Cohere just dropped CohereLabs/command-a-vision-07-2025, a 112B (dense!) vision LM
> based on SigLIP2 & Command-A
> built for enterprise use cases 🔥
> use with Inference Providers or transformers 🤗
read their blog https://huggingface.co/blog/CohereLabs/introducing-command-a-vision-07-2025

2 replies

merve

posted an update about 1 month ago

Post

3611

past week in open AI was insane 🔥 here's some of the picks, find more here merve/releases-july-25-688768ca47fe3693407e02d1

💬 LLMs & VLMs
> Qwen/Qwen3-235B-A22B-Thinking-2507 had a new update (OS)
> Qwen/Qwen3-Coder-480B-A35B-Instruct is out with 480B total 35B active params 🤯 (OS)
> AllenAI dropped an update to allenai/olmOCR-7B-0725 📝
> InternLM released internlm/Intern-S1 - 235B Qwen3 MoE + 6B InternViT encoder (OS)
> OmniSVG/OmniSVG is a new SVG generation VLM (OS)

🖼️ image/video/3D generation
> WanAI released Wan2.2 series - both T2V and I2V 14B models for high-quality video generation (OS) multimodalart/wan-22-688767e313337b434ed55112
> Tencent dropped tencent/HunyuanWorld-1 - image-to-3D scene generation model

1 reply

Team members 4
