FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading • Paper • 2502.11433 • Published 25 days ago • 31
DPO-Shift: Shifting the Distribution of Direct Preference Optimization • Paper • 2502.07599 • Published about 1 month ago • 15
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • Paper • 2501.17161 • Published Jan 28 • 108
Diffusion Model Alignment Using Direct Preference Optimization • Paper • 2311.12908 • Published Nov 21, 2023 • 50
Facial Recognition • Collection • Face detection and recognition models that can be used for facial recognition in Immich. Models are sorted by size in descending order. • 4 items • Updated Nov 10, 2023 • 8
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models • Paper • 2402.19481 • Published Feb 29, 2024 • 22