DDRO-Reference-Policies (SFT) Collection: Step-2 SFT reference policies (π_ref) used to initialize DDRO (MS MARCO / NQ; PQ and Title+URL DocIDs); use these for fair comparisons/ablations. • 4 items • Updated 17 days ago
DDRO-Generative-Document-Retrieval Collection: Step-3 DDRO-optimized checkpoints (final policy) plus accompanying datasets/artifacts (docIDs, pseudo-queries, test sets) to reproduce the paper. • 8 items • Updated 17 days ago