---
license: mit
datasets:
- conceptual_captions
- sbu_captions
- visual_genome
language:
- en
tags:
- ManagerTower
---

Model weights for the ACL 2023 oral paper [ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning](https://arxiv.org/abs/2306.00103).

Additional materials: [Code](https://github.com/LooperXX/ManagerTower), [Slides](https://looperxx.github.io/files/ManagerTower-ACL23-PPT-2023-06-EN-12min.pdf), [Video (EN)](https://youtu.be/SOHprfiiClQ), [Video (CN)](https://www.bilibili.com/video/BV17s4y1y7Ny), [Blog (CN)](http://looperxx.github.io/blog/ManagerTower), [Tweet (EN)](https://twitter.com/looperxx27/status/1678341890809401346).