M3C-retrieval / README.md
metadata
license: cc-by-4.0
datasets:
  - jihyoung/M3C
language:
  - en
base_model:
  - Qwen/Qwen2-VL-2B-Instruct

Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions

[📜 Paper] [🖥️ Project Page] [📖 Dataset] [🤗 Model Weights]

Image generated by DALL·E

✅ TODO List

  • Write documentation (README)
  • Release M³C dataset
  • Release dialogue module weights
  • Release retrieval module weights
  • Release training code
  • Release inference code
  • Release model self-chat code
  • Launch Gradio demo for live chat

📚 Citation

@article{jang2025enabling,
  title={Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions},
  author={Jang, Jihyoung and Bae, Minwook and Kim, Minji and Hakkani-Tur, Dilek and Kim, Hyounghun},
  journal={arXiv preprint arXiv:2506.00421},
  year={2025}
}