jihyoung/M3C-retrieval

	---
	license: cc-by-4.0
	datasets:
	- jihyoung/M3C
	language:
	- en
	base_model:
	- Qwen/Qwen2-VL-2B-Instruct
	---

	# Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions
	[\[📜 Paper\]](https://arxiv.org/abs/2506.00421) [\[🖥️ Project Page\]](https://m3c-dataset.github.io/) [\[📖 Dataset\]](https://huggingface.co/datasets/jihyoung/M3C) [\[🤗 Model Weights\]](https://huggingface.co/jihyoung/M3C-dialogue)

	<div align="center">
	<img width="500" alt="image" src="https://github.com/user-attachments/assets/76309007-498a-45ed-a4c2-dac43ee39bfc">
	<br>
	<sub>Image Generated by DALL·E</sub>
	</div>

	## ✅ TODO List

	- [ ] Write documentation (README)
	- [ ] Release M³C dataset
	- [ ] Release dialogue module weights
	- [ ] Release retrieval module weights
	- [ ] Release training code
	- [ ] Release inference code
	- [ ] Release model self-chat code
	- [ ] Launch Gradio demo for live chat

	## 📚 Citation

	```bibtex
	@article{jang2025enabling,
	title={Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions},
	author={Jang, Jihyoung and Bae, Minwook and Kim, Minji and Hakkani-Tur, Dilek and Kim, Hyounghun},
	journal={arXiv preprint arXiv:2506.00421},
	year={2025}
	}
	```