MLAdaptiveIntelligence/LLaVAction-7B

mwmathis committed on Mar 24

Commit c712fec (verified) · 1 parent: 73e5f3d

Update README.md

Files changed (1)

README.md +18 -0

@@ -95,6 +95,24 @@ model-index:
 # LLaVAction-7B
+<div align="center">
+<h2>LLaVAction: evaluating and training multi-modal large language models for action recognition
+</h2>
+[Shaokai Ye](https://jwyang.github.io/)<sup>*</sup><sup>1</sup>&nbsp;
+[Haozhe Qi](https://cs-people.bu.edu/rxtan/)<sup>*</sup><sup>1</sup>&nbsp;
+[Alexander Mathis](https://qianhuiwu.github.io/)<sup>1</sup><sup>†</sup>&nbsp;
+[Mackenzie Weygandt Mathis](https://ruijiezheng.com/)<sup>1</sup><sup>†</sup><sup>‡</sup>&nbsp;
+<sup>1</sup> EPFL
+<sup>*</sup> First authors  <sup>†</sup> Senior Authors  <sup>‡</sup> Corresponding Author
+\[[arXiv Paper](https://www.arxiv.org/tbd)\] &nbsp; \[[Project Page](https://mmathislab.github.io/llavaction/)\] &nbsp; \[[Github Repo](https://github.com/AdaptiveMotorControlLab/LLaVAction)\] &nbsp;
+</div>
 ## Model Summary
 The LLaVAction-7B model is trained on EPIC-KITCHENS-100-MQA, based on Qwen2 language model with a context window of 32K tokens.
 This model supports at most 64 frames.
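
The model summary caps input at 64 frames but does not say how frames should be chosen from a longer clip. One common approach is uniform temporal sampling; the sketch below illustrates it under that assumption. The helper name `sample_frame_indices` and the constant `MAX_FRAMES` are hypothetical and are not part of the LLaVAction code base.

```python
MAX_FRAMES = 64  # upper bound stated in the model summary


def sample_frame_indices(total_frames: int, max_frames: int = MAX_FRAMES) -> list[int]:
    """Return at most `max_frames` evenly spaced frame indices.

    Assumption: uniform sampling over the whole clip. The model card does
    not specify a sampling strategy, so this is only an illustration.
    """
    if total_frames <= 0 or max_frames <= 0:
        return []
    if total_frames <= max_frames:
        # Short clip: keep every frame.
        return list(range(total_frames))
    if max_frames == 1:
        return [0]
    # Integer arithmetic keeps the first and last frames and guarantees
    # strictly increasing indices when total_frames > max_frames.
    last = total_frames - 1
    return [(i * last) // (max_frames - 1) for i in range(max_frames)]
```

For a 1000-frame clip this yields 64 indices that start at frame 0 and end at frame 999; clips with 64 frames or fewer are passed through unchanged.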