brycebywang committed on
Commit c246cd6 · verified · 1 Parent(s): 75a460c

Update README.md

Files changed (1)
  1. README.md +160 -3
README.md CHANGED
@@ -1,3 +1,160 @@
- ---
- license: mit
- ---
+ <!-- markdownlint-disable first-line-h1 -->
+ <!-- markdownlint-disable html -->
+ <!-- markdownlint-disable no-duplicate-header -->
+
+ # MatrixGame-V1: Interactive World Foundation Model
+ <div style="display: flex; justify-content: center; gap: 10px;">
+ <a href="https://github.com/SkyworkAI/MatrixGame-V1">
+ <img src="https://img.shields.io/badge/GitHub-100000?style=flat&logo=github&logoColor=white" alt="GitHub">
+ </a>
+ <a href="#todo">
+ <img src="https://img.shields.io/badge/arXiv-Report-b31b1b?style=flat&logo=arxiv&logoColor=white" alt="arXiv">
+ </a>
+ </div>
+
+ ## 📝 Overview
+ **MatrixGame** is a 17B-parameter Diffusion Transformer that generates high-resolution, physics-consistent videos of interactive game environments. Trained on large-scale data from Minecraft and Unreal Engine, it captures game physics such as collisions, destruction, and item placement. MatrixGame supports real-time, action-conditioned generation: the video content adapts dynamically to user input.
+
+ You can find more visualizations on our [website](#).
+
+ ## 🔥 Latest Updates
+
+ * [2025-05] 🎉 Initial release of MatrixGame-V1
+
+ ## 🚀 Performance Comparison
+ ### GameWorld Score Benchmark Comparison
+
+ | Model | Image Quality ↑ | Aesthetic ↑ | Temporal Cons. ↑ | Motion Smooth. ↑ | Keyboard Acc. ↑ | Mouse Acc. ↑ | 3D Cons. ↑ |
+ |-----------|------------------|-------------|-------------------|-------------------|------------------|---------------|-------------|
+ | Oasis | 0.65 | 0.48 | 0.94 | **0.98** | 0.77 | 0.56 | 0.56 |
+ | MineWorld | 0.69 | 0.47 | 0.95 | **0.98** | 0.86 | 0.64 | 0.51 |
+ | **Ours** | **0.72** | **0.49** | **0.97** | **0.98** | **0.95** | **0.95** | **0.76** |
+
+ **Metric Descriptions**:
+
+ - **Image Quality** / **Aesthetic**: Visual fidelity and perceptual appeal of generated frames
+ - **Temporal Cons.** / **Motion Smooth.**: Temporal coherence and smoothness between frames
+ - **Keyboard Acc.** / **Mouse Acc.**: Accuracy in following user control signals
+ - **3D Cons.**: Geometric stability and physical plausibility over time
+
+ ### Human Evaluation
+ <table>
+ <thead>
+ <tr>
+ <th>Group</th>
+ <th>Method</th>
+ <th>Overall Quality (%)</th>
+ <th>Controllability (%)</th>
+ <th>Visual Quality (%)</th>
+ <th>Temporal Consistency (%)</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td rowspan="3">Group A</td>
+ <td>Oasis</td>
+ <td>0.16</td>
+ <td>0.33</td>
+ <td>0.00</td>
+ <td>0.16</td>
+ </tr>
+ <tr>
+ <td>MineWorld</td>
+ <td>3.78</td>
+ <td>5.58</td>
+ <td>1.32</td>
+ <td>13.82</td>
+ </tr>
+ <tr>
+ <td><strong>Ours</strong></td>
+ <td><strong>96.05</strong></td>
+ <td><strong>94.09</strong></td>
+ <td><strong>98.68</strong></td>
+ <td><strong>86.02</strong></td>
+ </tr>
+ <tr>
+ <td rowspan="3">Group B</td>
+ <td>Oasis</td>
+ <td>0.66</td>
+ <td>0.82</td>
+ <td>0.75</td>
+ <td>0.66</td>
+ </tr>
+ <tr>
+ <td>MineWorld</td>
+ <td>2.79</td>
+ <td>5.76</td>
+ <td>1.48</td>
+ <td>6.25</td>
+ </tr>
+ <tr>
+ <td><strong>Ours</strong></td>
+ <td><strong>96.55</strong></td>
+ <td><strong>93.42</strong></td>
+ <td><strong>97.77</strong></td>
+ <td><strong>93.09</strong></td>
+ </tr>
+ <tr>
+ <td rowspan="3">Average</td>
+ <td>Oasis</td>
+ <td>0.41</td>
+ <td>0.58</td>
+ <td>0.38</td>
+ <td>0.41</td>
+ </tr>
+ <tr>
+ <td>MineWorld</td>
+ <td>3.29</td>
+ <td>5.67</td>
+ <td>1.40</td>
+ <td>10.04</td>
+ </tr>
+ <tr>
+ <td><strong>Ours</strong></td>
+ <td><strong>96.30</strong></td>
+ <td><strong>93.76</strong></td>
+ <td><strong>98.23</strong></td>
+ <td><strong>89.56</strong></td>
+ </tr>
+ </tbody>
+ </table>
+
+ > Double-blind human evaluation by two independent groups across four key dimensions: **Overall Quality**, **Controllability**, **Visual Quality**, and **Temporal Consistency**.
+ > Scores represent the percentage of pairwise comparisons in which each method was preferred. MatrixGame consistently outperforms prior models across all metrics and both groups.
+
+
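The README does not say how the "Average" rows of the human evaluation table are derived. The numbers are consistent with a plain mean of Group A and Group B, rounded half up to two decimals. The sketch below checks that reading under that assumption. Percentages are multiplied by 100 so that integer arithmetic suffices; `mean_half_up` and `check_averages` are hypothetical helper names and are not part of the MatrixGame repository.

```shell
#!/usr/bin/env bash
# Assumption: Average = mean(Group A, Group B), rounded half up to 2 decimals.
# All values are the table's percentages multiplied by 100 (e.g. 96.05 -> 9605).

# Hypothetical helper: integer mean of two values, rounding .5 upward.
mean_half_up() {
  echo $(( ($1 + $2 + 1) / 2 ))
}

# Columns: method metric group_a group_b reported_average
rows="Oasis overall 16 66 41
Oasis controllability 33 82 58
Oasis visual 0 75 38
Oasis temporal 16 66 41
MineWorld overall 378 279 329
MineWorld controllability 558 576 567
MineWorld visual 132 148 140
MineWorld temporal 1382 625 1004
Ours overall 9605 9655 9630
Ours controllability 9409 9342 9376
Ours visual 9868 9777 9823
Ours temporal 8602 9309 8956"

# Hypothetical helper: compare each reported average with the computed one.
check_averages() {
  local mismatches=0 computed
  while read -r method metric a b reported; do
    [ -z "$method" ] && continue
    computed=$(mean_half_up "$a" "$b")
    if [ "$computed" -ne "$reported" ]; then
      echo "mismatch: $method $metric computed=$computed reported=$reported"
      mismatches=$((mismatches + 1))
    fi
  done <<EOF
$rows
EOF
  echo "mismatches: $mismatches"
}

check_averages
```

Under this assumption every reported average matches, so the script prints `mismatches: 0`. This only shows that the table is internally consistent; it does not tell us how the authors actually aggregated the two groups.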
+ ## 🛠️ Installation
+
+ 1. Clone the repository:
+ ```bash
+ git clone https://github.com/SkyworkAI/MatrixGame-V1.git
+ cd MatrixGame-V1
+ ```
+
+ 2. Install dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
138
+
139
+ ## πŸš€ Quick Start
140
+
141
+ ```bash
142
+ bash run_inference.sh
143
+ ```
144
+
+ ## 🤝 Contributing
+
+ We welcome contributions! Please see our [contributing guidelines](CONTRIBUTING.md) for more details.
+
+ ## ⭐ Acknowledgements
+
+ We would like to express our gratitude to:
+
+ - [Diffusers](https://github.com/huggingface/diffusers) for their excellent diffusion model framework
+ - [HunyuanVideo](https://github.com/Tencent/HunyuanVideo) for their strong base model
+
+ We are grateful to the broader research community for their open exploration and contributions to the field of interactive world generation.
+
157
+
158
+ ## πŸ“„ License
159
+
160
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.