YummyYum committed on
Commit 89dbebb · verified · 1 Parent(s): 13fd13e

Upload README.md

Files changed (1):
  1. README.md +38 -75

README.md CHANGED
@@ -1,6 +1,6 @@
  # Introduction

- MiniCPM_o_2.6-FlagOS-NVIDIA provides an all-in-one deployment solution, enabling execution of MiniCPM_o_2.6 on NVIDIA GPUs. As the first-generation release for the NVIDIA-H100, this package delivers two key features:

  1. Comprehensive Integration:
  - Integrated with FlagScale (https://github.com/FlagOpen/FlagScale).
@@ -29,13 +29,7 @@ We use a variety of Triton-implemented operation kernels—approximately 70%—t

  - Most Triton kernels are provided by FlagGems (https://github.com/FlagOpen/FlagGems). You can enable FlagGems kernels by setting the environment variable USE_FLAGGEMS. For more details, please refer to the "How to Run Locally" section.

- - Also included are Triton kernels from vLLM.
-
- # Bundle Download
-
- |             | Usage                                                   | Nvidia                                                       |
- | ----------- | ------------------------------------------------------- | ------------------------------------------------------------ |
- | Basic Image | Basic software environment that supports model running  | `docker pull flagrelease-registry.cn-beijing.cr.aliyuncs.com/flagrelease/flagrelease:deepseek-flagos-nvidia` |

  # Evaluation Results

@@ -58,96 +52,65 @@ We use a variety of Triton-implemented operation kernels—approximately 70%—t

  ## 📌 Getting Started

- ### Download open-source weights
-
- ```
- pip install modelscope
- modelscope download --model <Model Name> --local_dir <Cache Path>
- ```
-
- ### Download the FlagOS image
-
- ```
- docker pull <IMAGE>
- ```
-
- ### Start the inference service
-
- ```
- docker run -itd --name flagrelease_nv --privileged --gpus all --net=host --ipc=host --device=/dev/infiniband --shm-size 512g --ulimit memlock=-1 -v <CKPT_PATH>:<CKPT_PATH> flagrelease-registry.cn-beijing.cr.aliyuncs.com/flagrelease/flagrelease:deepseek-flagos-nvidia /bin/bash
  docker exec -it flagrelease_nv /bin/bash

  conda activate flagscale-inference
  ```

  ### Download and install FlagGems

- ```
  git clone https://github.com/FlagOpen/FlagGems.git
  cd FlagGems
- pip install .
  cd ../
  ```

- ### Modify the configuration

- ```
- cd FlagScale/examples/minicpm_o_2.6/conf
- # Modify the configuration in config_minicpm_o_2.6.yaml
- defaults:
-   - _self_
-   - serve: minicpm_o_2.6
- experiment:
-   exp_name: minicpm_o_2.6
-   exp_dir: outputs/${experiment.exp_name}
-   task:
-     type: serve
-   deploy:
-     use_fs_serve: false
-   runner:
-     ssh_port: 22
-   envs:
-     CUDA_DEVICE_MAX_CONNECTIONS: 1
-   cmds:
-     before_start: source /root/miniconda3/bin/activate flagscale-inference && export USE_FLAGGEMS=1
- action: run
- hydra:
-   run:
-     dir: ${experiment.exp_dir}/hydra
  ```

- ```
- cd FlagScale/examples/minicpm_o_2.6/conf/serve
- # Modify the configuration in minicpm_o_2.6.yaml
- - serve_id: vllm_model
-   engine: vllm
-   engine_args:
-     model: /models/MiniCPM_o_2 # path to the MiniCPM_o_2.6 weights
-     served_model_name: minicpmo26-flagos
-     tensor_parallel_size: 1
-     pipeline_parallel_size: 1
-     gpu_memory_utilization: 0.9
-     max_num_seqs: 256
-     limit_mm_per_prompt: image=18
-     port: 9010
-     trust_remote_code: true
-     enable_chunked_prefill: true
- ```

- ```
  # install flagscale
- cd FlagScale/
  pip install .

- # [Verifiable on a single machine]
- ```
-
- ### Serve
-
- ```
- flagscale serve <Model>
  ```

  # Contributing

@@ -169,4 +132,4 @@ send "FlagRelease"

  # License

- This project and related model weights are licensed under the MIT License.
 
  # Introduction

+ MiniCPM_o_2.6-FlagOS-NVIDIA provides an all-in-one deployment solution, enabling execution of MiniCPM_o_2.6 on NVIDIA GPUs. As the first-generation release for the NVIDIA-H100, this package delivers three key features:

  1. Comprehensive Integration:
  - Integrated with FlagScale (https://github.com/FlagOpen/FlagScale).

  - Most Triton kernels are provided by FlagGems (https://github.com/FlagOpen/FlagGems). You can enable FlagGems kernels by setting the environment variable USE_FLAGGEMS; a minimal usage sketch follows this list. For more details, please refer to the "How to Run Locally" section.

+ - Also included are Triton kernels from vLLM, including fused MoE.
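Taken together with the sample config further down (which exports the same variable via `cmds.before_start`), a minimal sketch of the toggle; the value `1` is an assumption, as the README does not spell out accepted values:

```bash
# Enable FlagGems Triton kernels for the serve process (value "1" assumed)
export USE_FLAGGEMS=1
flagscale serve minicpm_o_2.6

# Unset it to fall back to the stock CUDA/vLLM kernels
unset USE_FLAGGEMS
```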
 
  # Evaluation Results

  ## 📌 Getting Started
+ ### Environment Setup

+ ```bash
+ # install FlagScale
+ git clone https://github.com/FlagOpen/FlagScale.git
+ cd FlagScale
+ pip install .

+ # download image and ckpt
+ flagscale pull --image flagrelease-registry.cn-beijing.cr.aliyuncs.com/flagrelease/flagrelease:deepseek-flagos-nvidia --ckpt https://www.modelscope.cn/models/FlagRelease/MiniCPM_o_2.6-FlagOS-Nvidia.git --ckpt-path /nfs/MiniCPM_o_2.6

+ # Note: For security reasons, this image does not ship with passwordless access preconfigured. In multi-machine scenarios, you need to configure passwordless access between containers yourself (see the sketch after this block).

+ # build and enter the container
+ docker run -itd --name flagrelease_nv --privileged --gpus all --net=host --ipc=host --device=/dev/infiniband --shm-size 512g --ulimit memlock=-1 -v /nfs:/nfs flagrelease-registry.cn-beijing.cr.aliyuncs.com/flagrelease/flagrelease:deepseek-flagos-nvidia /bin/bash
  docker exec -it flagrelease_nv /bin/bash

  conda activate flagscale-inference
  ```
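The note above leaves passwordless access between containers to the user. A minimal sketch with standard OpenSSH tooling; the `root@nodeB` target and the port are placeholders for your cluster, not values from this repo:

```bash
# On node A, inside the container: generate a key pair if none exists yet
ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519

# Push the public key to the container on node B (adjust user/host/port)
ssh-copy-id -i ~/.ssh/id_ed25519.pub -p 22 root@nodeB

# Verify that login now succeeds without a password prompt
ssh -p 22 root@nodeB hostname
```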

  ### Download and install FlagGems

+ ```bash
  git clone https://github.com/FlagOpen/FlagGems.git
  cd FlagGems
+ pip install ./ --no-deps
  cd ../
  ```
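A quick sanity check that the install took effect; this assumes FlagGems exposes the `flag_gems` module name used in its upstream README:

```bash
# Exits with an ImportError if FlagGems is not importable from this environment
python -c "import flag_gems; print('FlagGems import OK')"
```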

+ ### Download FlagScale and build vllm

+ ```bash
+ git clone https://github.com/FlagOpen/FlagScale.git
+ cd FlagScale/
+ git checkout ae85925798358d95050773dfa66680efdb0c2b28
+ cd vllm
+ pip install .
+ cd ../
  ```
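Since this builds vLLM from the source tree vendored in FlagScale at a pinned commit, it is worth confirming which vllm the environment now resolves (a generic check, not FlagScale-specific):

```bash
# Print the version and install path of the vLLM that will actually be imported
python -c "import vllm; print(vllm.__version__, vllm.__file__)"
```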

+ ### Serve

+ ```bash
+ # config the minicpm_o_2.6 yaml
+
+ FlagScale/
+ ├── examples/
+ │   └── minicpm_o_2.6/
+ │       └── conf/
+ │           ├── config_minicpm_o_2.6.yaml  # set hostfile and ssh_port (optional); if containers already have passwordless access between them, remove the docker field
+ │           └── serve/
+ │               └── minicpm_o_2.6.yaml     # set model parameters and server port

  # install flagscale
  pip install .

+ # serve
+ flagscale serve minicpm_o_2.6
  ```
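Once `flagscale serve minicpm_o_2.6` is up, the vLLM engine behind it exposes an OpenAI-compatible HTTP API. A minimal smoke test, assuming the `port: 9010` and `served_model_name: minicpmo26-flagos` values from the sample serve config shown earlier; adjust both if you changed minicpm_o_2.6.yaml:

```bash
# Text-only chat-completion request against the local serve endpoint
curl http://localhost:9010/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "minicpmo26-flagos",
        "messages": [{"role": "user", "content": "Describe MiniCPM-o 2.6 in one sentence."}],
        "max_tokens": 64
      }'
```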

  # Contributing

  # License

+ The weights of this model are based on OpenBMB/MiniCPM-o-2_6 and are open-sourced under the Apache 2.0 License: https://www.apache.org/licenses/LICENSE-2.0.txt.