Upload README.md with huggingface_hub
README.md CHANGED
````diff
@@ -518,12 +518,12 @@ LMDeploy abstracts the complex inference process of multi-modal Vision-Language
 #### A 'Hello, world' Example
 
 ```python
-from lmdeploy import pipeline, TurbomindEngineConfig
+from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
 from lmdeploy.vl import load_image
 
 model = 'OpenGVLab/InternVL3-78B'
 image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
-pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=16384, tp=4))
+pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=16384, tp=4), chat_template_config=ChatTemplateConfig(model_name='internvl2_5'))
 response = pipe(('describe this image', image))
 print(response.text)
 ```
````
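The change above pins the chat template explicitly via `ChatTemplateConfig(model_name='internvl2_5')`. For readability, here is the 'Hello, world' snippet as it reads after this change, reformatted but otherwise unchanged (it assumes `lmdeploy` is installed and four GPUs are available for `tp=4`):

```python
from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
from lmdeploy.vl import load_image

model = 'OpenGVLab/InternVL3-78B'
# Demo image from the mmdeploy test assets
image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')

# session_len sizes the context window; tp=4 shards the model across 4 GPUs.
# The chat template is pinned to 'internvl2_5', which this card reuses for InternVL3.
pipe = pipeline(
    model,
    backend_config=TurbomindEngineConfig(session_len=16384, tp=4),
    chat_template_config=ChatTemplateConfig(model_name='internvl2_5'),
)
response = pipe(('describe this image', image))
print(response.text)
```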
````diff
@@ -535,12 +535,12 @@ If `ImportError` occurs while executing this case, please install the required d
 When dealing with multiple images, you can put them all in one list. Keep in mind that multiple images will lead to a higher number of input tokens, and as a result, the size of the context window typically needs to be increased.
 
 ```python
-from lmdeploy import pipeline, TurbomindEngineConfig
+from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
 from lmdeploy.vl import load_image
 from lmdeploy.vl.constants import IMAGE_TOKEN
 
 model = 'OpenGVLab/InternVL3-78B'
-pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=16384, tp=4))
+pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=16384, tp=4), chat_template_config=ChatTemplateConfig(model_name='internvl2_5'))
 
 image_urls=[
     'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg',
````
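This hunk is cut off inside the `image_urls` list by the diff context window. Below is a sketch of how the multi-image example typically continues from there; the second URL and the prompt wording are illustrative assumptions, not part of this diff:

```python
# Illustrative continuation of the truncated snippet above
image_urls = [
    'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg',
    # Placeholder second image; the diff cuts off before the real one
    'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg',
]
images = [load_image(url) for url in image_urls]

# Numbering the images with IMAGE_TOKEN lets the prompt refer to each one explicitly
response = pipe((f'Image-1: {IMAGE_TOKEN}\nImage-2: {IMAGE_TOKEN}\ndescribe these two images', images))
print(response.text)
```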
````diff
@@ -558,11 +558,11 @@ print(response.text)
 Conducting inference with batch prompts is quite straightforward; just place them within a list structure:
 
 ```python
-from lmdeploy import pipeline, TurbomindEngineConfig
+from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
 from lmdeploy.vl import load_image
 
 model = 'OpenGVLab/InternVL3-78B'
-pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=16384, tp=4))
+pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=16384, tp=4), chat_template_config=ChatTemplateConfig(model_name='internvl2_5'))
 
 image_urls=[
     "https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg",
````
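Like the previous hunk, this one ends inside `image_urls`. Here is a sketch of the batch pattern the lead-in sentence describes, pairing each prompt with an image in a list; the final `print(response)` is taken from the next hunk's context line, while the list comprehension is an illustrative assumption:

```python
# Illustrative sketch: one (prompt, image) tuple per batch element
prompts = [('describe this image', load_image(url)) for url in image_urls]
# A list of prompts is processed as a batch; the result is a list of
# responses in the same order as the inputs.
response = pipe(prompts)
print(response)
```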
````diff
@@ -578,11 +578,11 @@ print(response)
 There are two ways to do the multi-turn conversations with the pipeline. One is to construct messages according to the format of OpenAI and use above introduced method, the other is to use the `pipeline.chat` interface.
 
 ```python
-from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
+from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig, ChatTemplateConfig
 from lmdeploy.vl import load_image
 
 model = 'OpenGVLab/InternVL3-78B'
-pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=16384, tp=4))
+pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=16384, tp=4), chat_template_config=ChatTemplateConfig(model_name='internvl2_5'))
 
 image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg')
 gen_config = GenerationConfig(top_k=40, top_p=0.8, temperature=0.8)
````
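The hunk stops right after `gen_config`. Below is a sketch of how the `pipeline.chat` interface mentioned in the prose is typically driven across turns; `sess.response.text` is confirmed by the next hunk's context line, but the follow-up question and the `session=` keyword are illustrative assumptions:

```python
# First turn: pass the (prompt, image) tuple together with the sampling config
sess = pipe.chat(('describe this image', image), gen_config=gen_config)
print(sess.response.text)

# Follow-up turn: reuse the returned session object to keep conversation state
sess = pipe.chat('What is the woman doing?', session=sess, gen_config=gen_config)
print(sess.response.text)
```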
````diff
@@ -597,7 +597,7 @@ print(sess.response.text)
 LMDeploy's `api_server` enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below are an example of service startup:
 
 ```shell
-lmdeploy serve api_server OpenGVLab/InternVL3-78B --server-port 23333 --tp 4
+lmdeploy serve api_server OpenGVLab/InternVL3-78B --chat-template internvl2_5 --server-port 23333 --tp 4
 ```
 
 To use the OpenAI-style interface, you need to install OpenAI:
````
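Once the server is up, the OpenAI-style interface mentioned above can be exercised with the official `openai` client. A minimal sketch, assuming the server runs locally on the `--server-port 23333` from the command above (a local `api_server` started without API keys typically accepts any placeholder key):

```python
from openai import OpenAI

# Point the client at the local api_server's OpenAI-compatible endpoint
client = OpenAI(api_key='YOUR_API_KEY', base_url='http://0.0.0.0:23333/v1')
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=[{
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'describe this image'},
            {'type': 'image_url', 'image_url': {
                'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg'}},
        ],
    }],
    temperature=0.8,
    top_p=0.8,
)
print(response.choices[0].message.content)
```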