OpenGVLab/InternViT-300M-448px-V2_5 · Shouldn't InternViT-300M-448px-V2_5 be the same as the vision model of InternVL2.5?

The InternVL2.5 paper states the following:

"""
In this report, we further refined the InternViT-300M by incrementally pre-training the previous weights on a more diverse data mixture using the NTP loss, leading to the enhanced InternViT-300M-448px-V2.5.
"""

Doesn't this mean that InternViT-300M-448px-V2_5 should be identical to the vision encoder extracted from InternVL2.5? I checked the vision encoders of InternVL2.5-1B/2B/4B/8B, and none of them shares the same parameters as InternViT-300M-448px-V2_5. Could you please clarify what the differences are?
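
For anyone who wants to reproduce this kind of check, here is a minimal sketch of a parameter-by-parameter comparison. It is not necessarily how the check above was done. The helper names (`strip_prefix`, `compare_weights`), the toy weights, and the `vision_model.` key prefix are assumptions for illustration; the real checkpoints may name their encoder weights differently.

```python
from typing import Dict, List, Mapping, Sequence

Weights = Mapping[str, Sequence[float]]


def strip_prefix(state: Weights, prefix: str) -> Dict[str, Sequence[float]]:
    """Keep only the entries whose key starts with `prefix`, with the prefix removed."""
    return {k[len(prefix):]: v for k, v in state.items() if k.startswith(prefix)}


def compare_weights(a: Weights, b: Weights, atol: float = 0.0) -> Dict[str, List[str]]:
    """Sort every parameter name into exactly one bucket describing how a and b relate."""
    report: Dict[str, List[str]] = {
        "missing_in_a": [],
        "missing_in_b": [],
        "size_mismatch": [],
        "value_mismatch": [],
        "equal": [],
    }
    for name in sorted(set(a) | set(b)):
        if name not in a:
            report["missing_in_a"].append(name)
        elif name not in b:
            report["missing_in_b"].append(name)
        elif len(a[name]) != len(b[name]):
            report["size_mismatch"].append(name)
        elif any(abs(x - y) > atol for x, y in zip(a[name], b[name])):
            report["value_mismatch"].append(name)
        else:
            report["equal"].append(name)
    return report


# Toy stand-ins for flattened weights (hypothetical names and values).
standalone_vit = {
    "embeddings.cls": [0.1, 0.2],
    "encoder.w": [1.0, 2.0, 3.0],
}
full_vlm = {
    "vision_model.embeddings.cls": [0.1, 0.2],
    "vision_model.encoder.w": [1.0, 2.0, 3.5],
    "language_model.w": [9.0],
}

# Assumed prefix: the encoder weights live under "vision_model." in the full model.
report = compare_weights(standalone_vit, strip_prefix(full_vlm, "vision_model."))
for bucket, names in report.items():
    print(f"{bucket}: {names}")
```

With PyTorch, the inputs could be built from a loaded model as `{k: v.float().flatten().tolist() for k, v in model.state_dict().items()}`. Casting both sides to the same dtype first matters, since comparing a bf16 checkpoint against an fp32 one would report value mismatches caused by rounding alone.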