SLERP merge example code?

#20
by grimjim - opened

Is example code for a SLERP merge available from Google? The current version of mergekit has a defect that results in layer duplication, turning a 9B model into a 10B one.

Google org

Hi @grimjim ,

As of now, Google doesn't provide public examples or documentation specifically on SLERP (Spherical Linear Interpolation) model merging, whether with tools like MergeKit or otherwise.
If MergeKit is causing issues, you can merge models manually using a controlled method, either simple weighted averaging or SLERP. Kindly use this reference link for more information.
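
To illustrate the manual approach, here is a minimal sketch of both methods in plain Python. It is not Google's or MergeKit's code. It assumes each model's weights are available as a dict mapping tensor names to flattened lists of floats; the names `lerp`, `slerp`, and `merge_weights` are hypothetical helpers chosen for this example. In practice you would apply the same arithmetic per tensor with NumPy or PyTorch.

```python
import math


def lerp(t, v0, v1):
    """Simple weighted average: (1 - t) * v0 + t * v1."""
    return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flattened weight vectors.

    The angle is measured between the normalized vectors, while the
    interpolation weights are applied to the original (unnormalized) values.
    """
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    # A near-zero vector has no direction, so fall back to weighted averaging.
    if norm0 < eps or norm1 < eps:
        return lerp(t, v0, v1)

    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    # Nearly parallel (or antiparallel) vectors: SLERP is ill-conditioned.
    if abs(sin_theta) < eps:
        return lerp(t, v0, v1)

    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def merge_weights(weights_a, weights_b, t):
    """Merge two {tensor_name: flat list of floats} dicts (assumed layout)."""
    # Requiring identical tensor names means the merged model cannot end up
    # with more tensors (or layers) than its inputs.
    if weights_a.keys() != weights_b.keys():
        raise ValueError("both models must have identical tensor names")
    merged = {}
    for name in weights_a:
        if len(weights_a[name]) != len(weights_b[name]):
            raise ValueError(f"size mismatch for tensor {name!r}")
        merged[name] = slerp(t, weights_a[name], weights_b[name])
    return merged
```

With `t = 0` the result equals the first model and with `t = 1` the second. Because the output has exactly the tensor names of the inputs, comparing the parameter count before and after the merge is a quick sanity check.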

Thank you.


Thanks. The mergekit defect has since been addressed, and I now have a Python script to remediate impacted merges.
https://github.com/jim-plus/Gemma2-mergekit-remediation

grimjim changed discussion status to closed
