Not enough parameters in 4b config.json
#14 opened by krammnic
Hello there!
We are adding Gemma3 to the torchtune: https://github.com/pytorch/torchtune/pull/2485
Unfortunately, it seems to me that there are not enough parameters in config.json for the 4B model.
For instance, let's compare the 4B "text_config" and the 12B "text_config":
4B:
```json
"text_config": {
  "hidden_size": 2560,
  "intermediate_size": 10240,
  "model_type": "gemma3_text",
  "num_hidden_layers": 34,
  "rope_scaling": {
    "factor": 8.0,
    "rope_type": "linear"
  },
  "sliding_window": 1024
},
```
12B:
```json
"text_config": {
  "hidden_size": 3840,
  "intermediate_size": 15360,
  "model_type": "gemma3_text",
  "num_attention_heads": 16,
  "num_hidden_layers": 48,
  "num_key_value_heads": 8,
  "rope_scaling": {
    "factor": 8.0,
    "rope_type": "linear"
  },
  "sliding_window": 1024
},
```
If this config is intentional for some reason, please let me know. Unfortunately, it makes the integration less clean, as this information (e.g. "num_attention_heads" and "num_key_value_heads", which the 4B config lacks) is required at the conversion stage.
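To illustrate why this matters at conversion time, here is a minimal sketch. It is not torchtune's actual converter; the function name, the builder argument names, and the fallback head counts are assumptions made only for this example. When the keys are present (as in the 12B config), everything can be read from config.json; when they are absent (as in the 4B config), the converter has to hard-code values instead.

```python
# Hypothetical sketch of mapping the HF "text_config" to model builder
# arguments. Names and fallback values are assumptions, not torchtune code.
import json

# Fallbacks the converter would have to hard-code for 4B, since config.json
# does not carry these fields (values assumed here for illustration only).
ASSUMED_4B_DEFAULTS = {
    "num_attention_heads": 8,
    "num_key_value_heads": 4,
}


def text_config_to_builder_args(config_path: str) -> dict:
    # Read the "text_config" sub-dict from the HF config.json.
    with open(config_path) as f:
        text_cfg = json.load(f)["text_config"]

    return {
        "embed_dim": text_cfg["hidden_size"],
        "intermediate_dim": text_cfg["intermediate_size"],
        "num_layers": text_cfg["num_hidden_layers"],
        # Present in the 12B config, missing from the 4B config, so the
        # converter must fall back to hard-coded defaults for 4B.
        "num_heads": text_cfg.get(
            "num_attention_heads", ASSUMED_4B_DEFAULTS["num_attention_heads"]
        ),
        "num_kv_heads": text_cfg.get(
            "num_key_value_heads", ASSUMED_4B_DEFAULTS["num_key_value_heads"]
        ),
    }
```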
Thanks!