Gemma-2

Gemma-2-2B Models

This configuration includes the base model and specialized fine-tuned variants from MergeBench:

config/modelpool/CausalLMPool/mergebench/gemma-2-2b.yaml

_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-2b
  instruction: MergeBench/gemma-2-2b_instruction
  math: MergeBench/gemma-2-2b_math
  coding: MergeBench/gemma-2-2b_coding
  multilingual: MergeBench/gemma-2-2b_multilingual
  safety: MergeBench/gemma-2-2b_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-2b

This configuration uses the instruction-tuned google/gemma-2-2b-it as the base model, together with the corresponding MergeBench fine-tuned variants:

config/modelpool/CausalLMPool/mergebench/gemma-2-2b-it.yaml

_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-2b-it
  instruction: MergeBench/gemma-2-2b-it_instruction
  math: MergeBench/gemma-2-2b-it_math
  coding: MergeBench/gemma-2-2b-it_coding
  multilingual: MergeBench/gemma-2-2b-it_multilingual
  safety: MergeBench/gemma-2-2b-it_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-2b-it

Model Fusion Experiments

Simple Average

fusion_bench path.log_dir=outputs/gemma-2-2b/simple_average \
    method=linear/simple_average_for_causallm \
    modelpool=CausalLMPool/mergebench/gemma-2-2b
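For intuition, simple averaging can be thought of as an element-wise mean over the parameters of the models being merged. The sketch below illustrates that idea on toy data with plain Python lists standing in for tensors. It is not FusionBench's implementation; the function name `simple_average` and the toy parameter names are hypothetical.

```python
from typing import Dict, List

# A toy "state dict": parameter name -> flat list of values.
# Real checkpoints hold tensors; lists keep this sketch dependency-free.
StateDict = Dict[str, List[float]]


def simple_average(state_dicts: List[StateDict]) -> StateDict:
    """Return the element-wise mean of the given parameter dictionaries."""
    if not state_dicts:
        raise ValueError("need at least one model")
    keys = state_dicts[0].keys()
    for sd in state_dicts[1:]:
        if sd.keys() != keys:
            raise ValueError("all models must share the same parameter names")

    n = len(state_dicts)
    merged: StateDict = {}
    for name in keys:
        # Group the i-th value of this parameter across all models.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [sum(col) / n for col in columns]
    return merged


# Hypothetical stand-ins for two fine-tuned variants.
math_model = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
coding_model = {"layer.weight": [3.0, 4.0], "layer.bias": [2.0]}

merged = simple_average([math_model, coding_model])
print(merged)  # {'layer.weight': [2.0, 3.0], 'layer.bias': [1.0]}
```

Every model gets equal weight, so the method has no hyperparameters to tune, which makes it a common baseline for other merging methods.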

Gemma-2-9B Models

This configuration includes the base model and specialized fine-tuned variants from MergeBench:

config/modelpool/CausalLMPool/mergebench/gemma-2-9b.yaml

_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-9b
  instruction: MergeBench/gemma-2-9b_instruction
  math: MergeBench/gemma-2-9b_math
  coding: MergeBench/gemma-2-9b_coding
  multilingual: MergeBench/gemma-2-9b_multilingual
  safety: MergeBench/gemma-2-9b_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-9b

This configuration uses the instruction-tuned google/gemma-2-9b-it as the base model, together with the corresponding MergeBench fine-tuned variants:

config/modelpool/CausalLMPool/mergebench/gemma-2-9b-it.yaml

_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-9b-it
  instruction: MergeBench/gemma-2-9b-it_instruction
  math: MergeBench/gemma-2-9b-it_math
  coding: MergeBench/gemma-2-9b-it_coding
  multilingual: MergeBench/gemma-2-9b-it_multilingual
  safety: MergeBench/gemma-2-9b-it_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-9b-it
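The simple-average command shown for gemma-2-2b should carry over to the other model pools by swapping the `modelpool` override, since each pool name matches its YAML file name under config/modelpool/CausalLMPool/mergebench/. The sketch below builds those command lines. The helper `simple_average_command` is hypothetical, and the `outputs/<pool>/simple_average` log directories for pools other than gemma-2-2b are assumptions that follow the pattern of the original command.

```python
import shlex
from typing import List


def simple_average_command(pool: str) -> List[str]:
    """Build the fusion_bench argument list for one MergeBench Gemma-2 pool."""
    return [
        "fusion_bench",
        # Assumed log directory layout, mirroring the gemma-2-2b example.
        f"path.log_dir=outputs/{pool}/simple_average",
        "method=linear/simple_average_for_causallm",
        f"modelpool=CausalLMPool/mergebench/{pool}",
    ]


# Pool names taken from the four config files on this page.
pools = ["gemma-2-2b", "gemma-2-2b-it", "gemma-2-9b", "gemma-2-9b-it"]

for pool in pools:
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(simple_average_command(pool)))
```

To launch a run directly instead of printing it, the same list can be passed to `subprocess.run(..., check=True)`.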