Gemma-2
Gemma-2-2B Models
This configuration includes the base model and specialized fine-tuned variants from MergeBench:
config/modelpool/CausalLMPool/mergebench/gemma-2-2b.yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-2b
  instruction: MergeBench/gemma-2-2b_instruction
  math: MergeBench/gemma-2-2b_math
  coding: MergeBench/gemma-2-2b_coding
  multilingual: MergeBench/gemma-2-2b_multilingual
  safety: MergeBench/gemma-2-2b_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-2b
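To make the roles of the keys concrete, the sketch below mirrors the `models` mapping as a plain Python dict and separates the base checkpoint from the task-specific variants. It assumes that names wrapped in underscores, such as `_pretrained_`, are treated as special entries rather than as models to merge. The helper `split_pool` is hypothetical and not part of fusion_bench.

```python
# Plain-dict mirror of the `models` section above (no fusion_bench import needed).
MODELS = {
    "_pretrained_": "google/gemma-2-2b",
    "instruction": "MergeBench/gemma-2-2b_instruction",
    "math": "MergeBench/gemma-2-2b_math",
    "coding": "MergeBench/gemma-2-2b_coding",
    "multilingual": "MergeBench/gemma-2-2b_multilingual",
    "safety": "MergeBench/gemma-2-2b_safety",
}


def split_pool(models):
    """Hypothetical helper: return (base, variants).

    Assumption: names wrapped in underscores (e.g. `_pretrained_`) are
    special entries; everything else is a fine-tuned variant.
    """
    base = models.get("_pretrained_")
    variants = {
        name: path
        for name, path in models.items()
        if not (name.startswith("_") and name.endswith("_"))
    }
    return base, variants


base, variants = split_pool(MODELS)
print("base:", base)
for name, path in variants.items():
    print(f"{name}: {path}")
```

Under this reading, the pool holds one base checkpoint and five fine-tuned variants, one per MergeBench task category.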
This configuration uses the instruction-tuned google/gemma-2-2b-it as the base model, together with the corresponding MergeBench fine-tuned variants:
config/modelpool/CausalLMPool/mergebench/gemma-2-2b-it.yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-2b-it
  instruction: MergeBench/gemma-2-2b-it_instruction
  math: MergeBench/gemma-2-2b-it_math
  coding: MergeBench/gemma-2-2b-it_coding
  multilingual: MergeBench/gemma-2-2b-it_multilingual
  safety: MergeBench/gemma-2-2b-it_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-2b-it
Model Fusion Experiments
Simple Average
fusion_bench path.log_dir=outputs/gemma-2-2b/simple_average \
    method=linear/simple_average_for_causallm \
    modelpool=CausalLMPool/mergebench/gemma-2-2b
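Conceptually, simple averaging takes the element-wise mean of corresponding parameters across the models being merged. The sketch below illustrates the idea on toy parameter lists in pure Python. It is not the fusion_bench implementation; the function name `simple_average` and the toy values are invented for illustration.

```python
def simple_average(state_dicts):
    """Element-wise mean of parameters that share the same name.

    Each state dict maps a parameter name to a flat list of floats.
    """
    if not state_dicts:
        raise ValueError("need at least one model")
    keys = state_dicts[0].keys()
    if any(sd.keys() != keys for sd in state_dicts):
        raise ValueError("all models must have the same parameter names")
    n = len(state_dicts)
    merged = {}
    for key in keys:
        # Group the i-th element of this parameter across all models.
        columns = zip(*(sd[key] for sd in state_dicts))
        merged[key] = [sum(col) / n for col in columns]
    return merged


# Toy stand-ins for two fine-tuned checkpoints (values are invented).
math_model = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
coding_model = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}

merged = simple_average([math_model, coding_model])
print(merged)  # {'layer.weight': [2.0, 4.0], 'layer.bias': [0.5]}
```

Averaging requires every model to expose the same parameter names and shapes, which is why the pool groups variants that share one architecture.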
Gemma-2-9B Models
This configuration includes the base model and specialized fine-tuned variants from MergeBench:
config/modelpool/CausalLMPool/mergebench/gemma-2-9b.yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-9b
  instruction: MergeBench/gemma-2-9b_instruction
  math: MergeBench/gemma-2-9b_math
  coding: MergeBench/gemma-2-9b_coding
  multilingual: MergeBench/gemma-2-9b_multilingual
  safety: MergeBench/gemma-2-9b_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-9b
This configuration uses the instruction-tuned google/gemma-2-9b-it as the base model, together with the corresponding MergeBench fine-tuned variants:
config/modelpool/CausalLMPool/mergebench/gemma-2-9b-it.yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-9b-it
  instruction: MergeBench/gemma-2-9b-it_instruction
  math: MergeBench/gemma-2-9b-it_math
  coding: MergeBench/gemma-2-9b-it_coding
  multilingual: MergeBench/gemma-2-9b-it_multilingual
  safety: MergeBench/gemma-2-9b-it_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-9b-it
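The simple-average command shown for Gemma-2-2B can presumably be adapted to the other model pools by swapping the `modelpool` value. The sketch below builds that command line for each of the four configurations. The `outputs/<name>/simple_average` log directories for pools other than `gemma-2-2b` are an assumption that extends the pattern of the original command, and `build_command` is a hypothetical helper.

```python
import shlex

# Config names taken from the YAML file names on this page.
POOLS = ["gemma-2-2b", "gemma-2-2b-it", "gemma-2-9b", "gemma-2-9b-it"]


def build_command(pool_name):
    """Hypothetical helper: fusion_bench arguments for simple averaging."""
    return [
        "fusion_bench",
        # Assumed log directory layout, mirroring the gemma-2-2b example.
        f"path.log_dir=outputs/{pool_name}/simple_average",
        "method=linear/simple_average_for_causallm",
        f"modelpool=CausalLMPool/mergebench/{pool_name}",
    ]


for pool in POOLS:
    # Print only; pass the list to subprocess.run(...) to actually launch it.
    print(shlex.join(build_command(pool)))
```

Keeping the arguments as a list avoids shell-quoting issues if the commands are later launched from Python rather than pasted into a terminal.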