Large Language Models (Causal LMs)¶
The CausalLMPool class provides a unified interface for managing and loading causal language models from the Hugging Face Transformers library, with flexible configuration options.
Configuration¶
The CausalLMPool can be configured using YAML files. Here are the main configuration options:
Basic Configuration¶
_target_: fusion_bench.modelpool.CausalLMPool # (1)
models:
  _pretrained_: path_to_pretrained_model # (2)
  model_a: path_to_model_a
  model_b: path_to_model_b
model_kwargs: # (3)
  torch_dtype: bfloat16 # or float16, float32, etc.
tokenizer: path_to_tokenizer # (4)
1. _target_ indicates the modelpool class to be instantiated.
2. _pretrained_, model_a, and model_b are the names of the models to be loaded. If a plain string is given as the value, it is passed to AutoModelForCausalLM.from_pretrained to load the model.
3. model_kwargs is a dictionary of keyword arguments passed to AutoModelForCausalLM.from_pretrained. These can be overridden by passing keyword arguments to the modelpool.load_model function.
4. tokenizer indicates the tokenizer to be loaded. If a plain string is given, it is passed to AutoTokenizer.from_pretrained.
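Such a configuration is typically consumed by FusionBench itself, but the pool can also be instantiated directly. A minimal sketch, assuming the YAML above is saved as causal_lm_pool.yaml (a hypothetical filename) and using Hydra's instantiate to resolve the _target_ field:
from omegaconf import OmegaConf
from hydra.utils import instantiate

# Load the YAML configuration shown above (hypothetical filename).
config = OmegaConf.load("causal_lm_pool.yaml")

# Build the CausalLMPool from its _target_ field. _recursive_=False keeps the
# per-model entries as plain config values so models can be loaded lazily.
modelpool = instantiate(config, _recursive_=False)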
Special Model Names in FusionBench
Names starting and ending with "_" are reserved for special purposes in FusionBench.
For example, _pretrained_ is a special model name that refers to the pre-trained (base) model. It can be loaded by calling modelpool.load_pretrained_model() or modelpool.load_model("_pretrained_").
Basic Usage¶
Information about the Model Pool¶
Get all the model names in the model pool except the special model names:
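For example (assuming the basic configuration above; model_names is expected to exclude special names such as _pretrained_):
>>> modelpool.model_names
['model_a', 'model_b']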
Check if a pre-trained model is in the model pool:
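For example (assuming a has_pretrained property, which should be True whenever a _pretrained_ entry is present):
>>> modelpool.has_pretrained
True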
Get all the model names in the model pool, including the special model names:
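For example (assuming an all_model_names property that also returns the special names):
>>> modelpool.all_model_names
['_pretrained_', 'model_a', 'model_b']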
Loading and Saving Models and Tokenizers¶
Load a model from the model pool by model name:
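For example:
>>> model = modelpool.load_model("model_a")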
Load a model from the model pool and pass/override additional arguments to the model constructor:
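For example, overriding the torch_dtype set in model_kwargs (the extra keyword arguments are forwarded to AutoModelForCausalLM.from_pretrained):
>>> import torch
>>> model = modelpool.load_model("model_a", torch_dtype=torch.float32)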
Load the pre-trained model from the model pool:
>>> pretrained_model = modelpool.load_pretrained_model()
# or equivalently
>>> pretrained_model = modelpool.load_model("_pretrained_")
Load the pre-trained model or the first model in the model pool:
# if there is a pre-trained model in the model pool, then it will be loaded
# otherwise, the first model in the model pool will be loaded
>>> model = modelpool.load_pretrained_or_first_model()
Load the tokenizer from the model pool:
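For example (assuming a load_tokenizer method that wraps AutoTokenizer.from_pretrained on the configured tokenizer path):
>>> tokenizer = modelpool.load_tokenizer()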
Save a model with tokenizer:
# Save model with tokenizer
>>> modelpool.save_model(
...     model=model,
...     path="path/to/save",
...     save_tokenizer=True,
...     push_to_hub=False,
... )
Advanced Configuration¶
You can also use a more detailed configuration with explicit model and tokenizer settings:
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_:
    _target_: transformers.AutoModelForCausalLM # (1)
    pretrained_model_name_or_path: path_to_pretrained_model
  model_a:
    _target_: transformers.AutoModelForCausalLM
    pretrained_model_name_or_path: path_to_model_a
tokenizer:
  _target_: transformers.AutoTokenizer # (2)
  pretrained_model_name_or_path: path_to_tokenizer
model_kwargs:
  torch_dtype: bfloat16
1. _target_ indicates the model class or function used to load the model. If a plain string is given as the value for a model entry, it is passed to AutoModelForCausalLM.from_pretrained. By setting _target_ explicitly, you can use a custom model class or function to load the model; for example, you can use load_peft_causal_lm to load a PEFT model.
2. _target_ indicates the tokenizer class or function used to load the tokenizer. If a plain string is given, it is passed to AutoTokenizer.from_pretrained. By setting _target_ explicitly, you can use a custom tokenizer class or function to load the tokenizer.
Working with PEFT Models¶
from fusion_bench.modelpool.causal_lm import load_peft_causal_lm
# Load a PEFT model
model = load_peft_causal_lm(
    base_model_path="path/to/base/model",
    peft_model_path="path/to/peft/model",
    torch_dtype="bfloat16",
    is_trainable=True,
    merge_and_unload=False,
)
Configuration Examples¶
Single Model Configuration¶
_target_: fusion_bench.modelpool.CausalLMPool
_recursive_: false
# each model should have a name and a path, and the model is loaded from the path
# this is equivalent to `AutoModelForCausalLM.from_pretrained(path)`
models:
  _pretrained_:
    _target_: transformers.LlamaForCausalLM.from_pretrained
    pretrained_model_name_or_path: ${...base_model}
model_kwargs:
  torch_dtype: float16
tokenizer:
  _target_: transformers.AutoTokenizer.from_pretrained
  pretrained_model_name_or_path: ${..base_model}
base_model: decapoda-research/llama-7b-hf
Multiple Models Configuration¶
Here we use models from MergeBench as an example.
Configuration for gemma-2-2b:
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-2b
  instruction: MergeBench/gemma-2-2b_instruction
  math: MergeBench/gemma-2-2b_math
  coding: MergeBench/gemma-2-2b_coding
  multilingual: MergeBench/gemma-2-2b_multilingual
  safety: MergeBench/gemma-2-2b_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-2b

Configuration for gemma-2-2b-it:
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-2b-it
  instruction: MergeBench/gemma-2-2b-it_instruction
  math: MergeBench/gemma-2-2b-it_math
  coding: MergeBench/gemma-2-2b-it_coding
  multilingual: MergeBench/gemma-2-2b-it_multilingual
  safety: MergeBench/gemma-2-2b-it_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-2b-it

Configuration for gemma-2-9b:
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-9b
  instruction: MergeBench/gemma-2-9b_instruction
  math: MergeBench/gemma-2-9b_math
  coding: MergeBench/gemma-2-9b_coding
  multilingual: MergeBench/gemma-2-9b_multilingual
  safety: MergeBench/gemma-2-9b_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-9b

Configuration for gemma-2-9b-it:
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-9b-it
  instruction: MergeBench/gemma-2-9b-it_instruction
  math: MergeBench/gemma-2-9b-it_math
  coding: MergeBench/gemma-2-9b-it_coding
  multilingual: MergeBench/gemma-2-9b-it_multilingual
  safety: MergeBench/gemma-2-9b-it_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-9b-it

Configuration for Llama-3.1-8B:
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: meta-llama/Llama-3.1-8B
  instruction: MergeBench/Llama-3.1-8B_instruction
  math: MergeBench/Llama-3.1-8B_math
  coding: MergeBench/Llama-3.1-8B_coding
  multilingual: MergeBench/Llama-3.1-8B_multilingual
  safety: MergeBench/Llama-3.1-8B_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: meta-llama/Llama-3.1-8B

Configuration for Llama-3.1-8B-Instruct:
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: meta-llama/Llama-3.1-8B-Instruct
  instruction: MergeBench/Llama-3.1-8B-Instruct_instruction
  math: MergeBench/Llama-3.1-8B-Instruct_math
  coding: MergeBench/Llama-3.1-8B-Instruct_coding
  multilingual: MergeBench/Llama-3.1-8B-Instruct_multilingual
  safety: MergeBench/Llama-3.1-8B-Instruct_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: meta-llama/Llama-3.1-8B-Instruct

Configuration for Llama-3.2-3B:
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: meta-llama/Llama-3.2-3B
  instruction: MergeBench/Llama-3.2-3B_instruction
  math: MergeBench/Llama-3.2-3B_math
  coding: MergeBench/Llama-3.2-3B_coding
  multilingual: MergeBench/Llama-3.2-3B_multilingual
  safety: MergeBench/Llama-3.2-3B_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: meta-llama/Llama-3.2-3B

Configuration for Llama-3.2-3B-Instruct:
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: meta-llama/Llama-3.2-3B-Instruct
  instruction: MergeBench/Llama-3.2-3B-Instruct_instruction
  math: MergeBench/Llama-3.2-3B-Instruct_math
  coding: MergeBench/Llama-3.2-3B-Instruct_coding
  multilingual: MergeBench/Llama-3.2-3B-Instruct_multilingual
  safety: MergeBench/Llama-3.2-3B-Instruct_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: meta-llama/Llama-3.2-3B-Instruct
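With any of these pools, the task-specific models can be enumerated and loaded by name. A minimal sketch (assuming the model_names property described earlier; the names correspond to the keys under models above):
# Iterate over the task-specific models in the pool (excludes _pretrained_).
for name in modelpool.model_names:
    model = modelpool.load_model(name)
    print(name, type(model).__name__)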
Merge Large Language Models with FusionBench¶
Merge gemma-2-2b models with simple average:
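A command of the following form should work, assuming a simple_average method config and a modelpool config named CausalLMPool/mergebench/gemma-2-2b analogous to the Dare-Ties example below; verify the names against your installed configs:
fusion_bench method=simple_average modelpool=CausalLMPool/mergebench/gemma-2-2b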
Merge gemma-2-2b models with Task Arithmetic:
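Assuming a task_arithmetic method config with a scaling_factor option (the value 0.3 is illustrative):
fusion_bench method=task_arithmetic method.scaling_factor=0.3 modelpool=CausalLMPool/mergebench/gemma-2-2b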
Merge Llama-3.1-8B models with Ties-Merging:
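Assuming a ties_merging method config and a CausalLMPool/mergebench/Llama-3.1-8B modelpool config following the same naming pattern:
fusion_bench method=ties_merging modelpool=CausalLMPool/mergebench/Llama-3.1-8B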
Merge Llama-3.1-8B-Instruct models with Dare-Ties, with 70% sparsity:
fusion_bench method=dare/ties_merging method.sparsity_ratio=0.7 modelpool=CausalLMPool/mergebench/Llama-3.1-8B-Instruct
Special Features¶
CausalLMBackbonePool¶
The CausalLMBackbonePool is a specialized version of CausalLMPool that returns only the transformer layers of the model. This is useful when you need to work with the model's backbone architecture directly.
from fusion_bench.modelpool import CausalLMBackbonePool
backbone_pool = CausalLMBackbonePool.from_config(config)
layers = backbone_pool.load_model("model_a") # Returns model.layers