# Large Language Models (Causal LMs)

The `CausalLMPool` class provides a unified interface for managing and loading causal language models from the Hugging Face Transformers library, with flexible configuration options.
## Configuration

The `CausalLMPool` can be configured using YAML files. Here are the main configuration options:
### Basic Configuration

```yaml
_target_: fusion_bench.modelpool.CausalLMPool # (1)
models:
  _pretrained_: path_to_pretrained_model # (2)
  model_a: path_to_model_a
  model_b: path_to_model_b
model_kwargs: # (3)
  torch_dtype: bfloat16 # or float16, float32, etc.
tokenizer: path_to_tokenizer # (4)
```
1. `_target_` indicates the model pool class to be instantiated.
2. `_pretrained_`, `model_a`, and `model_b` are the names of the models to be loaded. If a plain string is given as the value, it is passed to `AutoModelForCausalLM.from_pretrained` to load the model.
3. `model_kwargs` is a dictionary of keyword arguments passed to `AutoModelForCausalLM.from_pretrained`. It can be overridden by passing keyword arguments to the `modelpool.load_model` method.
4. `tokenizer` indicates the tokenizer to be loaded. If a plain string is given, it is passed to `AutoTokenizer.from_pretrained`.
**Special Model Names in FusionBench**

Names starting and ending with `_` are reserved for special purposes in FusionBench. For example, `_pretrained_` is a special model name that designates the pre-trained model; it can be loaded by calling `modelpool.load_pretrained_model()` or, equivalently, `modelpool.load_model("_pretrained_")`.
## Basic Usage
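In the examples below, `modelpool` is an instantiated `CausalLMPool`. As a minimal sketch, it can be built from a YAML file like the one above with Hydra/OmegaConf (the file name here is hypothetical; FusionBench programs normally construct the model pool from the main Hydra config):

```python
from hydra.utils import instantiate
from omegaconf import OmegaConf

# load the YAML configuration and instantiate the class given by `_target_`
config = OmegaConf.load("causal_lm_pool.yaml")  # hypothetical file name
modelpool = instantiate(config, _recursive_=False)
```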
### Information about the Model Pool

You can inspect the pool before loading any model, as shown in the sketch after this list:

- get all model names in the model pool, excluding the special model names;
- check whether a pre-trained model is in the model pool;
- get all model names in the model pool, including the special model names.
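A minimal sketch of these queries, assuming the basic configuration above. The property names `model_names`, `has_pretrained`, and `all_model_names` come from `BaseModelPool`; treat them as assumptions and check the `BaseModelPool` reference if your version differs:

```python
>>> modelpool.model_names       # model names, excluding special names such as "_pretrained_"
['model_a', 'model_b']
>>> modelpool.has_pretrained    # whether a "_pretrained_" entry exists in the pool
True
>>> modelpool.all_model_names   # all model names, including the special ones
['_pretrained_', 'model_a', 'model_b']
```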
### Loading and Saving Models and Tokenizers
Load a model from the model pool by model name:
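```python
>>> model = modelpool.load_model("model_a")  # "model_a" is an entry under `models` in the YAML config
```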
Load a model from the model pool and pass/override additional arguments to the model constructor:
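```python
>>> import torch
>>> # keyword arguments passed here override the `model_kwargs` from the configuration
>>> model = modelpool.load_model("model_a", torch_dtype=torch.float32)
```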
Load the pre-trained model from the model pool:
```python
>>> pretrained_model = modelpool.load_pretrained_model()
# or equivalently
>>> pretrained_model = modelpool.load_model("_pretrained_")
```
Load the pre-trained model or the first model in the model pool:
```python
# if there is a pre-trained model in the model pool, then it will be loaded
# otherwise, the first model in the model pool will be loaded
>>> model = modelpool.load_pretrained_or_first_model()
```
Load the tokenizer from the model pool:
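```python
>>> tokenizer = modelpool.load_tokenizer()
```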
Save a model with tokenizer:
```python
# Save model with tokenizer
>>> modelpool.save_model(
...     model=model,
...     path="path/to/save",
...     save_tokenizer=True,
...     push_to_hub=False,
... )
```
## Advanced Configuration

You can also use a more detailed configuration with explicit model and tokenizer settings:

```yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_:
    _target_: transformers.AutoModelForCausalLM # (1)
    pretrained_model_name_or_path: path_to_pretrained_model
  model_a:
    _target_: transformers.AutoModelForCausalLM
    pretrained_model_name_or_path: path_to_model_a
tokenizer:
  _target_: transformers.AutoTokenizer # (2)
  pretrained_model_name_or_path: path_to_tokenizer
model_kwargs:
  torch_dtype: bfloat16
```
1. `_target_` indicates the model class or callable used to load the model (whereas a plain string value would be passed directly to `AutoModelForCausalLM.from_pretrained`). By setting `_target_`, you can use a custom model class or function to load the model; for example, you can use `load_peft_causal_lm` to load a PEFT model.
2. `_target_` indicates the tokenizer class or callable used to load the tokenizer (whereas a plain string value would be passed directly to `AutoTokenizer.from_pretrained`). By setting `_target_`, you can use a custom tokenizer class or function to load the tokenizer.
## Working with PEFT Models

```python
from fusion_bench.modelpool.causal_lm import load_peft_causal_lm

# Load a PEFT model
model = load_peft_causal_lm(
    base_model_path="path/to/base/model",
    peft_model_path="path/to/peft/model",
    torch_dtype="bfloat16",
    is_trainable=True,
    merge_and_unload=False,
)
```
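As noted in the advanced configuration above, `load_peft_causal_lm` can also serve as the `_target_` of a model entry, so the pool loads the PEFT model directly. A sketch, where the entry name `expert_peft` and the paths are placeholders:

```yaml
models:
  expert_peft:
    _target_: fusion_bench.modelpool.causal_lm.load_peft_causal_lm
    base_model_path: path/to/base/model
    peft_model_path: path/to/peft/model
    torch_dtype: bfloat16
    merge_and_unload: false
```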
## Configuration Examples

### Single Model Configuration

```yaml
_target_: fusion_bench.modelpool.CausalLMPool
_recursive_: false
# each model should have a name and a path, and the model is loaded from the path
# this is equivalent to `AutoModelForCausalLM.from_pretrained(path)`
models:
  _pretrained_:
    _target_: transformers.LlamaForCausalLM.from_pretrained
    pretrained_model_name_or_path: ${...base_model}
model_kwargs:
  torch_dtype: float16
tokenizer:
  _target_: transformers.AutoTokenizer.from_pretrained
  pretrained_model_name_or_path: ${..base_model}
base_model: decapoda-research/llama-7b-hf
```
### Multiple Models Configuration

Here we use models from MergeBench as an example.

**gemma-2-2b**

```yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-2b
  instruction: MergeBench/gemma-2-2b_instruction
  math: MergeBench/gemma-2-2b_math
  coding: MergeBench/gemma-2-2b_coding
  multilingual: MergeBench/gemma-2-2b_multilingual
  safety: MergeBench/gemma-2-2b_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-2b
```

**gemma-2-2b-it**

```yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-2b-it
  instruction: MergeBench/gemma-2-2b-it_instruction
  math: MergeBench/gemma-2-2b-it_math
  coding: MergeBench/gemma-2-2b-it_coding
  multilingual: MergeBench/gemma-2-2b-it_multilingual
  safety: MergeBench/gemma-2-2b-it_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-2b-it
```

**gemma-2-9b**

```yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-9b
  instruction: MergeBench/gemma-2-9b_instruction
  math: MergeBench/gemma-2-9b_math
  coding: MergeBench/gemma-2-9b_coding
  multilingual: MergeBench/gemma-2-9b_multilingual
  safety: MergeBench/gemma-2-9b_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-9b
```

**gemma-2-9b-it**

```yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: google/gemma-2-9b-it
  instruction: MergeBench/gemma-2-9b-it_instruction
  math: MergeBench/gemma-2-9b-it_math
  coding: MergeBench/gemma-2-9b-it_coding
  multilingual: MergeBench/gemma-2-9b-it_multilingual
  safety: MergeBench/gemma-2-9b-it_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: google/gemma-2-9b-it
```

**Llama-3.1-8B**

```yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: meta-llama/Llama-3.1-8B
  instruction: MergeBench/Llama-3.1-8B_instruction
  math: MergeBench/Llama-3.1-8B_math
  coding: MergeBench/Llama-3.1-8B_coding
  multilingual: MergeBench/Llama-3.1-8B_multilingual
  safety: MergeBench/Llama-3.1-8B_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: meta-llama/Llama-3.1-8B
```

**Llama-3.1-8B-Instruct**

```yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: meta-llama/Llama-3.1-8B-Instruct
  instruction: MergeBench/Llama-3.1-8B-Instruct_instruction
  math: MergeBench/Llama-3.1-8B-Instruct_math
  coding: MergeBench/Llama-3.1-8B-Instruct_coding
  multilingual: MergeBench/Llama-3.1-8B-Instruct_multilingual
  safety: MergeBench/Llama-3.1-8B-Instruct_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: meta-llama/Llama-3.1-8B-Instruct
```

**Llama-3.2-3B**

```yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: meta-llama/Llama-3.2-3B
  instruction: MergeBench/Llama-3.2-3B_instruction
  math: MergeBench/Llama-3.2-3B_math
  coding: MergeBench/Llama-3.2-3B_coding
  multilingual: MergeBench/Llama-3.2-3B_multilingual
  safety: MergeBench/Llama-3.2-3B_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: meta-llama/Llama-3.2-3B
```

**Llama-3.2-3B-Instruct**

```yaml
_target_: fusion_bench.modelpool.CausalLMPool
models:
  _pretrained_: meta-llama/Llama-3.2-3B-Instruct
  instruction: MergeBench/Llama-3.2-3B-Instruct_instruction
  math: MergeBench/Llama-3.2-3B-Instruct_math
  coding: MergeBench/Llama-3.2-3B-Instruct_coding
  multilingual: MergeBench/Llama-3.2-3B-Instruct_multilingual
  safety: MergeBench/Llama-3.2-3B-Instruct_safety
model_kwargs:
  torch_dtype: bfloat16
tokenizer: meta-llama/Llama-3.2-3B-Instruct
```
## Merge Large Language Models with FusionBench
Merge gemma-2b models with simple average:
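The method and model pool config names in this and the following commands are assumed to mirror the Dare-Ties example at the end of this section; adjust them if your local config layout differs:

```bash
fusion_bench method=simple_average modelpool=CausalLMPool/mergebench/gemma-2-2b
```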
Merge gemma-2b models with Task Arithmetic:
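```bash
# assumed config names; see the note above
fusion_bench method=task_arithmetic modelpool=CausalLMPool/mergebench/gemma-2-2b
```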
Merge Llama-3.1-8B models with Ties-Merging:
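```bash
# assumed config names; see the note above
fusion_bench method=ties_merging modelpool=CausalLMPool/mergebench/Llama-3.1-8B
```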
Merge Llama-3.1-8B-Instruct models with Dare-Ties, with 70% sparsity:
```bash
fusion_bench method=dare/ties_merging method.sparsity_ratio=0.7 modelpool=CausalLMPool/mergebench/Llama-3.1-8B-Instruct
```
## Special Features

### CausalLMBackbonePool

The `CausalLMBackbonePool` is a specialized version of `CausalLMPool` that returns only the transformer layers of the model. This is useful when you need to work with the model's backbone architecture directly.

```python
from fusion_bench.modelpool import CausalLMBackbonePool

backbone_pool = CausalLMBackbonePool.from_config(config)
layers = backbone_pool.load_model("model_a")  # Returns model.layers
```
## References

### CausalLMPool

Bases: `BaseModelPool`
#### load_model(model_name_or_config, *args, **kwargs)

Example of YAML config:

```yaml
models:
  _pretrained_: path_to_pretrained_model # if a plain string, it will be passed to AutoModelForCausalLM.from_pretrained
  model_a: path_to_model_a
  model_b: path_to_model_b
```

or equivalently,

```yaml
models:
  _pretrained_:
    _target_: transformers.AutoModelForCausalLM # any callable that returns a model
    pretrained_model_name_or_path: path_to_pretrained_model
  model_a:
    _target_: transformers.AutoModelForCausalLM
    pretrained_model_name_or_path: path_to_model_a
  model_b:
    _target_: transformers.AutoModelForCausalLM
    pretrained_model_name_or_path: path_to_model_b
```
#### load_tokenizer(*args, **kwargs)

Example of YAML config:

```yaml
tokenizer: google/gemma-2-2b-it # if a plain string, it will be passed to AutoTokenizer.from_pretrained
```

or equivalently,

```yaml
tokenizer:
  _target_: transformers.AutoTokenizer # any callable that returns a tokenizer
  pretrained_model_name_or_path: google/gemma-2-2b-it
```

Returns:

- `PreTrainedTokenizer` – The tokenizer.
#### save_model(model, path, push_to_hub=False, model_dtype=None, save_tokenizer=False, tokenizer_kwargs=None, **kwargs)

Save the model to the specified path.

Parameters:

- `model` (`PreTrainedModel`) – The model to be saved.
- `path` (`str`) – The path where the model will be saved.
- `push_to_hub` (`bool`, default: `False`) – Whether to push the model to the Hugging Face Hub. Defaults to False.
- `save_tokenizer` (`bool`, default: `False`) – Whether to save the tokenizer along with the model. Defaults to False.
- `**kwargs` – Additional keyword arguments passed to the `save_pretrained` method.