RegMean¶
RegMean (Regression Mean) is a dataless knowledge fusion approach that formulates model merging as a linear regression problem1. The algorithm aims to find optimal weights for each linear layer in the merged model by minimizing the discrepancy in predictions between the merge and candidate models.
Algorithm Overview¶
For a transformer layer \(l\), to obtain the merge weights for a linear layer \(W^{(l)}_{M}\), RegMean provides a precise closed-form solution for merging those from \(K\) candidate models:
where:
- \(W^{(l)}_i\) is the weight matrix of the \(i\)-th candidate model at layer \(l\)
- \(X^{(l)}_i\) represents the input activations to layer \(l\) for model \(i\)
- The formula computes a weighted combination that minimizes prediction discrepancy
Examples¶
CLI Usage¶
Configuration templates for RegMean:
_target_: fusion_bench.method.RegMeanAlgorithmForCLIP
# list, regular expression of names of parameters that need to be excluded
exclude_param_names_regex: []
# numbers of examples to compute regmean weights
num_regmean_examples: 256
weight_transpose: true
# float, reduce non-diagonal elements in regmean weights by multiplying this scalar
reduce_non_diagonal_ratio: 0.95
dataloader_kwargs:
batch_size: 32
num_workers: 0
_target_: fusion_bench.method.RegMeanAlgorithmForGPT2
# list, regular expression of names of parameters that need to be excluded
exclude_param_names_regex: []
# numbers of examples to compute regmean weights
num_regmean_examples: 256
# float, reduce non-diagonal elements in regmean weights by multiplying this scalar
reduce_non_diagonal_ratio: 0.6
weight_transpose: false
cache_dir: outputs
batch_size: 32
num_workers: 0
CLIP Models¶
Merge CLIP-ViT-B/32 models on eight image classification tasks:
fusion_bench method=regmean/clip_regmean \
modelpool=CLIPVisionModelPool/clip-vit-base-patch32_TA8 \
taskpool=CLIPVisionModelTaskPool/clip-vit-classification_TA8
Merge CLIP-ViT-L/14 models on eight image classification tasks:
fusion_bench \
method=regmean/clip_regmean \
method.dataloader_kwargs.batch_size=8 \
modelpool=CLIPVisionModelPool/clip-vit-large-patch14_TA8 \
taskpool=CLIPVisionModelTaskPool/clip-vit-classification_TA8 \
taskpool.base_model=openai/clip-vit-large-patch14
Language Models¶
Merge GPT-2 models for text classification tasks:
API Usage¶
To use RegMean programmatically:
CLIP Models¶
from fusion_bench.method.regmean import RegMeanAlgorithmForCLIP
from omegaconf import DictConfig
# Configuration for CLIP RegMean
config = DictConfig({
'exclude_param_names_regex': [],
'num_regmean_examples': 256,
'weight_transpose': True,
'reduce_non_diagonal_ratio': 0.95,
'dataloader_kwargs': {
'batch_size': 32,
'num_workers': 0
}
})
# Initialize the algorithm
algorithm = RegMeanAlgorithmForCLIP(**config)
# Run the algorithm with a model pool
merged_model = algorithm.run(modelpool)
GPT-2 Models¶
from fusion_bench.method.regmean import RegMeanAlgorithmForGPT2
from omegaconf import DictConfig
# Configuration for GPT-2 RegMean
config = DictConfig({
'exclude_param_names_regex': [],
'num_regmean_examples': 256,
'reduce_non_diagonal_ratio': 0.6,
'weight_transpose': False,
'cache_dir': 'outputs',
'batch_size': 32,
'num_workers': 0
})
# Initialize the algorithm
algorithm = RegMeanAlgorithmForGPT2(**config)
# Run the algorithm
merged_model = algorithm.run(modelpool)
Implementation Details¶
-
Xisen Jin, et al. "Dataless Knowledge Fusion by Merging Weights of Language Models." http://arxiv.org/abs/2212.09849 ↩