Utility Classes¶
Debugging Purpose¶
- DummyAlgorithm: A dummy algorithm for testing purposes.
DummyAlgorithm
¶
Bases: BaseAlgorithm
Source code in fusion_bench/method/dummy.py
run(modelpool)
¶
This method returns the pretrained model from the model pool. If the pretrained model is not available, it returns the first model from the model pool.
Parameters:
- modelpool (BaseModelPool) – The pool of models to fuse.
Raises:
- AssertionError – If the model is not found in the model pool.
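The selection rule described above can be sketched in plain Python. This is only an illustration, not the library's code: the pool is modeled as an ordinary dict, and the `_pretrained_` key and the `pick_model` helper are assumptions made for this sketch.

```python
from typing import Any, Dict

# Assumption for this sketch: the pretrained model is stored under this key.
PRETRAINED_KEY = "_pretrained_"


def pick_model(pool: Dict[str, Any]) -> Any:
    """Return the pretrained model if present, else the first model in the pool."""
    if PRETRAINED_KEY in pool:
        model = pool[PRETRAINED_KEY]
    else:
        # Fall back to the first entry; dicts preserve insertion order.
        model = next(iter(pool.values()), None)
    # Mirrors the documented AssertionError when no model can be found.
    assert model is not None, "No model found in the model pool."
    return model


with_base = {"_pretrained_": "base", "task_a": "ft_a"}
without_base = {"task_a": "ft_a", "task_b": "ft_b"}
print(pick_model(with_base))
print(pick_model(without_base))
```

Strings stand in for models here, which keeps the fallback order easy to see.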
Source code in fusion_bench/method/dummy.py
Analysis Purpose¶
- TaskVectorCosSimilarity: Computes the cosine similarity between task vectors.
- TaskVectorViolinPlot: Generates a violin plot for task vector distributions.
TaskVectorCosSimilarity
¶
Bases: LightningFabricMixin, BaseAlgorithm
Computes and analyzes cosine similarity between task vectors of models in a model pool.
This algorithm extracts task vectors from fine-tuned models by computing the difference between their parameters and a pretrained base model. It then calculates the pairwise cosine similarity between all task vectors to understand the relationships and overlap between different tasks.
The task vector for a model is defined as
task_vector = finetuned_model_params - pretrained_model_params
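As a minimal sketch of this definition, the snippet below subtracts parameters name by name and concatenates the differences into one flat vector. It uses plain Python lists in place of tensors, and the `task_vector` helper and the toy state dictionaries are hypothetical; the class itself operates on model parameters.

```python
from typing import Dict, List

StateDict = Dict[str, List[float]]  # parameter name -> flattened values


def task_vector(pretrained: StateDict, finetuned: StateDict) -> List[float]:
    """Concatenate (finetuned - pretrained) over all parameter names."""
    vector: List[float] = []
    # Iterate in the pretrained model's key order so that vectors from
    # different fine-tuned models line up element by element.
    for name, base in pretrained.items():
        tuned = finetuned[name]
        vector.extend(t - b for t, b in zip(tuned, base))
    return vector


pretrained = {"w": [1.0, 2.0], "b": [0.5]}
finetuned = {"w": [1.5, 1.0], "b": [0.5]}
print(task_vector(pretrained, finetuned))  # [0.5, -1.0, 0.0]
```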
Parameters:
- plot_heatmap (bool) – Whether to generate and save a heatmap visualization.
- trainable_only (bool, default: True) – If True, only consider trainable parameters when computing task vectors.
- max_points_per_model (int, default: None) – Maximum number of parameters to sample per model for memory efficiency. If None, uses all parameters.
- output_path (str, default: None) – Directory to save outputs. If None, uses the fabric logger directory.
Outputs
- task_vector_cos_similarity.csv: Pairwise cosine similarity matrix
- task_vector_cos_similarity.pdf: Heatmap visualization (if plot_heatmap=True)
Returns:
- The pretrained model from the model pool.
Example
Source code in fusion_bench/method/analysis/task_vector_cos_similarity.py
get_state_dict(model)
¶
Extract the state dictionary from a model.
Parameters:
- model (Module) – The model to extract parameters from.
Returns:
- Dict[str, torch.Tensor] – State dictionary containing model parameters. Returns only trainable parameters if trainable_only=True, otherwise returns all parameters.
Source code in fusion_bench/method/analysis/task_vector_cos_similarity.py
get_task_vector(pretrained_model, finetuned_model)
¶
Compute the task vector for a fine-tuned model.
The task vector represents the parameter changes from pretraining to fine-tuning and is computed as: task_vector = finetuned_params - pretrained_params
Parameters:
- pretrained_model (Module) – The base pretrained model.
- finetuned_model (Module) – The fine-tuned model for a specific task.
Returns:
- Tensor – Flattened task vector containing parameter differences. If max_points_per_model is set, the vector may be downsampled.
Note
- Converts parameters to float64 for numerical precision
- Supports optional downsampling for memory efficiency
- Uses only trainable parameters if trainable_only=True
Source code in fusion_bench/method/analysis/task_vector_cos_similarity.py
run(modelpool)
¶
Execute the task vector cosine similarity analysis.
This method:
1. Loads the pretrained base model from the model pool
2. Computes task vectors for each fine-tuned model
3. Calculates pairwise cosine similarities between all task vectors
4. Saves the similarity matrix as a CSV file
5. Optionally generates and saves a heatmap visualization
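The similarity and CSV steps can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not the method's implementation: the helper names, the toy task vectors, and the CSV layout (model names as both header row and first column) are all made up for the example.

```python
import csv
import io
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """cos(a, b) = <a, b> / (||a|| * ||b||); undefined for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: Dict[str, List[float]]) -> List[List[float]]:
    """Pairwise cosine similarity between every pair of task vectors."""
    names = list(vectors)
    return [[cosine_similarity(vectors[r], vectors[c]) for c in names] for r in names]


def to_csv(vectors: Dict[str, List[float]]) -> str:
    """Render the matrix as CSV text (layout is an assumption of this sketch)."""
    names = list(vectors)
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([""] + names)
    for name, row in zip(names, similarity_matrix(vectors)):
        writer.writerow([name] + [f"{value:.4f}" for value in row])
    return buffer.getvalue()


task_vectors = {"task_a": [1.0, 0.0], "task_b": [0.0, 2.0], "task_c": [3.0, 0.0]}
matrix = similarity_matrix(task_vectors)
print(to_csv(task_vectors))
```

In this toy input, task_a and task_c point in the same direction and task_b is orthogonal to both, so the matrix shows the two extremes a heatmap would make visible.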
Parameters:
- modelpool (BaseModelPool) – Pool containing pretrained and fine-tuned models.
Returns:
- nn.Module – The pretrained model from the model pool.
Source code in fusion_bench/method/analysis/task_vector_cos_similarity.py
TaskVectorViolinPlot
¶
Bases: LightningFabricMixin, SimpleProfilerMixin, BaseAlgorithm
Creates violin plots to visualize the distribution of task vector values across models.
This class implements the task vector visualization technique described in: "Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging" by L. Shen, A. Tang, E. Yang et al. (https://arxiv.org/abs/2410.21804)
Task vectors represent the parameter differences between fine-tuned models and their pretrained base model, computed as: task_vector = finetuned_params - pretrained_params
The algorithm generates two types of violin plots:
1. Distribution of raw task vector values (positive and negative)
2. Distribution of absolute task vector values
Parameters:
- trainable_only (bool) – If True, only consider trainable parameters when computing task vectors. If False, use all parameters.
- max_points_per_model (int, default: 1000) – Maximum number of parameters to sample per model for memory efficiency. If None or 0, uses all parameters.
- fig_kwargs (dict) – Dictionary of keyword arguments to pass to matplotlib.pyplot.subplots. Common options include figsize (tuple of (width, height) in inches), dpi (dots per inch for resolution), and facecolor (figure background color). Defaults to None.
- output_path (str, default: None) – Directory to save the violin plots. If None, uses the fabric logger's log directory.
Outputs
- task_vector_violin.pdf: Violin plot of raw task vector value distributions
- task_vector_violin_abs.pdf: Violin plot of absolute task vector value distributions
Returns:
- The pretrained model from the model pool.
Example
Note
This visualization is particularly useful for understanding:
- How different tasks affect model parameters
- The magnitude and distribution of parameter changes
- Similarities and differences between task adaptations
Source code in fusion_bench/method/analysis/task_vector_violin_plot.py
__init__(trainable_only, max_points_per_model=1000, fig_kwawrgs=None, output_path=None, **kwargs)
¶
Initialize the TaskVectorViolinPlot analyzer.
Parameters:
- trainable_only (bool) – Whether to consider only trainable parameters when computing task vectors. Set to True to focus on learnable parameters, or False to include all parameters, including frozen ones.
- max_points_per_model (int, default: 1000) – Maximum number of parameter values to sample per model for visualization. Useful for large models to manage memory usage and plot clarity. Set to None or 0 to use all parameters.
- fig_kwargs (dict) – Keyword arguments passed to matplotlib's subplots function for plot customization, for example {'figsize': (10, 6)} for plot dimensions, {'dpi': 300} for high resolution, or {'facecolor': 'white'} for background color. Defaults to None (uses matplotlib defaults).
- output_path (str, default: None) – Directory path where violin plots will be saved. If None, uses the fabric logger's log directory. The directory will be created if it doesn't exist.
- **kwargs – Additional keyword arguments passed to parent classes.
Note
The signature spells this parameter 'fig_kwawrgs', which appears to be a typo for 'fig_kwargs'. The parameter name should be corrected for consistency.
Source code in fusion_bench/method/analysis/task_vector_violin_plot.py
get_state_dict(model)
¶
Extract the state dictionary from a model based on parameter filtering settings.
Parameters:
- model (Module) – The PyTorch model to extract parameters from.
Returns:
- Dict[str, torch.Tensor] – State dictionary containing model parameters. If trainable_only=True, returns only parameters with requires_grad=True. If trainable_only=False, returns all parameters, including frozen ones.
Note
This method respects the trainable_only configuration to focus analysis on either learnable parameters or the complete parameter set depending on the research question being addressed.
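The filtering rule can be illustrated without PyTorch. In the sketch below, `Param` is a hypothetical stand-in for a model parameter that carries a `requires_grad` flag, and the standalone `get_state_dict` function only mimics the documented behaviour; it is not the method's source.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Param:
    """Hypothetical stand-in for a model parameter."""
    values: List[float]
    requires_grad: bool = True


def get_state_dict(params: Dict[str, Param], trainable_only: bool) -> Dict[str, List[float]]:
    """Keep every parameter, or only those with requires_grad=True."""
    return {
        name: param.values
        for name, param in params.items()
        if param.requires_grad or not trainable_only
    }


model = {
    "encoder.weight": Param([0.1, 0.2], requires_grad=False),  # frozen
    "head.weight": Param([0.3]),                               # trainable
}
print(sorted(get_state_dict(model, trainable_only=True)))   # ['head.weight']
print(sorted(get_state_dict(model, trainable_only=False)))  # ['encoder.weight', 'head.weight']
```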
Source code in fusion_bench/method/analysis/task_vector_violin_plot.py
get_task_vector(pretrained_model, finetuned_model)
¶
Compute the task vector representing parameter changes from pretraining to fine-tuning.
The task vector quantifies how model parameters have changed during task-specific fine-tuning and is computed as: task_vector = finetuned_params - pretrained_params
Parameters:
- pretrained_model (Module) – The base pretrained model.
- finetuned_model (Module) – The fine-tuned model for a specific task.
Returns:
- np.ndarray – Flattened numpy array containing parameter differences. If max_points_per_model is set, the array may be randomly downsampled for memory efficiency and visualization clarity.
Processing Steps
- Extract state dictionaries from both models
- Compute parameter differences (subtraction)
- Flatten to 1D vector
- Convert to numpy array with float32 precision
- Optionally downsample if max_points_per_model is specified
Note
- Uses only trainable parameters if trainable_only=True
- Downsampling uses random sampling without replacement
- Preserves the relative distribution of parameter changes
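The downsampling step, random sampling without replacement, can be sketched with the standard library. The `downsample` helper and its `seed` argument are hypothetical and exist only to make the example reproducible; the method itself works on numpy arrays.

```python
import random
from typing import List, Optional


def downsample(vector: List[float], max_points: Optional[int],
               seed: Optional[int] = None) -> List[float]:
    """Randomly keep at most max_points values, without replacement."""
    # None or 0 means "use all parameters", as documented above.
    if not max_points or len(vector) <= max_points:
        return list(vector)
    rng = random.Random(seed)
    # random.sample draws distinct positions, so no value is picked twice.
    return rng.sample(vector, max_points)


task_vector = [float(i) for i in range(10_000)]
sampled = downsample(task_vector, max_points=1000, seed=0)
print(len(sampled))
print(len(downsample(task_vector, max_points=None)))
```

Because every element is equally likely to be kept, the sample approximates the shape of the full distribution, which is what the violin plot needs.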
Source code in fusion_bench/method/analysis/task_vector_violin_plot.py
run(modelpool)
¶
Execute the task vector violin plot analysis and visualization.
This method implements the core algorithm that:
1. Loads the pretrained base model from the model pool
2. Computes task vectors for each fine-tuned model (parameter differences)
3. Creates two violin plots showing the distribution of task vector values:
   - Raw values plot: Shows positive and negative parameter changes
   - Absolute values plot: Shows magnitude of parameter changes
4. Saves both plots as PDF files in the output directory
The visualization technique follows the approach described in: "Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging"
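To make the difference between the two plots concrete, the sketch below prepares the two sets of values that would feed them: the raw task vector values and their absolute values. It stops short of plotting, and the `plot_inputs` helper and the toy task vectors are assumptions of this example rather than part of the class.

```python
import statistics
from typing import Dict, List, Tuple

Vectors = Dict[str, List[float]]


def plot_inputs(task_vectors: Vectors) -> Tuple[Vectors, Vectors]:
    """Return (raw values, absolute values) per model, one entry per violin."""
    raw = {name: list(values) for name, values in task_vectors.items()}
    absolute = {name: [abs(v) for v in values] for name, values in task_vectors.items()}
    return raw, absolute


task_vectors = {"task_a": [0.5, -1.0, 0.0], "task_b": [-0.25, 0.25, 2.0]}
raw, absolute = plot_inputs(task_vectors)
for name in task_vectors:
    # Raw means can cancel out around zero; absolute means reflect magnitude.
    print(name, statistics.mean(raw[name]), statistics.mean(absolute[name]))
```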
Parameters:
- modelpool (BaseModelPool) – Pool containing both a pretrained model and fine-tuned models. Must have has_pretrained=True.
Returns:
- nn.Module – The pretrained model loaded from the model pool.
Raises:
- AssertionError – If the model pool doesn't contain a pretrained model.
Side Effects
- Creates output directory if it doesn't exist
- Saves 'task_vector_violin.pdf' (raw values distribution)
- Saves 'task_vector_violin_abs.pdf' (absolute values distribution)
- Prints progress information during task vector computation
Example Output Files
- task_vector_violin.pdf: Shows how parameters change (+ and -)
- task_vector_violin_abs.pdf: Shows magnitude of parameter changes
Source code in fusion_bench/method/analysis/task_vector_violin_plot.py