CLIP Task Arithmetic
This tutorial demonstrates how to merge CLIP (Contrastive Language-Image Pre-training) models using the Task Arithmetic algorithm [1], a model fusion technique that combines multiple task-specific models by scaling and adding their "task vectors".
Task Arithmetic operates on task vectors: the parameter-space differences between each fine-tuned model and the shared pretrained base model. Because the summed task vectors are multiplied by a configurable scaling factor, the method offers finer control over the fusion than simple weight averaging.
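Mathematically, Task Arithmetic can be expressed as:
Step 1: Compute Task Vectors
\[ \tau_i = \theta_i - \theta_0, \quad i = 1, \dots, N \]
Step 2: Scale and Combine Task Vectors
\[ \theta_{merged} = \theta_0 + \lambda \sum_{i=1}^{N} \tau_i \]
where: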
- \( \theta_{merged} \) are the parameters of the final merged model
- \( \theta_0 \) are the parameters of the pretrained base model
- \( \theta_i \) are the parameters of the \( i \)-th fine-tuned model
- \( \tau_i \) is the task vector of the \( i \)-th model (its learned adaptation)
- \( \lambda \) is the scaling factor that controls the strength of task vector influence
- \( N \) is the number of task-specific models
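The two steps can be sketched in a few lines of plain Python. This is a toy illustration only: models are represented as dictionaries of float lists rather than real CLIP weights, and the names `task_vector` and `task_arithmetic` are hypothetical helpers, not part of any library API.

```python
from typing import Dict, List

# Toy stand-in for a model's parameters: name -> flat list of floats.
StateDict = Dict[str, List[float]]


def task_vector(finetuned: StateDict, pretrained: StateDict) -> StateDict:
    """Step 1: tau_i = theta_i - theta_0, computed per parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def task_arithmetic(
    pretrained: StateDict,
    finetuned_models: List[StateDict],
    scaling_factor: float,
) -> StateDict:
    """Step 2: theta_merged = theta_0 + lambda * sum_i tau_i."""
    vectors = [task_vector(model, pretrained) for model in finetuned_models]
    merged = {}
    for name, base in pretrained.items():
        # Sum the task vectors element-wise across all models.
        summed = [sum(v[name][k] for v in vectors) for k in range(len(base))]
        merged[name] = [b + scaling_factor * s for b, s in zip(base, summed)]
    return merged


theta_0 = {"w": [1.0, 2.0]}
theta_1 = {"w": [2.0, 2.0]}  # tau_1 = [1, 0]
theta_2 = {"w": [1.0, 4.0]}  # tau_2 = [0, 2]

merged = task_arithmetic(theta_0, [theta_1, theta_2], scaling_factor=0.5)
print(merged)  # {'w': [1.5, 3.0]}
```

With \( \lambda = 0.5 \), each coordinate moves halfway along the summed task vectors, which is why the merged weights land between the base model and the fine-tuned models.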
Standalone YAML Configuration
The example uses the following configuration, which merges CLIP models with task arithmetic and evaluates the result on image classification datasets:
- Program Configuration: Specifies FabricModelFusionProgram to handle the fusion workflow.
- Method Configuration: Uses TaskArithmeticAlgorithm with a scaling factor whose default value is 0.7. The option names in the configuration file are the same as those in the code.
  TaskArithmeticAlgorithm.__init__()
  Initializes the TaskArithmeticAlgorithm with the given scaling factor.
  Parameters:
  - scaling_factor (int): The factor by which the task vectors will be scaled before merging.
  Source code in fusion_bench/method/task_arithmetic/task_arithmetic.py
- Model Pool: Contains the base pretrained model and the fine-tuned variants.
- Task Pool: Defines the evaluation datasets used for performance assessment.
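To illustrate what "the option names in the configuration file are the same as those in the code" means in practice, here is a minimal sketch. `ToyTaskArithmetic` and `instantiate` are hypothetical stand-ins written for this example; they are not the fusion_bench classes or its configuration loader. The only details taken from the text above are the parameter name `scaling_factor` and its default of 0.7.

```python
class ToyTaskArithmetic:
    """Hypothetical stand-in for a method class; NOT the fusion_bench class."""

    def __init__(self, scaling_factor: float = 0.7):
        self.scaling_factor = scaling_factor


def instantiate(config: dict) -> ToyTaskArithmetic:
    """Pass every config key straight through as a keyword argument."""
    return ToyTaskArithmetic(**config)


# A config whose key matches the constructor parameter name works.
method = instantiate({"scaling_factor": 0.5})
print(method.scaling_factor)

# An empty config falls back to the default value.
print(instantiate({}).scaling_factor)

# A key that does not match any constructor parameter is rejected by Python.
try:
    instantiate({"scale_factor": 0.5})
except TypeError as err:
    print("rejected:", err)
```

Because keys are forwarded as keyword arguments in this sketch, a misspelled option fails loudly instead of being silently ignored.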
Running the Example
Execute the task arithmetic fusion with the following command:
fusion_bench --config-path $PWD/config/_get_started --config-name clip_task_arithmetic
Hyperparameter Tuning
You can experiment with different scaling factors by overriding the configuration:
# More conservative fusion (less task-specific influence)
fusion_bench --config-path $PWD/config/_get_started --config-name clip_task_arithmetic \
    method.scaling_factor=0.5
# More aggressive fusion (stronger task-specific influence)
fusion_bench --config-path $PWD/config/_get_started --config-name clip_task_arithmetic \
    method.scaling_factor=1.0
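When trying several values, it can help to generate the commands programmatically. The sketch below only builds and prints the command lines; it does not run them. It assumes the override key matches the constructor parameter `scaling_factor`, and `build_command` is a hypothetical helper written for this example.

```python
import os
import shlex
from typing import List


def build_command(scaling_factor: float, config_dir: str) -> List[str]:
    """Build the fusion_bench argument list for one scaling factor."""
    return [
        "fusion_bench",
        "--config-path", config_dir,
        "--config-name", "clip_task_arithmetic",
        # Assumption: the override key equals the __init__ parameter name.
        f"method.scaling_factor={scaling_factor}",
    ]


# Absolute path, mirroring the use of $PWD in the shell commands.
config_dir = os.path.abspath(os.path.join("config", "_get_started"))

commands = [build_command(s, config_dir) for s in (0.3, 0.5, 0.7, 1.0)]
for cmd in commands:
    print(shlex.join(cmd))
```

To actually launch a run, each argument list could be passed to `subprocess.run(cmd, check=True)` from the repository root.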
Debugging Configuration (VS Code)
{
    "name": "clip_task_arithmetic",
    "type": "debugpy",
    "request": "launch",
    "module": "fusion_bench.scripts.cli",
    "args": [
        "--config-path",
        "${workspaceFolder}/config/_get_started",
        "--config-name",
        "clip_task_arithmetic"
    ],
    "console": "integratedTerminal",
    "justMyCode": true,
    "env": {
        "HYDRA_FULL_ERROR": "1"
    }
}
[1] G. Ilharco et al., "Editing Models with Task Arithmetic," Mar. 31, 2023, arXiv:2212.04089. doi: 10.48550/arXiv.2212.04089.