CLIP Task Arithmetic

This tutorial demonstrates how to merge CLIP (Contrastive Language-Image Pre-training) models using the Task Arithmetic algorithm [1], a model fusion technique that combines multiple task-specific models by adding their scaled "task vectors" to a shared pretrained model.

Task Arithmetic operates on task vectors: the parameter-wise differences between a fine-tuned model and its pretrained base model. Because the summed task vectors are scaled before being added back to the base model, this approach gives finer control over the fusion process than simple averaging.

Mathematically, Task Arithmetic can be expressed as:

Step 1: Compute Task Vectors

\[ \tau_i = \theta_i - \theta_0 \]

Step 2: Scale and Combine Task Vectors

\[ \theta_{merged} = \theta_0 + \lambda \sum_{i=1}^{N} \tau_i \]

where:

  • \( \theta_{merged} \) denotes the parameters of the merged model
  • \( \theta_0 \) denotes the parameters of the pretrained base model
  • \( \theta_i \) denotes the parameters of the \( i \)-th fine-tuned model
  • \( \tau_i \) is the task vector of the \( i \)-th model (its learned adaptation)
  • \( \lambda \) is the scaling factor that controls the strength of task vector influence
  • \( N \) is the number of task-specific models
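
The two steps above can be sketched in a few lines of plain Python. This is only an illustrative toy, not the FusionBench implementation: the helper names `task_vector` and `task_arithmetic` are hypothetical, and model parameters are represented as dictionaries of float lists rather than real tensors.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def task_vector(finetuned: StateDict, pretrained: StateDict) -> StateDict:
    """Step 1: tau_i = theta_i - theta_0, computed per parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def task_arithmetic(
    pretrained: StateDict,
    finetuned_models: List[StateDict],
    scaling_factor: float,
) -> StateDict:
    """Step 2: theta_merged = theta_0 + lambda * sum_i tau_i."""
    # Start from a copy of the pretrained parameters.
    merged = {name: list(values) for name, values in pretrained.items()}
    for finetuned in finetuned_models:
        tau = task_vector(finetuned, pretrained)
        for name, delta in tau.items():
            merged[name] = [
                m + scaling_factor * d for m, d in zip(merged[name], delta)
            ]
    return merged


theta_0 = {"w": [1.0, 2.0]}
theta_1 = {"w": [2.0, 2.0]}  # tau_1 = [1.0, 0.0]
theta_2 = {"w": [1.0, 4.0]}  # tau_2 = [0.0, 2.0]

merged = task_arithmetic(theta_0, [theta_1, theta_2], scaling_factor=0.5)
print(merged)  # {'w': [1.5, 3.0]}
```

With \( \lambda = 0.5 \), each fine-tuned model contributes half of its offset from the base model; setting \( \lambda = 0 \) would return the pretrained parameters unchanged.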

🔧 Standalone YAML Configuration

The example uses the following configuration, which merges two fine-tuned CLIP models (SUN397 and Stanford Cars) with task arithmetic and evaluates the merged model on the corresponding image classification test sets:

config/_get_started/clip_task_arithmetic.yaml
_target_: fusion_bench.programs.FabricModelFusionProgram
_recursive_: false
method:
  _target_: fusion_bench.method.TaskArithmeticAlgorithm
  scaling_factor: 0.7
modelpool:
  _target_: fusion_bench.modelpool.CLIPVisionModelPool
  models:
    _pretrained_: openai/clip-vit-base-patch32
    sun397: tanganke/clip-vit-base-patch32_sun397
    stanford-cars: tanganke/clip-vit-base-patch32_stanford-cars
taskpool:
  _target_: fusion_bench.taskpool.CLIPVisionModelTaskPool
  test_datasets:
    sun397:
      _target_: datasets.load_dataset
      path: tanganke/sun397
      split: test
    stanford-cars:
      _target_: datasets.load_dataset
      path: tanganke/stanford_cars
      split: test
  clip_model: openai/clip-vit-base-patch32
  processor: openai/clip-vit-base-patch32
  1. Program Configuration: Specifies FabricModelFusionProgram to handle the fusion workflow
  2. Method Configuration: Uses TaskArithmeticAlgorithm with scaling_factor set to 0.7 in this example. The option names in the configuration file are the same as the argument names in the code.

    TaskArithmeticAlgorithm.__init__()

    Initializes the TaskArithmeticAlgorithm with the given scaling factor.

    Parameters:

    • scaling_factor (int) –

      The factor by which the task vectors will be scaled before merging.

    Source code in fusion_bench/method/task_arithmetic/task_arithmetic.py
    def __init__(self, scaling_factor: int, **kwargs):
        """
        Initializes the TaskArithmeticAlgorithm with the given scaling factor.
    
        Args:
            scaling_factor (int): The factor by which the task vectors will be scaled before merging.
        """
        super().__init__(**kwargs)
    
  3. Model Pool: Contains the base pretrained model and fine-tuned variants

  4. Task Pool: Defines evaluation datasets for performance assessment

🚀 Running the Example

Execute the task arithmetic fusion with the following command:

fusion_bench --config-path $PWD/config/_get_started --config-name clip_task_arithmetic

Hyperparameter Tuning

You can experiment with different scaling factors by overriding the configuration:

# More conservative fusion (less task-specific influence)
fusion_bench --config-path $PWD/config/_get_started --config-name clip_task_arithmetic \
    method.scaling_factor=0.5

# More aggressive fusion (stronger task-specific influence)
fusion_bench --config-path $PWD/config/_get_started --config-name clip_task_arithmetic \
    method.scaling_factor=1.0
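
To try several scaling factors in a row, the commands can be generated programmatically. The following is a minimal sketch: `build_sweep_commands` is a hypothetical helper (not part of FusionBench), and it assumes the `method.scaling_factor` key from the YAML configuration above and a `config/_get_started` directory under the current working directory.

```python
import os
import shlex
from typing import List


def build_sweep_commands(
    scaling_factors: List[float],
    config_name: str = "clip_task_arithmetic",
) -> List[List[str]]:
    """Build one fusion_bench command (as an argument list) per scaling factor."""
    # Mirrors `$PWD/config/_get_started` from the shell commands above.
    config_path = os.path.join(os.getcwd(), "config", "_get_started")
    commands = []
    for factor in scaling_factors:
        commands.append(
            [
                "fusion_bench",
                "--config-path",
                config_path,
                "--config-name",
                config_name,
                # Override of the key defined under `method:` in the YAML file.
                f"method.scaling_factor={factor}",
            ]
        )
    return commands


commands = build_sweep_commands([0.3, 0.5, 0.7, 1.0])
for cmd in commands:
    # Print each command so it can be inspected or pasted into a shell.
    print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` to execute the runs one after another.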

πŸ› Debugging Configuration (VS Code)

.vscode/launch.json
{
    "name": "clip_task_arithmetic",
    "type": "debugpy",
    "request": "launch",
    "module": "fusion_bench.scripts.cli",
    "args": [
        "--config-path",
        "${workspaceFolder}/config/_get_started",
        "--config-name",
        "clip_task_arithmetic"
    ],
    "console": "integratedTerminal",
    "justMyCode": true,
    "env": {
        "HYDRA_FULL_ERROR": "1"
    }
}

  1. G. Ilharco et al., "Editing Models with Task Arithmetic," Mar. 31, 2023, arXiv:2212.04089. doi: 10.48550/arXiv.2212.04089.