# Parallel CLIP Ensemble
This tutorial demonstrates how to create and evaluate a parallel ensemble of CLIP (Contrastive Language-Image Pre-training) models using device mapping for efficient multi-GPU inference. Unlike model fusion techniques that merge parameters, ensemble methods maintain separate models and aggregate their predictions at inference time.
The ensemble approach averages predictions from multiple fine-tuned models:
\[
y_{ensemble} = \frac{1}{N} \sum_{i=1}^{N} f_i(x)
\]
where \( y_{ensemble} \) is the ensemble prediction, \( N \) is the number of models, and \( f_i(x) \) is the prediction from the \( i \)-th model.
## 🚀 Key Features
- Parallel Execution: Models run simultaneously on different GPUs using `torch.jit.fork`
- Device Mapping: Distribute models across multiple devices for memory efficiency
- Automatic Synchronization: Outputs are automatically moved to the same device for aggregation
## 🔧 Python Implementation
Here's an example demonstrating parallel ensemble evaluation:
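The snippet below is a minimal sketch of the idea, using Hugging Face's `CLIPModel` as a stand-in for the FusionBench model pool; the checkpoint paths and device assignments are placeholder assumptions, not the exact FusionBench API. Each fine-tuned model is placed on its own GPU, `torch.jit.fork` launches the forward passes as asynchronous tasks, and the per-model logits are moved to a common device before being averaged.

```python
import torch
from transformers import CLIPModel, CLIPProcessor

# Hypothetical fine-tuned checkpoints; substitute your own paths or Hub IDs.
checkpoints = [
    "path/to/clip-finetuned-on-task-a",
    "path/to/clip-finetuned-on-task-b",
]
# Map each model index to its own GPU; outputs are gathered on one device.
device_map = {0: "cuda:0", 1: "cuda:1"}
aggregation_device = "cuda:0"

# Load each model onto its assigned device.
models = [
    CLIPModel.from_pretrained(ckpt).to(device_map[i]).eval()
    for i, ckpt in enumerate(checkpoints)
]
processor = CLIPProcessor.from_pretrained(checkpoints[0])


@torch.no_grad()
def ensemble_predict(images, texts):
    """Average image-text logits over all models in the ensemble."""
    inputs = processor(text=texts, images=images, return_tensors="pt", padding=True)

    def run_one(model, device):
        # Forward pass of a single model on its own device.
        batch = {k: v.to(device) for k, v in inputs.items()}
        return model(**batch).logits_per_image

    # Launch one asynchronous task per model with torch.jit.fork,
    # then wait for all tasks and collect their outputs.
    futures = [
        torch.jit.fork(run_one, model, device_map[i])
        for i, model in enumerate(models)
    ]
    # Synchronize: move every result to the aggregation device.
    logits = [torch.jit.wait(future).to(aggregation_device) for future in futures]

    # y_ensemble = (1/N) * sum_i f_i(x)
    return torch.stack(logits, dim=0).mean(dim=0)
```

Calling `ensemble_predict(images, ["a photo of a cat", "a photo of a dog"])` returns the averaged image-text logits, corresponding to \( y_{ensemble} \) in the equation above.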
## 🔧 YAML Configuration
Alternatively, you can use the provided ensemble method configuration:
`config/method/ensemble/simple_ensemble.yaml`:

```yaml
_target_: fusion_bench.method.SimpleEnsembleAlgorithm
device_map: null # Set to null for single device, or specify mapping
```
## 🚀 Running the Example
Execute the parallel ensemble evaluation:
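A possible invocation using FusionBench's Hydra-style CLI is sketched below; the model pool and task pool names are placeholders that depend on how your fine-tuned CLIP checkpoints and evaluation tasks are configured.

```bash
# Hypothetical: select the ensemble method config shown above and supply your
# own model pool / task pool configurations.
fusion_bench \
    method=ensemble/simple_ensemble \
    modelpool=<your_clip_modelpool> \
    taskpool=<your_clip_taskpool>
```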