Assess Model Performance During Algorithm Execution
This tutorial demonstrates how to evaluate model performance during merging using FusionBench's TaskPool system. You'll learn how to integrate evaluation at different stages of your algorithm, monitor performance as models are merged step by step, and save intermediate results for later analysis.
TaskPools in FusionBench manage evaluation datasets and provide a standardized interface for assessing model performance. The most commonly used is `CLIPVisionModelTaskPool`, which evaluates CLIP vision models:
```python
from copy import deepcopy

from fusion_bench.taskpool import CLIPVisionModelTaskPool

# Access the taskpool from the program context
taskpool = self._program.taskpool  # Available in algorithm classes
```
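Once you have a taskpool instance, evaluation is a single call. The sketch below assumes you already have a merged CLIP vision model (`merged_model` is a placeholder name); `taskpool.evaluate` returns a report of per-task metrics whose exact structure depends on the taskpool.

```python
# Minimal usage sketch: evaluate a model against the taskpool's test datasets.
# `merged_model` is a placeholder for a CLIP vision model produced by your algorithm.
report = taskpool.evaluate(merged_model)

# The report is a nested mapping of per-task metrics
# (the exact keys depend on the taskpool configuration).
for task_name, metrics in report.items():
    print(task_name, metrics)
```

The complete example below integrates this evaluation call into a custom merging algorithm that evaluates the merged model after every merge step and saves each report to disk: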
```python
import torch
from copy import deepcopy
from pathlib import Path

from fusion_bench import BaseAlgorithm
from fusion_bench.taskpool import CLIPVisionModelTaskPool
from fusion_bench.utils.json import save_to_json


class EvaluatingMergingAlgorithm(BaseAlgorithm):
    def __init__(self, evaluate_on_every_step: bool = True, **kwargs):
        super().__init__(**kwargs)
        self.evaluate_on_every_step = evaluate_on_every_step

    @torch.no_grad()
    def run(self, modelpool):
        # Access the taskpool from the program
        taskpool = self._program.taskpool

        # Store original test datasets for restoration
        original_test_datasets = deepcopy(taskpool._test_datasets)

        model_names = modelpool.model_names
        merged_model = modelpool.load_model(model_names[0])

        # Evaluate initial model
        if self.evaluate_on_every_step:
            report = self._evaluate_model(taskpool, merged_model, model_names[0], step=0)

        # Iterative merging with evaluation
        for step, model_name in enumerate(model_names[1:], 1):
            # Load and merge next model
            next_model = modelpool.load_model(model_name)
            merged_model = self._merge_models(merged_model, next_model)

            # Evaluate merged model
            if self.evaluate_on_every_step:
                # Update taskpool to include models merged so far
                current_models = model_names[: step + 1]
                report = self._evaluate_model(taskpool, merged_model, current_models, step)

        # Restore original taskpool state
        taskpool._test_datasets = original_test_datasets
        taskpool._is_setup = False

        return merged_model

    def _evaluate_model(self, taskpool, model, model_names, step):
        """Evaluate model and save results."""
        # Reset taskpool setup to reconfigure with new datasets
        taskpool._is_setup = False

        # Configure taskpool for current set of models
        if isinstance(model_names, list):
            # Multiple models - evaluate on their respective datasets
            current_datasets = {
                name: taskpool._test_datasets[name]
                for name in model_names
                if name in taskpool._test_datasets
            }
        else:
            # Single model
            current_datasets = {model_names: taskpool._test_datasets[model_names]}

        # Update taskpool configuration
        from omegaconf import DictConfig

        taskpool._test_datasets = DictConfig(current_datasets)

        # Run evaluation
        report = taskpool.evaluate(deepcopy(model))

        # Save results
        if hasattr(self, "log_dir") and self.log_dir:
            save_path = Path(self.log_dir) / f"report_{step}.json"
            save_to_json(report, save_path)

        return report

    def _merge_models(self, model1, model2):
        """Implement your merging logic here."""
        # This is a placeholder - implement your actual merging algorithm
        pass
```
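The `_merge_models` method above is deliberately left as a placeholder. Purely as an illustration (this is not FusionBench's built-in averaging algorithm), a minimal implementation could interpolate the parameters of two models with identical architectures; the helper name `simple_average_merge` and the `weight` parameter are assumptions introduced here:

```python
import torch
from torch import nn


def simple_average_merge(model1: nn.Module, model2: nn.Module, weight: float = 0.5) -> nn.Module:
    """Illustrative placeholder: merge two same-architecture models by weighted averaging."""
    state1 = model1.state_dict()
    state2 = model2.state_dict()
    merged_state = {}
    for key, tensor1 in state1.items():
        tensor2 = state2[key]
        if tensor1.is_floating_point():
            # Linearly interpolate floating-point parameters and buffers
            merged_state[key] = (1.0 - weight) * tensor1 + weight * tensor2
        else:
            # Keep non-float buffers (e.g., integer counters) from the first model
            merged_state[key] = tensor1
    model1.load_state_dict(merged_state)
    return model1
```

Inside `EvaluatingMergingAlgorithm`, `_merge_models` could simply delegate to such a helper; in practice you would replace it with your actual merging rule (task arithmetic, TIES-style merging, etc.).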
The Orthogonal Projection-based Continual Merging (OPCM) algorithm provides an excellent example of evaluation during merging. For more information, refer to the OPCM implementation.