
# BitDelta


BitDelta applies 1-bit quantization to the weight delta between a fine-tuned model and its base model. For each weight matrix, the delta is quantized into its sign bits together with a trainable high-precision scale factor. The scale factor is initialized to minimize the approximation error in L2 norm and then refined with a few distillation steps. BitDelta shows minimal degradation in model performance and reduces memory consumption in multi-tenancy serving by representing multiple fine-tuned models as a single high-precision base model plus multiple 1-bit deltas.
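The 1-bit quantization step can be illustrated with a minimal NumPy sketch (not the actual fusion_bench implementation, which operates on PyTorch modules and further refines the scale by distillation). For a sign quantizer, the L2-optimal per-matrix scale is the mean absolute value of the delta:

```python
import numpy as np

def bitdelta_compress(w_base, w_ft):
    """Quantize the fine-tuning delta to sign bits plus one scale factor."""
    delta = w_ft - w_base
    sign = np.sign(delta)              # 1-bit sign matrix (+1 / -1)
    # argmin_s ||delta - s * sign||_2  =>  s = mean(|delta|)
    scale = np.abs(delta).mean()
    return sign, scale

def bitdelta_decompress(w_base, sign, scale):
    """Reconstruct an approximate fine-tuned weight matrix."""
    return w_base + scale * sign

# Toy example with random weights.
rng = np.random.default_rng(0)
w_base = rng.normal(size=(4, 4))
w_ft = w_base + 0.01 * rng.normal(size=(4, 4))
sign, scale = bitdelta_compress(w_base, w_ft)
w_hat = bitdelta_decompress(w_base, sign, scale)
```

In the full method this initialization is only the starting point; the scale factors are then trained for a few steps to match the fine-tuned model's outputs.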

## Default Configurations

`config/method/bitdelta/bitdelta.yaml`

```yaml
_target_: fusion_bench.method.bitdelta.BitDeltaAlgorithm
save_dir: null
save_full_model: false
# training arguments
lr: 1e-4
batch_size: 4
num_steps: 100
# dataset arguments
dataset_name: c4
subset: en
split: train
max_length: 128
```

## Example Usage

```bash
fusion_bench method=bitdelta/bitdelta modelpool=CausalLMPool/vicuna-7b-v1.5
```
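The multi-tenancy memory savings can be estimated with back-of-envelope arithmetic. This sketch assumes an FP16 base model (2 bytes per parameter), a 1-bit delta (one sign bit per parameter), and ignores the small per-matrix scale factors; the figures are illustrative, not measured:

```python
# Serving N fine-tunes of a 7B-parameter model (illustrative assumptions).
params = 7e9
n_models = 10

fp16_gb = params * 2 / 1e9            # one full FP16 copy: 14 GB
delta_gb = params / 8 / 1e9           # one 1-bit delta: ~0.875 GB

naive = n_models * fp16_gb            # N independent FP16 fine-tunes
bitdelta = fp16_gb + n_models * delta_gb  # one base + N 1-bit deltas

print(f"naive: {naive:.0f} GB, BitDelta: {bitdelta:.2f} GB")
# -> naive: 140 GB, BitDelta: 22.75 GB
```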

## Implementation Details