fts

Classes

FinetuningScheduler

This callback enables flexible, multi-phase, scheduled fine-tuning of foundational models.

Fine-Tuning Scheduler

Used to implement flexible fine-tuning training schedules

class finetuning_scheduler.fts.FinetuningScheduler(ft_schedule=None, max_depth=-1, base_max_lr=1e-05, restore_best=True, gen_ft_sched_only=False, epoch_transitions_only=False, reinit_lr_cfg=None, allow_untested=False, apply_lambdas_new_pgs=False)

Bases: finetuning_scheduler.fts_supporters.ScheduleImplMixin, finetuning_scheduler.fts_supporters.ScheduleParsingMixin, finetuning_scheduler.fts_supporters.CallbackDepMixin, pytorch_lightning.callbacks.finetuning.BaseFinetuning

This callback enables flexible, multi-phase, scheduled fine-tuning of foundational models. Gradual unfreezing/thawing can help maximize foundational model knowledge retention while allowing (typically upper layers of) the model to optimally adapt to new tasks during transfer learning. FinetuningScheduler orchestrates the gradual unfreezing of models via a fine-tuning schedule that is either implicitly generated (the default) or explicitly provided by the user (more computationally efficient).

Fine-tuning phase transitions are driven by FTSEarlyStopping criteria (a multi-phase extension of EarlyStopping), user-specified epoch transitions or a composition of the two (the default mode). A FinetuningScheduler training session completes when the final phase of the schedule has its stopping criteria met. See Early Stopping for more details on that callback’s configuration.

Schedule definition is facilitated via gen_ft_schedule(), which dumps a default fine-tuning schedule (by default using a naive, 2-parameters-per-level heuristic) that can be adjusted as desired by the user and subsequently passed to the callback. Implicit fine-tuning mode generates the default schedule and proceeds to fine-tune according to the generated schedule. Implicit fine-tuning will often be less computationally efficient than explicit fine-tuning but can often serve as a good baseline for subsequent explicit schedule refinement and can marginally outperform many explicit schedules.

Example:

from pytorch_lightning import Trainer
from finetuning_scheduler.fts import FinetuningScheduler
trainer = Trainer(callbacks=[FinetuningScheduler()])
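
The example above runs in implicit mode. For explicit mode, ft_schedule can also be a properly structured dict. The following is a minimal sketch, assuming the depth-indexed layout (integer phase keys, each with a params list and optional max_transition_epoch and lr entries) from the project's schedule documentation. The parameter names and the check_schedule helper are hypothetical and not part of the library.

```python
# Hypothetical explicit schedule: parameter names are placeholders, and the
# key layout (params / max_transition_epoch / lr) is an assumption here.
ft_schedule = {
    0: {"params": ["model.classifier.weight", "model.classifier.bias"]},
    1: {
        "params": ["model.pooler.dense.weight", "model.pooler.dense.bias"],
        "max_transition_epoch": 3,
        "lr": 1e-5,
    },
}


def check_schedule(schedule):
    """Check that phases are numbered 0..N-1 and each lists at least one parameter."""
    depths = sorted(schedule)
    if depths != list(range(len(depths))):
        raise ValueError(f"phases must be numbered contiguously from 0, got {depths}")
    for depth in depths:
        if not schedule[depth].get("params"):
            raise ValueError(f"phase {depth} must list at least one parameter")
    return depths


print(check_schedule(ft_schedule))
```

A dict of this shape would then be passed as FinetuningScheduler(ft_schedule=ft_schedule).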

Note

Currently, FinetuningScheduler does not support the use of multiple FTSCheckpoint or FTSEarlyStopping callback instances.

Note

While FinetuningScheduler supports the use of ZeroRedundancyOptimizer, setting overlap_with_ddp to True is not supported because that optimizer mode only supports a single parameter group.

Define and configure a scheduled fine-tuning training session.

Parameters

ft_schedule (Union[str, dict, None]) – The fine-tuning schedule to be executed. Usually a .yaml file path but can also be a properly structured Dict. See Specifying a Fine-Tuning Schedule for the basic schedule format. See LR Scheduler Reinitialization for more complex schedule configurations (including per-phase LR scheduler reinitialization). If a schedule is not provided, a default fine-tuning schedule is generated and executed using the provided LightningModule. See the default schedule. Defaults to None.
max_depth (int) – Maximum schedule depth to which the defined fine-tuning schedule should be executed. Specifying -1 or an integer greater than the number of defined schedule layers will result in the entire fine-tuning schedule being executed. Defaults to -1.
base_max_lr (float) – The default maximum learning rate to use for the parameter groups associated with each scheduled fine-tuning depth if not explicitly specified in the fine-tuning schedule. If overridden to None, it will be set to the lr of the first scheduled fine-tuning depth scaled by 1e-1. Defaults to 1e-5.
restore_best (bool) – If True, restore the best available checkpoint (as defined by FTSCheckpoint) before fine-tuning depth transitions. Defaults to True.
gen_ft_sched_only (bool) – If True, generate the default fine-tuning schedule to Trainer.log_dir (it will be named after your LightningModule subclass with the suffix _ft_schedule.yaml) and exit without training. Typically used to generate a default schedule that will be adjusted by the user before training. Defaults to False.
epoch_transitions_only (bool) – If True, use epoch-driven stopping criteria exclusively (rather than composing FTSEarlyStopping and epoch-driven criteria, which is the default). If using this mode, an epoch-driven transition (max_transition_epoch >= 0) must be specified for each phase. If unspecified, max_transition_epoch defaults to -1 for each phase, which signals the application of FTSEarlyStopping criteria only. epoch_transitions_only defaults to False.
reinit_lr_cfg (Optional[Dict]) –
A lr scheduler reinitialization configuration dictionary consisting of at minimum a nested lr_scheduler_init dictionary with a class_path key specifying the class of the lr scheduler to be instantiated. Optionally, an init_args dictionary of arguments to initialize the lr scheduler with may be included. Additionally, one may optionally include arguments to pass to PyTorch Lightning’s lr scheduler configuration LRSchedulerConfig in the pl_lrs_cfg dictionary. By way of example, one could configure this dictionary via the LightningCLI with the following:
```
reinit_lr_cfg:
    lr_scheduler_init:
        class_path: torch.optim.lr_scheduler.StepLR
        init_args:
            step_size: 1
            gamma: 0.7
    pl_lrs_cfg:
        interval: epoch
        frequency: 1
        name: Implicit_Reinit_LR_Scheduler
```
allow_untested (bool) –
If True, allows the use of custom or unsupported training strategies and lr schedulers (e.g. single_tpu, MyCustomStrategy, MyCustomLRScheduler). Defaults to False.

Note

Custom or officially unsupported strategies and lr schedulers can be used by setting allow_untested to True.

Some officially unsupported strategies may work unaltered and are only unsupported due to the Fine-Tuning Scheduler project’s lack of CI/testing resources for that strategy (e.g. single_tpu).

Most unsupported strategies and schedulers, however, are currently unsupported because they require varying degrees of modification to be compatible.

For instance, with respect to strategies, deepspeed requires an add_param_group method, and tpu_spawn requires an override of the current broadcast method to include Python objects.

Regarding lr schedulers, ChainedScheduler and SequentialLR are examples of schedulers not currently supported due to the configuration complexity and semantic conflicts supporting them would introduce. If a supported torch lr scheduler does not meet your requirements, one can always subclass a supported lr scheduler and modify it as required (e.g. LambdaLR is especially useful for this).
apply_lambdas_new_pgs (bool) – If True, applies the most recent lambda in the lr_lambdas list to newly added optimizer groups for lr schedulers that have an lr_lambdas attribute. Note that this option only applies to phases without reinitialized lr schedulers. Phases with defined lr scheduler reinitialization configs will always apply the specified lambdas. Defaults to False.
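
Outside the LightningCLI, the reinit_lr_cfg example above can be written as a plain dictionary. The sketch below assumes pl_lrs_cfg sits alongside lr_scheduler_init rather than inside it. check_reinit_lr_cfg is a hypothetical helper that only checks the minimum structure described for this parameter; it is not the library's own validation.

```python
# Dictionary form of the reinit_lr_cfg example. Assumption: pl_lrs_cfg is a
# sibling of lr_scheduler_init rather than nested inside it.
reinit_lr_cfg = {
    "lr_scheduler_init": {
        "class_path": "torch.optim.lr_scheduler.StepLR",
        "init_args": {"step_size": 1, "gamma": 0.7},
    },
    "pl_lrs_cfg": {
        "interval": "epoch",
        "frequency": 1,
        "name": "Implicit_Reinit_LR_Scheduler",
    },
}


def check_reinit_lr_cfg(cfg):
    """Hypothetical helper: verify the minimum structure described above."""
    init = cfg.get("lr_scheduler_init")
    if not isinstance(init, dict) or "class_path" not in init:
        raise ValueError("reinit_lr_cfg needs lr_scheduler_init with a class_path key")
    # init_args and pl_lrs_cfg are optional, so default them to empty dicts.
    return init["class_path"], init.get("init_args", {}), cfg.get("pl_lrs_cfg", {})


class_path, init_args, pl_lrs_cfg = check_reinit_lr_cfg(reinit_lr_cfg)
print(class_path, init_args, pl_lrs_cfg["interval"])
```

The dictionary would then be supplied as FinetuningScheduler(reinit_lr_cfg=reinit_lr_cfg).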

_fts_state

The internal FinetuningScheduler state.

Type: finetuning_scheduler.fts_supporters.FTSState

freeze_before_training(pl_module)

Freezes all model parameters so that parameter subsets can be subsequently thawed according to the fine-tuning schedule.

Parameters: pl_module (LightningModule) – The target LightningModule whose parameters will be frozen
Return type: None
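
The freeze-then-thaw pattern can be illustrated without a real model. In this sketch, simple namespace objects stand in for model parameters (real parameters would be torch tensors exposing the same requires_grad flag). The parameter names and both helpers are hypothetical.

```python
from types import SimpleNamespace

# Stand-ins for named model parameters; names are hypothetical.
named_params = {
    "encoder.weight": SimpleNamespace(requires_grad=True),
    "classifier.weight": SimpleNamespace(requires_grad=True),
    "classifier.bias": SimpleNamespace(requires_grad=True),
}


def freeze_all(params):
    """Disable gradients for every parameter."""
    for p in params.values():
        p.requires_grad = False


def thaw(params, names):
    """Re-enable gradients for the parameters listed in one schedule phase."""
    for name in names:
        params[name].requires_grad = True


freeze_all(named_params)
thaw(named_params, ["classifier.weight", "classifier.bias"])  # e.g. phase 0
trainable = sorted(n for n, p in named_params.items() if p.requires_grad)
print(trainable)
```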

load_state_dict(state_dict)

After loading a checkpoint, load the saved FinetuningScheduler callback state and update the current callback state accordingly.

Parameters: state_dict (Dict[str, Any]) – The FinetuningScheduler callback state dictionary that will be loaded from the checkpoint
Return type: None

on_before_zero_grad(trainer, pl_module, optimizer)

After the latest optimizer step, update the _fts_state, incrementing the global fine-tuning steps taken.

Parameters

trainer (Trainer) – The Trainer object
pl_module (LightningModule) – The LightningModule object
optimizer (Optimizer) – The Optimizer to which parameter groups will be configured and added.

Return type

None

on_fit_start(trainer, pl_module)

Before beginning training, ensure an optimizer configuration supported by FinetuningScheduler is present.

Parameters

trainer (Trainer) – The Trainer object
pl_module (LightningModule) – The LightningModule object

Raises

MisconfigurationException – If more than one optimizer is configured, which indicates a configuration error.

Return type

None

on_train_end(trainer, pl_module)

Synchronize internal _fts_state on end of training to ensure final training state is consistent with epoch semantics.

Parameters

trainer (Trainer) – The Trainer object
pl_module (LightningModule) – The LightningModule object

Return type

None

on_train_epoch_start(trainer, pl_module)

Before beginning a training epoch, configure the internal _fts_state, prepare the next scheduled fine-tuning level and store the updated optimizer configuration before continuing training.

Parameters

trainer (Trainer) – The Trainer object
pl_module (LightningModule) – The LightningModule object

Return type

None

restore_best_ckpt()

Restore the current best model checkpoint, according to best_model_path.

Return type: None

setup(trainer, pl_module, stage)

Validate that a compatible Strategy is being used and ensure all FinetuningScheduler callback dependencies are met. If a valid configuration is present, then either dump the default fine-tuning schedule OR

1. configure the FTSEarlyStopping callback (if relevant)
2. initialize the _fts_state
3. freeze the target LightningModule parameters

Finally, initialize the FinetuningScheduler training session in the training environment.

Parameters

trainer (Trainer) – The Trainer object
pl_module (LightningModule) – The LightningModule object
stage (str) – The RunningStage.{SANITY_CHECKING,TRAINING,VALIDATING}. Defaults to None.

Raises

SystemExit – Gracefully exit before training if only generating and not executing a fine-tuning schedule.
MisconfigurationException – If the Strategy being used is not currently compatible with the FinetuningScheduler callback.

Return type

None
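
When gen_ft_sched_only is True, setup dumps the default schedule and exits via SystemExit. The file naming described for that parameter (the LightningModule subclass name plus the _ft_schedule.yaml suffix, written to Trainer.log_dir) can be sketched as below. default_schedule_path and MyLightningModule are hypothetical illustrations, not library code, and the log directory is a made-up example.

```python
from pathlib import Path


class MyLightningModule:  # hypothetical stand-in for a LightningModule subclass
    pass


def default_schedule_path(log_dir, module):
    """Build <log_dir>/<ModuleClassName>_ft_schedule.yaml."""
    return Path(log_dir) / f"{type(module).__name__}_ft_schedule.yaml"


path = default_schedule_path("lightning_logs/version_0", MyLightningModule())
print(path.name)
```

The resulting file can be edited and then passed back to the callback through ft_schedule.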

should_transition(trainer)

Phase transition logic is contingent on whether we are composing FTSEarlyStopping criteria with epoch-driven transition constraints or exclusively using epoch-driven transition scheduling (i.e., epoch_transitions_only is True).

Parameters: trainer (Trainer) – The Trainer object
Return type: bool
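
The two transition modes can be sketched as a small pure function. This is a simplified illustration under stated assumptions, not the library's implementation: it assumes a phase's epoch-driven criterion is met once the current epoch reaches max_transition_epoch, treats -1 as "no epoch-driven transition", and ignores any special handling of the final phase. The function and argument names are hypothetical.

```python
def should_transition_sketch(current_epoch, max_transition_epoch,
                             early_stopping_triggered, epoch_transitions_only=False):
    """Simplified transition decision (an assumption, not the library's code)."""
    # max_transition_epoch == -1 means no epoch-driven transition for this phase.
    epoch_driven = 0 <= max_transition_epoch <= current_epoch
    if epoch_transitions_only:
        # Only the epoch criterion counts; early stopping is ignored.
        return epoch_driven
    # Default (composed) mode: either criterion can end the phase.
    return early_stopping_triggered or epoch_driven


print(should_transition_sketch(5, 3, early_stopping_triggered=False))
print(should_transition_sketch(2, -1, early_stopping_triggered=True))
```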

state_dict()

Before saving a checkpoint, add the FinetuningScheduler callback state to be saved.

Returns

The FinetuningScheduler callback state dictionary that will be added to the checkpoint

Return type

Dict[str, Any]

step()

Prepare and execute the next scheduled fine-tuning level:

1. Restore the current best model checkpoint if appropriate
2. Thaw model parameters according to the defined schedule
3. Synchronize the states of FitLoop and _fts_state

Note

The FinetuningScheduler callback initially only supports single-schedule/optimizer fine-tuning configurations.

Return type: None

step_pg(optimizer, depth, depth_sync=True)

Configure optimizer parameter groups for the next scheduled fine-tuning level, adding parameter groups beyond the restored optimizer state up to current_depth.

Parameters

optimizer (Optimizer) – The Optimizer to which parameter groups will be configured and added.
depth (int) – The maximum index of the fine-tuning schedule for which to configure the optimizer parameter groups.
depth_sync (bool) – If True, configure optimizer parameter groups for all depth indices greater than the restored checkpoint. If False, configure groups only for the specified depth. Defaults to True.

Return type

None
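
The effect of depth_sync on which schedule depths receive new parameter groups can be sketched as follows. restored_depth is a hypothetical name for the depth recorded in the restored checkpoint, and the helper is an illustration of the parameter description above rather than library code.

```python
def depths_to_configure(depth, restored_depth, depth_sync=True):
    """Return the schedule depth indices whose parameter groups would be added."""
    if depth_sync:
        # Every depth after the restored checkpoint, up to and including `depth`.
        return list(range(restored_depth + 1, depth + 1))
    # Only the requested depth.
    return [depth]


print(depths_to_configure(3, restored_depth=1))
print(depths_to_configure(3, restored_depth=1, depth_sync=False))
```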

property curr_depth: int

Index of the fine-tuning schedule depth currently being trained.

Returns: The index of the current fine-tuning training depth
Return type: int

property depth_remaining: int

Remaining number of fine-tuning training levels in the schedule.

Returns: The number of remaining fine-tuning training levels
Return type: int
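
How max_depth, curr_depth and depth_remaining relate can be sketched in a few lines. This assumes zero-based depth indices and that the remaining count is the resolved maximum depth minus the current depth, floored at zero. Both helpers are hypothetical and are not the library's own code.

```python
def resolve_max_depth(max_depth, num_levels):
    """-1 or a value beyond the last level selects the whole schedule."""
    last = num_levels - 1
    return last if max_depth == -1 or max_depth > last else max_depth


def depth_remaining(curr_depth, max_depth, num_levels):
    """Levels still to be trained after the current one (never negative)."""
    return max(resolve_max_depth(max_depth, num_levels) - curr_depth, 0)


# A three-level schedule (depths 0, 1, 2) executed in full:
print(depth_remaining(curr_depth=0, max_depth=-1, num_levels=3))
```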