
The Shrunk LoRA Guide


What the Heck is a Subspace Factor


The subspace factor is a measure of how well the subspace spanned by the LoRA layer's low-rank update factors (U and Vh) aligns with the subspace spanned by the weight matrix of the corresponding layer in the base model.

Specifically, the base_subspace_factors attribute computed in the DLoRA class represents the dot products between the rows of Vh and the rows of the base model weight matrix W_base, weighted by the corresponding columns of U (roughly uᵢᵀ W_base vᵢ for each rank-one component i). These products measure how strongly the directions spanned by the LoRA update correlate with the directions already present in the base model weights.

A high subspace factor indicates that the LoRA layer is updating a subspace that is already present and important in the base model weights. Conversely, a low subspace factor suggests that the LoRA layer is introducing new directions that were not strongly represented in the base model.

The subspace_ratios attribute computes the ratio of the LoRA singular values to the absolute values of the subspace factors. This gives a sense of how much the LoRA layer is scaling the existing subspaces versus introducing new subspaces relative to the base model.
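To make this concrete, below is a minimal sketch of how such factors and ratios could be computed for a single layer. The shapes, variable names, and the per-component reduction (uᵢᵀ W_base vᵢ) are assumptions chosen for illustration; they are not the actual DLoRA source.

```python
import torch as pt

# Illustrative shapes: base weight W_base is (m, n), LoRA rank is r.
m, n, r = 768, 768, 8
W_base = pt.randn(m, n)

# SVD-style factorization of the LoRA update: delta_W = U @ diag(S) @ Vh
U = pt.linalg.qr(pt.randn(m, r)).Q        # (m, r), orthonormal columns
Vh = pt.linalg.qr(pt.randn(n, r)).Q.T     # (r, n), orthonormal rows
S = pt.rand(r)                            # LoRA singular values

# Assumed per-component subspace factor: u_i^T @ W_base @ v_i,
# i.e. how much W_base already "contains" each LoRA direction pair.
base_subspace_factors = pt.einsum("mi,mn,in->i", U, W_base, Vh)

# Assumed ratio of LoRA singular values to the (absolute) subspace factors:
# large ratios -> the LoRA mostly introduces new directions,
# small ratios -> it mostly rescales directions already in W_base.
subspace_ratios = S / base_subspace_factors.abs()

print(base_subspace_factors)
print(subspace_ratios)
```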

In summary, the subspace factor quantifies the alignment between the LoRA update and the base model weight subspaces, providing insight into how the LoRA layer is adapting the base model.

What is the Spectral Norm


The spectral norm (also known as the operator 2-norm) is equal to the maximum singular value of a matrix. It is calculated as:

spectral_norm(W) = σ_max(W)

Where σ_max(W) is the largest singular value of the matrix W.
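As a quick (hypothetical) check in PyTorch, the spectral norm and the leading singular value coincide:

```python
import torch as pt

W = pt.randn(512, 256)

sigma_max = pt.linalg.svdvals(W)[0]            # largest singular value
spec_norm = pt.linalg.matrix_norm(W, ord=2)    # spectral / operator 2-norm

print(pt.isclose(sigma_max, spec_norm))        # tensor(True)
```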

The base_spectral_norm attribute of the DLoRA class is computed as:

self.base_spectral_norm = pt.svd_lowrank(W_base, q=1, niter=niter)[1][0].item()

Here, W_base is the weight matrix of the corresponding layer in the base model, flattened into a 2D matrix. pt.svd_lowrank computes a randomized low-rank approximation of the singular value decomposition; with q=1, its singular-value output has a single entry, an approximation of σ_max(W_base).
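For a rough sense of the approximation quality, the low-rank estimate can be compared against an exact SVD; the matrix, the flattening step, and niter=4 below are arbitrary choices for illustration:

```python
import torch as pt

W_base = pt.randn(64, 32, 3, 3).flatten(1)   # e.g. a conv weight flattened to 2D

approx = pt.svd_lowrank(W_base, q=1, niter=4)[1][0].item()
exact = pt.linalg.svdvals(W_base)[0].item()

print(approx, exact)   # the randomized estimate should be close to the exact value
```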

The spectral norm provides a scale for the singular values of a matrix. In the case of LoRA, the sval_ratios attribute computes the ratio of the LoRA singular values to the base model's spectral norm:

self.sval_ratios = self.S / self.base_spectral_norm

This ratio indicates how large the LoRA update's singular values are relative to the largest singular value of the base model weight matrix.
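Putting the pieces together, here is a small sketch of how sval_ratios might be derived for a LoRA layer stored as a pair of low-rank matrices; the down/up naming, shapes, and rank are assumptions for illustration rather than the actual DLoRA code:

```python
import torch as pt

rank, dim = 8, 768
down = pt.randn(rank, dim)          # LoRA "A" matrix: (r, in_features)
up = pt.randn(dim, rank)            # LoRA "B" matrix: (out_features, r)
W_base = pt.randn(dim, dim)         # base layer weight

# Singular values of the LoRA update delta_W = up @ down (only the first r are nonzero)
S = pt.linalg.svdvals(up @ down)[:rank]

# Approximate spectral norm of the base weight, as in the snippet above
base_spectral_norm = pt.svd_lowrank(W_base, q=1, niter=4)[1][0].item()

# Ratio of each LoRA singular value to the base model's largest singular value
sval_ratios = S / base_spectral_norm
print(sval_ratios)
```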

In summary, the spectral norm is the maximum singular value of a matrix, and it is used in the LoRA context to normalize and interpret the scale of the LoRA singular values relative to the base model weights.