Summary

Low-rank adaptation (LoRA) is an approach to speed up fine-tuning of neural networks, typically transformers such as language models, originally proposed by (1). In contrast with standard fine-tuning that updates all weights, LoRa updates only a few via: where and are two small weight matrices.

Details

  • is proposed to be mostly equivalent to (the derivative of that is typically calculated during backdrop)
  • and will have a shape and where is much less than or
  • In practice, the new weight matrix can be kept separate from the original matrix
  • Two hyper parameters:
  • : The rank of the LoRA matrices
  • : Scales the influence of the matrices relative to the original weight matrix

Figures

From https://magazine.sebastianraschka.com/p/lora-and-dora-from-scratch

See also

1.
Hu EJ, Shen Y, Wallis P, Allen-Zhu Z, Li Y, Wang S, et al. LoRA: Low-Rank Adaptation of Large Language Models. In: ICLR 2022. 2022. Available from: https://openreview.net/forum?id=nZeVKeeFYf9