Summary
Low-rank adaptation can work with far fewer trainable parameters than standard LoRA, but only when the adapter is trained with reinforcement learning (1,2). Additionally, initializing the adapter matrices from the SVD of the pretrained weights outperforms random initialization.
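To make the parameter savings concrete, here is a minimal NumPy sketch of an SVD-initialized adapter in the spirit of (1). It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy layer size, and the zero initialization of the small trainable matrix R are all hypothetical choices made for this example.

```python
import numpy as np


def lora_param_count(d_out, d_in, r):
    # Standard LoRA trains two factors: (d_out x r) and (r x d_in).
    return r * (d_in + d_out)


def small_adapter_param_count(r):
    # With frozen outer factors, only the inner (r x r) matrix is trained.
    return r * r


def svd_init_adapter(W, r):
    # Truncated SVD of the pretrained weight W (d_out x d_in).
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * S[:r]   # frozen, shape (d_out, r): top-r left vectors scaled by singular values
    B = Vt[:r, :]          # frozen, shape (r, d_in): top-r right vectors
    R = np.zeros((r, r))   # trainable; zero init is an assumption of this sketch
    return A, R, B


def adapted_weight(W, A, R, B):
    # Effective weight: pretrained weight plus the low-rank update A @ R @ B.
    return W + A @ R @ B


# Toy layer (hypothetical sizes) standing in for a pretrained weight matrix.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))
A, R, B = svd_init_adapter(W, r=4)
W_adapted = adapted_weight(W, A, R, B)
```

For this toy 64 x 32 layer at rank 4, standard LoRA would train 4 * (32 + 64) = 384 parameters, while the frozen-factor variant trains only 4 * 4 = 16. Because R starts at zero in this sketch, the adapted weight equals the pretrained weight before any training step.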
Figures
Figures from (1)
From (2); SFT = supervised fine-tuning; GRPO = Group Relative Policy Optimization, a type of reinforcement learning
1. Bałazy K, Banaei M, Aberer K, Tabor J. LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters. Frontiers in Artificial Intelligence and Applications. 2025; Available from: https://doi.org/10.3233/faia251185
2. Morris JX, Mireshghallah N, Ibrahim M, Mahloujifar S. Learning to Reason in 13 Parameters. 2026; Available from: https://arxiv.org/abs/2602.04118