Summary
Positional encoding is the process of giving a model information about a token’s absolute or relative position in a sequence. This is particularly important for transformers, whose self-attention is otherwise indifferent to token order and has no built-in knowledge of position.
Absolute positional encoding
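Sinusoidal positional encodings: Used in the original transformer (1). The authors reported results nearly identical to those obtained with learned positional embeddings. Each pair of embedding dimensions is filled with a sine and a cosine of the same frequency:
$PE(pos, 2i) = \sin\left(pos / 10000^{2i / d_{model}}\right)$
$PE(pos, 2i+1) = \cos\left(pos / 10000^{2i / d_{model}}\right)$
$i$: The index of the specific entry (dimensions $2i$ and $2i+1$ share one frequency)
$pos$: The position of the token
$d_{model}$: The embedding dimension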
Ref https://production-media.paperswithcode.com/methods/05577c08-d6ac-4b8b-9fd0-55739ba42383.png
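As a rough illustration of the sinusoidal scheme from (1), the sketch below fills dimension 2i with sin(pos / 10000^(2i/d_model)) and dimension 2i+1 with the matching cosine. The function name `sinusoidal_encoding` and its parameters are made up for this note, and it uses plain Python lists rather than a tensor library, so treat it as a minimal sketch rather than a reference implementation.

```python
import math


def sinusoidal_encoding(num_positions, d_model, base=10000.0):
    """Return a num_positions x d_model table of sinusoidal encodings.

    Hypothetical helper for illustration; assumes d_model is even.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(num_positions):
        row = [0.0] * d_model
        for i in range(d_model // 2):
            # Each (2i, 2i+1) pair shares one frequency.
            angle = pos / base ** (2 * i / d_model)
            row[2 * i] = math.sin(angle)
            row[2 * i + 1] = math.cos(angle)
        table.append(row)
    return table


# One row per position; each row would be added to that token's embedding.
pe = sinusoidal_encoding(num_positions=4, d_model=8)
```

Position 0 always encodes to alternating 0 and 1, since sin(0) = 0 and cos(0) = 1. Later positions trace out sinusoids whose wavelength grows with the dimension index.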
Relative positional encodings
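Rotary positional embeddings (RoPE): Rotates the queries and keys by a position-dependent angle prior to the calculation of attention, so that the query-key dot product depends only on the relative offset between the two positions.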
Ref (2)
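The rotation can be sketched in a few lines. The example below assumes the frequency schedule theta_i = 10000^(-2i/d) described in (2) and rotates consecutive pairs of a vector by pos * theta_i. The names `rotate` and `dot` and the sample vectors are invented for this note. Real implementations operate on batched tensors and often pair dimensions differently.

```python
import math


def rotate(vec, pos, base=10000.0):
    """Rotate each consecutive pair of vec by an angle proportional to pos.

    Hypothetical helper for illustration; assumes len(vec) is even.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        # Standard 2D rotation applied to the (2i, 2i+1) pair.
        out[2 * i] = x * c - y * s
        out[2 * i + 1] = x * s + y * c
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


q = [0.3, -1.2, 0.7, 0.5]   # made-up query vector
k = [1.1, 0.4, -0.6, 0.9]   # made-up key vector

# Same offset between query and key positions, different absolute positions.
score_a = dot(rotate(q, 5), rotate(k, 2))
score_b = dot(rotate(q, 13), rotate(k, 10))
```

Here `score_a` and `score_b` agree up to floating-point error because both pairs of positions are the same distance apart. This is the sense in which the encoding is relative. The rotation also leaves vector norms unchanged.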
1. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention Is All You Need. 2017. Available from: https://arxiv.org/abs/1706.03762
2. Su J, Ahmed M, Lu Y, Pan S, Bo W, Liu Y. RoFormer: Enhanced transformer with Rotary Position Embedding. Neurocomputing. 2024;568:127063. Available from: https://doi.org/10.1016/j.neucom.2023.127063