Summary

Positional encoding adds information about a token’s absolute or relative position in a sequence. This is particularly important for transformers, whose self-attention treats its input as an unordered set and therefore has no built-in knowledge of position.

Absolute positional encoding

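Sinusoidal positional encodings: Fixed sine and cosine functions of the token position, evaluated at different frequencies and added to the token embeddings. Used in the original transformer (1), whose authors found their performance competitive with learned encodings.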

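$$PE_{(pos,\,2i)} = \sin\left(\frac{pos}{10000^{2i/d_{model}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$

$i$: The index of the specific entry (each $i$ covers one sine/cosine pair of dimensions)
$pos$: The position of the token
$d_{model}$: The dimension of the embedding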

Ref https://production-media.paperswithcode.com/methods/05577c08-d6ac-4b8b-9fd0-55739ba42383.png
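As a rough illustration, the sinusoidal encoding can be sketched in plain Python. The function name sinusoidal_positional_encoding and its arguments seq_len and d_model are hypothetical names chosen for this example, and the sketch assumes an even d_model; it is not taken from any particular library.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model list of lists of sinusoidal encodings.

    Assumption for this sketch: d_model is even, so every index i
    maps to one (sine, cosine) pair of dimensions.
    """
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    pe = []
    for pos in range(seq_len):
        row = [0.0] * d_model
        for i in range(d_model // 2):
            # Same argument for the sine and cosine entry of pair i.
            angle = pos / (10000 ** (2 * i / d_model))
            row[2 * i] = math.sin(angle)      # even dimension 2i
            row[2 * i + 1] = math.cos(angle)  # odd dimension 2i + 1
        pe.append(row)
    return pe


pe = sinusoidal_positional_encoding(seq_len=4, d_model=8)
print(len(pe), len(pe[0]))  # 4 8
print(pe[0][:4])            # [0.0, 1.0, 0.0, 1.0]
```

Each row would be added element-wise to the embedding of the token at that position. Position 0 always yields alternating 0 and 1, since sin(0) = 0 and cos(0) = 1.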

Relative positional encodings

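Rotary positional embeddings (RoPE): Rotate the queries and keys by a position-dependent angle prior to the calculation of attention, so that position enters the query-key dot product only through the relative offset between the two tokens.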

Ref (2)
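A minimal sketch of the rotation, assuming the formulation in (2) where adjacent dimension pairs (2i, 2i+1) are rotated by the angle pos * 10000^(-2i/d). The names rope_rotate and dot, and the example vectors, are hypothetical and chosen only for illustration. Some implementations pair the dimensions differently, so this should not be read as any specific library's behaviour.

```python
import math


def rope_rotate(vec, pos, base=10000.0):
    """Rotate each adjacent pair of entries in vec by a position-dependent angle.

    Assumption for this sketch: len(vec) is even and pair i uses the
    frequency base ** (-2 * i / d).
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("this sketch assumes an even dimension")
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        # Standard 2D rotation of the pair (x, y) by angle theta.
        out[2 * i] = x * c - y * s
        out[2 * i + 1] = x * s + y * c
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 0.5, -0.3, 2.0]  # example query vector
k = [0.2, -1.0, 0.7, 0.4]  # example key vector

# Both pairs of positions have the same offset (2), at different absolute positions.
score_a = dot(rope_rotate(q, 5), rope_rotate(k, 3))
score_b = dot(rope_rotate(q, 12), rope_rotate(k, 10))
print(score_a, score_b)
```

The two scores agree up to floating-point error, which illustrates why the method is classed as a relative encoding: shifting both tokens by the same amount leaves the attention score unchanged.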

1. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention Is All You Need. 2017; Available from: https://arxiv.org/abs/1706.03762
2. Su J, Ahmed M, Lu Y, Pan S, Bo W, Liu Y. RoFormer: Enhanced transformer with Rotary Position Embedding. Neurocomputing. 2024;568:127063. Available from: https://doi.org/10.1016/j.neucom.2023.127063