It is a differentiable approximation of the argmax function, useful when one wants to sample from a categorical distribution while keeping the operation differentiable. 27.07.2023 17:54 aior
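To make that concrete, here is a minimal sketch of the idea, assuming the technique being described is the Gumbel-Softmax relaxation (the post itself does not name it). The function name `gumbel_softmax_sample` and the `temperature` parameter are illustrative choices, not taken from the post. The sketch uses plain NumPy, so it only shows the forward computation; gradients would come from running the same arithmetic in an autodiff framework.

```python
import numpy as np


def gumbel_softmax_sample(logits, temperature=1.0, rng=None):
    """Return a relaxed (soft) one-hot sample for the given unnormalized log-probabilities.

    Assumption: this follows the Gumbel-Softmax construction, i.e. add
    Gumbel(0, 1) noise to the logits and replace argmax with a softmax.
    """
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(logits, dtype=float)

    # Taking argmax of (logits + Gumbel noise) would give an exact categorical sample.
    noise = rng.gumbel(loc=0.0, scale=1.0, size=logits.shape)

    # Softmax over the perturbed logits is the differentiable stand-in for argmax.
    # A lower temperature pushes the output closer to a one-hot vector.
    scaled = (logits + noise) / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


if __name__ == "__main__":
    probs = np.array([0.1, 0.6, 0.3])
    sample = gumbel_softmax_sample(np.log(probs), temperature=0.5)
    print(sample, sample.sum())
```

The output is a probability vector rather than a hard one-hot vector, which is what lets gradients pass through it. If you are on PyTorch, `torch.nn.functional.gumbel_softmax` provides this operation with a `tau` temperature argument and a `hard` flag for discretized outputs.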