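XLNet builds on the Transformer-XL model and overcomes some of the limitations of BERT. Instead of the masked language modeling objective used in BERT, XLNet is trained with a permutation-based objective: each token is predicted from the tokens that precede it in a randomly sampled factorization order, so the model learns from context on both sides without relying on [MASK] tokens. 27.07.2023 17:54 aior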
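To make the permutation idea more concrete, here is a rough sketch of how a factorization order decides which tokens are visible when predicting each position. This is only an illustration: the function name `permutation_mask`, the example sentence, and the fixed order are my own assumptions, not code from XLNet itself, and the real model adds more machinery on top of this.

```python
def permutation_mask(order):
    """Build a visibility mask from a factorization order (hypothetical helper).

    order lists the positions in the sequence they are predicted in.
    mask[i][j] is True when position i may look at position j, i.e. when
    j comes strictly earlier than i in the factorization order.
    """
    n = len(order)
    # rank[pos] = step at which position pos is predicted
    rank = {pos: step for step, pos in enumerate(order)}
    return [[rank[j] < rank[i] for j in range(n)] for i in range(n)]


# Illustrative sentence and order (both made up for this example).
tokens = ["New", "York", "is", "a", "city"]
order = [2, 0, 4, 1, 3]  # in training this order would be sampled at random

mask = permutation_mask(order)
for pos in order:
    context = [tokens[j] for j in range(len(tokens)) if mask[pos][j]]
    print(f"predict {tokens[pos]!r} given {context}")
```

Note that the tokens themselves stay in their original positions. Only the prediction order changes, so across many sampled orders a token ends up being predicted from context on its left and on its right.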