BigBird, developed by Google, is designed to handle longer context. It combines sparse attention mechanisms with the original transformer model to efficiently handle sequences of length up to 8K tokens. 27.07.2023 17:54 aior