funnel-transformer

This model uses a funnel-shaped encoder followed by a decoder that operates at the full sequence length. The encoder gradually compresses the hidden states along the sequence dimension, so later layers process fewer positions, which reduces the memory footprint and speeds up the model.
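
To make the compression concrete, here is a minimal NumPy sketch of the idea, not the model's actual implementation. It assumes mean pooling with stride 2 between encoder blocks and a simple repeat-based upsampling in the decoder; the function names (`pool_sequence`, `funnel_lengths`, `upsample`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def pool_sequence(hidden, stride=2):
    """Mean-pool hidden states along the sequence axis.

    hidden: array of shape (seq_len, d_model).
    Assumption: plain mean pooling; the real model may pool differently.
    """
    seq_len, d_model = hidden.shape
    pad = (-seq_len) % stride
    if pad:
        # Repeat the last position so the length divides evenly by the stride.
        tail = np.repeat(hidden[-1:], pad, axis=0)
        hidden = np.concatenate([hidden, tail], axis=0)
    return hidden.reshape(-1, stride, d_model).mean(axis=1)

def funnel_lengths(seq_len, num_blocks, stride=2):
    """Sequence length seen by each encoder block (hypothetical helper)."""
    lengths = [seq_len]
    for _ in range(num_blocks - 1):
        seq_len = -(-seq_len // stride)  # ceiling division
        lengths.append(seq_len)
    return lengths

def upsample(hidden, stride, target_len):
    """Decoder-side sketch: repeat each pooled state to restore full length."""
    return np.repeat(hidden, stride, axis=0)[:target_len]

# Toy input: 8 positions, 2 features each.
hidden = np.arange(16, dtype=float).reshape(8, 2)
pooled = pool_sequence(hidden)          # 4 positions remain
restored = upsample(pooled, 2, 8)       # back to 8 positions
lengths = funnel_lengths(128, 3)        # per-block sequence lengths
```

Because self-attention cost grows with the square of the sequence length, halving the number of positions between blocks cuts the attention cost of each later block by roughly a factor of four under these assumptions.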