A multimodal language model capable of understanding and generating language based on text and image inputs.
27.07.2023 17:54 aior