What is Multimodal AI, anyway?
Clearly, we see multimodal AI (MMAI) everywhere nowadays. Enter a prompt in Midjourney and you get an image. From VQGAN-CLIP, whose author released her code several years ago as one of the first examples of using text prompts to generate images, to RunwayGen, which makes videos (like this 4-sec clip I