Fuyu-8B is a multimodal model capable of:
🖼️ Visual Question Answering
🖼️ Image Captioning
🖼️ Text localization and more!

Fuyu-8B
A multimodal architecture for AI agents