Fuyu-8B - A multimodal architecture for AI agents
Fuyu-8B is a multimodal model capable of:
🖼️ Visual Question Answering
🖼️ Image Captioning
🖼️ Text localization and more!