Gemma 3 - Build with multimodal AI from Google

Ambassador

Gemma 3 is Google's new family of models for multimodal AI (text, images, video). Available in 1B to 27B sizes, with a 128K context window and support for 140+ languages. Includes ShieldGemma 2 for safety.

Zac Zuo

Ambassador

Hunter

📌

Hi everyone!

Check out Gemma 3, Google's latest family of models for building multimodal AI applications! This is a big step up from the previous Gemma versions, adding video understanding and a much larger context window.

Key features:

🖼️ Multimodal: Handles text, images, and short videos.
🧠 Multiple Sizes: Available in 1B, 4B, 12B, and 27B parameter versions.
↔️ 128K Context Window: A major increase, allowing for processing much more information.
🌍 Multilingual: Supports over 35 languages out-of-the-box, pretrained on over 140.
🛠️ Integrates with Hugging Face Transformers, Ollama, JAX, Keras, PyTorch, Unsloth, vLLM, and Gemma.cpp.
🛡️ Safety: Includes a separate 4B model, ShieldGemma 2, for image safety classification.
⚡ Optimized for NVIDIA GPUs, Google Cloud TPUs, and AMD GPUs.

Gemma 3 is a clear sign of how quickly the multimodal AI space is advancing.

Let's start exploring its capabilities in Google AI Studio!
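
For a rough sense of what the sizes listed above mean for hardware, here is a back-of-the-envelope sketch in Python. It assumes the nominal parameter counts (1B, 4B, 12B, 27B) are exact and counts weights only; the bytes-per-parameter figures for bf16, int8, and int4 are standard, but real usage will be higher once the KV cache, activations, and runtime overhead are included. The function name is just for illustration.

```python
# Weight-only memory estimate: parameters x bytes per parameter.
# Assumption: nominal sizes are treated as exact parameter counts.
# Not counted: KV cache, activations, framework overhead.

PARAMS_BILLIONS = {"1B": 1, "4B": 4, "12B": 12, "27B": 27}
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions, bytes_per_param):
    """Return approximate weight memory in decimal gigabytes."""
    return params_billions * 1e9 * bytes_per_param / 1e9


for size, billions in PARAMS_BILLIONS.items():
    row = ", ".join(
        f"{dtype}: {weight_memory_gb(billions, nbytes):.1f} GB"
        for dtype, nbytes in BYTES_PER_PARAM.items()
    )
    print(f"Gemma 3 {size} -> {row}")
```

By this estimate the smaller sizes fit comfortably on consumer hardware once quantized, while the 27B model needs a much larger memory budget at full bf16 precision.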

5mo ago

Stain Lu

Grimo

@zaczuo I have to say Gemini 2 powered by Gemma 3 is just incredible in its multimodal capabilities! The image generation and editing in the chat interface blew me away. I was able to create and modify images right in the conversation flow without switching between tools or repeatedly uploading and downloading. This kind of seamless integration between text and visual creation is exactly what I've been waiting for in AI assistants. The quality and speed of the image generation are impressive too, much more responsive than other multimodal models I've tried. Congrats on the launch!

5mo ago

Rami

Go Mail Merge

Can we run this on our own device, or is it only available via the API?

5mo ago

Zac Zuo

Ambassador

Hunter

@kingromstar Yes, you can run it locally; the 1B and 4B models are designed for that. Please check this out.
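
As one way to try a small model locally, here is a minimal sketch that talks to a local Ollama server (Ollama is one of the integrations listed in the launch post). It assumes Ollama is installed and running on its default port, and that a Gemma 3 model has already been pulled; the `gemma3:1b` tag and the helper names `build_payload` and `generate` are assumptions for illustration, so swap in whatever tag your install reports.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt, model="gemma3:1b"):
    """Build the JSON body for a single non-streaming generation request."""
    # The model tag is an assumption; use the tag shown by your local install.
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt, model="gemma3:1b", url=OLLAMA_URL):
    """POST the prompt to the local server and return the generated text."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]


# Example call (requires the local server to be running):
# print(generate("Explain what a context window is in one sentence."))
```

Nothing leaves the machine in this setup, which is the main appeal of the smaller sizes for on-device use.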

5mo ago

Henry Habib

Google is not stopping. This is a solid addition to the multimodal space, and it makes me wonder what would be a good first project to build with it.

5mo ago

Mike Staub

It looks like Google is going to win the AI race.

5mo ago

Tim Chosen

CV Bot

This sounds interesting, will give it a spin.

5mo ago