Gemma 2
State-of-the-art performance in lightweight packages, built on Gemini research.
General Info
- Publisher: Google
- Release Date: June 2024
- Parameters: 2B, 9B, 27B
- Context Window: 8,192 tokens (see the token-count sketch after this list)
- License: Gemma Terms of Use
- Key Capabilities: Efficient Inference, Reasoning, Summarization
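Because every Gemma 2 size shares the same 8,192-token window, it can be useful to check prompt length before generating. Below is a minimal sketch, assuming the google/gemma-2-9b-it tokenizer (all Gemma 2 checkpoints share one tokenizer); fits_in_context is a hypothetical helper, not part of any library.
Python
from transformers import AutoTokenizer

# Minimal sketch: check a prompt against Gemma 2's 8,192-token context window.
# Assumes the 9B instruction-tuned checkpoint; all Gemma 2 sizes share one tokenizer.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")

MAX_CONTEXT = 8192

def fits_in_context(prompt: str, max_new_tokens: int = 256) -> bool:
    # Reserve the generation budget so the reply is not cut off by the window.
    prompt_tokens = len(tokenizer(prompt)["input_ids"])
    return prompt_tokens + max_new_tokens <= MAX_CONTEXT

print(fits_in_context("Explain the theory of relativity to a 5-year-old."))  # True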
Gemma 2 models punch above their weight class: the 9B and 27B variants deliver reasoning performance competitive with open models well beyond their size, which makes them a strong fit for consumer hardware and cost-sensitive deployments.
Hello World Guide
Run Gemma 2 with the Hugging Face transformers library (version 4.42 or newer, which added Gemma 2 support).
Python
from transformers import pipeline
import torch

# Load the 9B instruction-tuned checkpoint; device_map="auto" places it on a GPU if available.
pipe = pipeline(
    "text-generation",
    model="google/gemma-2-9b-it",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Gemma 2 uses a chat format; the pipeline applies the chat template
# automatically when given a list of messages.
messages = [
    {"role": "user", "content": "Explain the theory of relativity to a 5-year-old."},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)

# The pipeline returns the conversation with the model's reply appended; print just the reply text.
print(outputs[0]["generated_text"][-1]["content"])
Industry Usage
On-Device AI
The 2B and 9B models are optimized for running locally on laptops and desktops for privacy-preserving apps.
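As one way to fit Gemma 2 onto consumer GPUs, the sketch below loads the 2B instruction-tuned checkpoint in 4-bit precision. This is a hedged example, assuming the bitsandbytes package is installed and a CUDA-capable GPU is available; it is one option for local inference, not the only one.
Python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Sketch: 4-bit quantized loading to reduce memory use on consumer hardware.
# Assumes bitsandbytes is installed and a CUDA-capable GPU is present.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

messages = [{"role": "user", "content": "Draft a two-sentence privacy notice."}]
# apply_chat_template formats the user turn the way Gemma 2 expects.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Strip the prompt tokens and decode only the newly generated reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))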
RAG Pipelines
Used as efficient generators in Retrieval-Augmented Generation systems where speed and cost are critical.
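As a rough sketch of that pattern, the example below wires Gemma 2 in as the generator: the passages list stands in for whatever a retriever would return, and answer_with_context is a hypothetical helper rather than part of any library.
Python
from transformers import pipeline
import torch

# Same generator setup as the Hello World guide.
pipe = pipeline(
    "text-generation",
    model="google/gemma-2-9b-it",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Sketch: Gemma 2 as the generator step of a RAG pipeline.
# `passages` stands in for output from a hypothetical retriever.
def answer_with_context(question: str, passages: list[str], max_new_tokens: int = 256) -> str:
    # Gemma 2's chat template has no system role, so grounding instructions
    # go into the user turn alongside the retrieved context.
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    prompt = (
        "Answer the question using only the context below and cite passage numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    messages = [{"role": "user", "content": prompt}]
    outputs = pipe(messages, max_new_tokens=max_new_tokens)
    return outputs[0]["generated_text"][-1]["content"]

print(answer_with_context(
    "What context window does Gemma 2 support?",
    ["Gemma 2 models support a context window of 8,192 tokens."],
))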
Academic Research
A popular choice for researchers due to its accessible size and strong performance baseline.