Gemma 2
State-of-the-art performance in lightweight packages, built on Gemini research.
General Info
- Publisher: Google
- Release Date: June 2024
- Parameters: 2B, 9B, 27B
- Context Window: 8,192 tokens (see the token-count sketch after this list)
- License: Gemma Terms of Use
- Key Capabilities: Efficient Inference, Reasoning, Summarization
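Because every Gemma 2 size shares the same 8,192-token window, it can be useful to check prompt length before generating. Below is a minimal sketch, assuming the google/gemma-2-9b-it tokenizer (all Gemma 2 checkpoints share one tokenizer); fits_in_context is a hypothetical helper, not part of any library.
Python
from transformers import AutoTokenizer

# Minimal sketch: check a prompt against Gemma 2's 8,192-token context window.
# Assumes the 9B instruction-tuned checkpoint; all Gemma 2 sizes share one tokenizer.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")

MAX_CONTEXT = 8192

def fits_in_context(prompt: str, max_new_tokens: int = 256) -> bool:
    # Reserve the generation budget so the reply is not cut off by the window.
    prompt_tokens = len(tokenizer(prompt)["input_ids"])
    return prompt_tokens + max_new_tokens <= MAX_CONTEXT

print(fits_in_context("Explain the theory of relativity to a 5-year-old."))  # True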
Gemma 2 models punch above their weight class: the 9B and 27B variants deliver reasoning performance competitive with open models well beyond their size, which makes them a strong fit for consumer hardware and cost-sensitive deployments.
Hello World Guide
Run Gemma 2 with the Hugging Face transformers library (version 4.42 or newer, which added Gemma 2 support).
Python
from transformers import pipeline
import torch

# Load the 9B instruction-tuned checkpoint; device_map="auto" places it on a GPU if available.
pipe = pipeline(
    "text-generation",
    model="google/gemma-2-9b-it",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Gemma 2 uses a chat format; the pipeline applies the chat template
# automatically when given a list of messages.
messages = [
    {"role": "user", "content": "Explain the theory of relativity to a 5-year-old."},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)

# The pipeline returns the conversation with the model's reply appended; print just the reply text.
print(outputs[0]["generated_text"][-1]["content"])
Industry Usage
On-Device AI
The 2B and 9B models are optimized for running locally on laptops and desktops for privacy-preserving apps.
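As one way to fit Gemma 2 onto consumer GPUs, the sketch below loads the 2B instruction-tuned checkpoint in 4-bit precision. This is a hedged example, assuming the bitsandbytes package is installed and a CUDA-capable GPU is available; it is one option for local inference, not the only one.
Python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Sketch: 4-bit quantized loading to reduce memory use on consumer hardware.
# Assumes bitsandbytes is installed and a CUDA-capable GPU is present.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

messages = [{"role": "user", "content": "Draft a two-sentence privacy notice."}]
# apply_chat_template formats the user turn the way Gemma 2 expects.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Strip the prompt tokens and decode only the newly generated reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))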
RAG Pipelines
Used as efficient generators in Retrieval-Augmented Generation systems where speed and cost are critical.
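As a rough sketch of that pattern, the example below wires Gemma 2 in as the generator: the passages list stands in for whatever a retriever would return, and answer_with_context is a hypothetical helper rather than part of any library.
Python
from transformers import pipeline
import torch

# Same generator setup as the Hello World guide.
pipe = pipeline(
    "text-generation",
    model="google/gemma-2-9b-it",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Sketch: Gemma 2 as the generator step of a RAG pipeline.
# `passages` stands in for output from a hypothetical retriever.
def answer_with_context(question: str, passages: list[str], max_new_tokens: int = 256) -> str:
    # Gemma 2's chat template has no system role, so grounding instructions
    # go into the user turn alongside the retrieved context.
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    prompt = (
        "Answer the question using only the context below and cite passage numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    messages = [{"role": "user", "content": prompt}]
    outputs = pipe(messages, max_new_tokens=max_new_tokens)
    return outputs[0]["generated_text"][-1]["content"]

print(answer_with_context(
    "What context window does Gemma 2 support?",
    ["Gemma 2 models support a context window of 8,192 tokens."],
))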
Academic Research
A popular choice for researchers due to its accessible size and strong performance baseline.