← Back to Top 10
Google

Gemma 2

State-of-the-art performance in lightweight packages, built on Gemini research.

Generic Info

  • Publisher: Google
  • Release Date: June 2024
  • Parameters: 2B, 9B, 27B
  • Context Window: 8k tokens
  • License: Gemma Terms of Use
  • Key Capabilities: Efficient Inference, Reasoning, Summarization

Gemma 2 models punch above their weight class. The 9B and 27B variants offer reasoning capabilities that rival much larger models, making them perfect for deployment on consumer hardware or in cost-sensitive environments.

Hello World Guide

Run Gemma 2 with transformers.

Python
from transformers import pipeline
import torch

pipe = pipeline(
    "text-generation",
    model="google/gemma-2-9b-it",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain the theory of relativity to a 5-year-old."},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)

print(outputs[0]["generated_text"][-1])

Industry Usage

On-Device AI

The 2B and 9B models are optimized for running locally on laptops and desktops for privacy-preserving apps.

RAG Pipelines

Used as efficient generators in Retrieval-Augmented Generation systems where speed and cost are critical.

Academic Research

A popular choice for researchers due to its accessible size and strong performance baseline.