Find More Examples on the Optimum-TPU GitHub Repository

To find the latest examples, visit the examples folder in the optimum-tpu repository on GitHub.

Text Generation

Learn how to perform efficient inference for text generation tasks:

  • Basic Generation Script (examples/text-generation/generation.py)
    • Demonstrates text generation using models like Gemma and Mistral
    • Features greedy sampling implementation
    • Shows how to use static caching for improved performance
    • Includes performance measurement and timing analysis
    • Supports custom model loading and configuration
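The core of the generation loop above is greedy sampling: at each step, run one forward pass and append the highest-probability token. As a minimal, self-contained sketch of that idea (the function and toy model below are illustrative, not the actual `generation.py` code):

```python
# Illustrative sketch of greedy decoding, the sampling strategy used in
# examples/text-generation/generation.py. Names here are hypothetical.
def greedy_generate(logits_fn, prompt, max_new_tokens, eos_id=None):
    """Greedy sampling: at each step, append the argmax token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = logits_fn(tokens)          # one forward pass over the sequence
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:               # stop early on end-of-sequence
            break
    return tokens

# Toy "model": always favors token (last_token + 1) mod vocab_size.
def toy_logits(tokens, vocab_size=5):
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]

print(greedy_generate(toy_logits, [0], 4))  # [0, 1, 2, 3, 4]
```

In the real script, the forward pass is a TPU-compiled model call, and static caching keeps the key/value cache at a fixed shape so the compiled graph can be reused across steps instead of being recompiled as the sequence grows.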

Language Model Fine-tuning

Explore how to fine-tune language models on TPU infrastructure:

  1. Interactive Gemma Tutorial (view in the docs)
    • Complete notebook showing Gemma fine-tuning process
    • Covers environment setup and TPU configuration
    • Demonstrates FSDPv2 integration for efficient model sharding
    • Includes dataset preparation and PEFT/LoRA implementation
    • Provides step-by-step training workflow

The full notebook is available at examples/language-modeling/gemma_tuning.ipynb
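The PEFT/LoRA step in the notebook rests on one idea: keep the pretrained weight matrix frozen and train only a low-rank update. A pure-Python sketch of that idea (all names and shapes below are illustrative, not the notebook's code):

```python
# Hypothetical sketch of the LoRA idea behind PEFT fine-tuning: instead of
# updating a full weight matrix W, train two small matrices B and A and
# compute W x + B (A x); only B and A are trainable.

def matvec(M, x):
    """Plain matrix-vector product over nested lists."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

d_out, d_in, r = 4, 4, 1                     # rank r << d: far fewer trainable params

W = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]  # frozen
A = [[0.5] * d_in for _ in range(r)]         # trainable, small
B = [[0.0] * r for _ in range(d_out)]        # trainable, zero-init: adapter starts inert

def lora_forward(x):
    # base path + low-rank adapter path
    return [w + b for w, b in zip(matvec(W, x), matvec(B, matvec(A, x)))]

x = [1.0, 2.0, 3.0, 4.0]
assert lora_forward(x) == matvec(W, x)       # zero-init B: output unchanged at start

print("trainable:", r * (d_out + d_in), "vs full:", d_out * d_in)
```

Because only `A` and `B` receive gradients, the optimizer state and gradient memory shrink accordingly, which is what makes fine-tuning large models fit on TPU memory alongside FSDPv2 sharding.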

  2. LLaMA Fine-tuning Guide (view in the docs)
    • Detailed guide for fine-tuning LLaMA-2 and LLaMA-3 models
    • Explains SPMD and FSDP concepts
    • Shows how to implement efficient data parallel training
    • Includes practical code examples and prerequisites

The full notebook is available at examples/language-modeling/llama_tuning.ipynb
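The data-parallel scheme the guide explains can be reduced to three steps: shard the global batch across devices, compute a local gradient on each shard, then all-reduce (average) the gradients so every replica applies the same update. A toy sketch of that flow (the functions and the scalar "model" below are illustrative, not the guide's code):

```python
# Illustrative sketch of data-parallel training, the idea behind the
# SPMD/FSDP batch sharding described in the LLaMA fine-tuning guide.
def shard_batch(batch, num_devices):
    """Split a global batch into num_devices contiguous shards."""
    per = len(batch) // num_devices
    return [batch[i * per:(i + 1) * per] for i in range(num_devices)]

def local_gradient(w, shard):
    # Toy loss 0.5 * (w - x)^2 per example -> per-example gradient (w - x).
    return sum(w - x for x in shard) / len(shard)

def all_reduce_mean(grads):
    """Stand-in for the cross-device all-reduce collective."""
    return sum(grads) / len(grads)

w = 0.0
batch = [1.0, 2.0, 3.0, 4.0]
shards = shard_batch(batch, num_devices=2)
grads = [local_gradient(w, s) for s in shards]
g = all_reduce_mean(grads)      # equals the full-batch gradient
w -= 0.1 * g                    # every replica applies the same update
print(w)  # 0.25
```

The averaged sharded gradient equals the full-batch gradient, which is why data parallelism changes throughput but not the optimization trajectory. In the real guide, the sharding and collectives are expressed through SPMD annotations rather than written by hand.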

Additional Resources

To contribute to these examples, visit our GitHub repository.