Flash is currently in beta. Join our Discord to provide feedback and get support.
Write @Endpoint-decorated Python functions on your local machine. Run them, and Flash automatically handles GPU/CPU provisioning and worker scaling on Runpod Serverless.
Get started
Quickstart
Write a Flash script for instant access to Runpod GPUs.
Create endpoints
Learn how to create endpoints of various types.
Examples
Browse example Flash scripts and apps on GitHub.
Setup
Install Flash
Flash requires Python 3.10, 3.11, or 3.12 (Python 3.13+ is not yet supported), and is currently available for macOS and Linux.
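As a quick pre-install check, the requirements above can be verified with a short standard-library script. This is a minimal sketch and not part of Flash: the helper name check_flash_requirements and the two constants are hypothetical, and the supported versions and platforms are copied from the sentence above.

```python
import platform
import sys

# Values taken from the stated requirements; update them if Flash's support changes.
SUPPORTED_MINORS = {10, 11, 12}          # Python 3.10, 3.11, 3.12
SUPPORTED_SYSTEMS = {"Darwin", "Linux"}  # platform.system() reports macOS as "Darwin"


def check_flash_requirements(version=None, system=None):
    """Return a list of problems; an empty list means the environment qualifies.

    Hypothetical helper for illustration. `version` is a (major, minor) tuple and
    `system` is a platform.system() string; both default to the running interpreter.
    """
    major, minor = tuple(version or sys.version_info)[:2]
    system = system or platform.system()

    problems = []
    if major != 3 or minor not in SUPPORTED_MINORS:
        problems.append(f"Python {major}.{minor} is not supported (need 3.10, 3.11, or 3.12)")
    if system not in SUPPORTED_SYSTEMS:
        problems.append(f"{system} is not supported (need macOS or Linux)")
    return problems


if __name__ == "__main__":
    issues = check_flash_requirements()
    if issues:
        for issue in issues:
            print("Problem:", issue)
    else:
        print("Environment meets the stated Flash requirements.")
```

Run the script with the interpreter you plan to install Flash into; any line starting with "Problem:" names a requirement that interpreter does not meet.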
Install Flash using pip or uv.
Authentication
Before you can use Flash, you need to authenticate with your Runpod account so that you can run @Endpoint functions.
Coding agent integration (optional)
Install the Flash skill package for AI coding agents like Claude Code, Cline, and Cursor. The skill is described in the SKILL.md file in the runpod/skills repository.
Flash apps
When you’re ready to move beyond scripts and build a production-ready API, you can create a Flash app (a collection of interconnected endpoints with diverse hardware configurations) and deploy it to Runpod. Follow this tutorial to build your first Flash app.
Flash CLI
The Flash CLI provides a set of commands for managing your Flash apps and endpoints.
Limitations
- Flash is currently only available for macOS and Linux. Windows support is in development.
- Serverless deployments using Flash are currently restricted to the EU-RO-1 datacenter.
- Flash can rapidly scale workers across multiple endpoints, and you may hit your maximum worker threshold quickly. Contact Runpod support to increase your account’s capacity if needed.
Tutorials
Flash image generation
Build a GPU-accelerated image generation service.
Flash text generation
Deploy a text generation model on Runpod.
Flash REST API
Create HTTP endpoints with load balancing.