The flash run command starts a local development server that lets you test your Flash application before deploying to production. The development server runs locally and updates automatically as you edit files.
When you call an @Endpoint function, Flash sends the latest function code to Serverless workers on Runpod, so your changes are reflected immediately.
Start the development server
From inside your project directory, run flash run. The server starts at http://localhost:8888 by default. Your endpoints are available immediately for testing, and @Endpoint functions provision Serverless endpoints on first call.
Using a custom host and port
Test your endpoints
Using curl
Using the API explorer
Open http://localhost:8888/docs in your browser to access the interactive Swagger UI. You can test all endpoints directly from the browser.
Using Python
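As a minimal sketch of calling the local server from Python, the helper below uses only the standard library. The endpoint path, the POST method, and the JSON request and response bodies are assumptions for illustration; check the Swagger UI for the routes and payloads your application actually exposes. The generous timeout reflects that a first call may wait on endpoint provisioning.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8888"  # default development server address


def build_request(path: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for an endpoint served by the local dev server."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{path.lstrip('/')}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the endpoint accepts POST with a JSON body
    )


def call_endpoint(path: str, payload: dict, timeout: float = 120.0) -> dict:
    """Send the request and decode a JSON response (assumed response format)."""
    request = build_request(path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call with a hypothetical path and payload (requires a running server):
# result = call_endpoint("/my-endpoint", {"prompt": "hello"})
```

Separating request construction from sending makes it easy to inspect the URL and body before any network traffic happens.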
Reduce cold-start delays
The first call to an @Endpoint function provisions a Serverless endpoint, which takes 30-60 seconds. Use --auto-provision to provision all endpoints at startup:
With --auto-provision, Flash finds your @Endpoint functions and deploys them before the server starts accepting requests. Endpoints are cached in .runpod/resources.pkl and reused across server restarts.
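To see whether an earlier run left a cache that can be reused, a check like the following may help. It is a sketch that only tests whether the file exists at the path named above; the helper names are hypothetical, and the file's internal format is not described here, so the code deliberately does not load it.

```python
from pathlib import Path

# Cache location named on this page, relative to the project directory.
CACHE_RELATIVE_PATH = Path(".runpod") / "resources.pkl"


def endpoint_cache_path(project_dir: str = ".") -> Path:
    """Return the expected cache file path for a project directory."""
    return Path(project_dir) / CACHE_RELATIVE_PATH


def has_endpoint_cache(project_dir: str = ".") -> bool:
    """Return True if the endpoint cache file exists for this project."""
    return endpoint_cache_path(project_dir).is_file()
```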
How it works
With flash run, Flash starts a local development server alongside remote Serverless endpoints.
What runs where:
| Component | Location |
|---|---|
| Development server | Your machine (localhost:8888) |
| @Endpoint function code | Runpod Serverless |
| Endpoint storage | Runpod Serverless |
Endpoints created by flash run are prefixed with live- to distinguish them from production endpoints.
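If you have a list of endpoint names from elsewhere, such as copied from the Runpod console, the prefix makes it straightforward to separate development endpoints from production ones. The sketch below assumes you already hold the names as strings; the function name and the example names are hypothetical.

```python
DEV_PREFIX = "live-"  # prefix applied to endpoints created by flash run


def split_by_environment(endpoint_names):
    """Split endpoint names into (development, production) lists by prefix."""
    development = [name for name in endpoint_names if name.startswith(DEV_PREFIX)]
    production = [name for name in endpoint_names if not name.startswith(DEV_PREFIX)]
    return development, production


# Hypothetical endpoint names for illustration.
dev, prod = split_by_environment(["live-resize-image", "resize-image", "live-embed-text"])
```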
Clean up after testing
Endpoints created by flash run persist until you delete them. To clean up:
Troubleshooting
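When the server fails to start, a useful first check is whether something is already listening on the default port. The following is a minimal sketch using Python's socket module; the function name is hypothetical, and it only probes the loopback address.

```python
import socket


def port_is_free(port: int = 8888, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when a listener accepts the connection,
        # and a nonzero error code otherwise.
        return sock.connect_ex((host, port)) != 0
```

If this returns False for port 8888, another process is using the port.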
Port already in use
Cold-start delays
Use --auto-provision to eliminate cold-start delays:
Make sure RUNPOD_API_KEY is set in your .env file or environment:
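One way to confirm where the key is defined, without printing its value, is a small check like this. It is a sketch that assumes a simple KEY=value layout in the .env file; the function name is hypothetical, and it does not reproduce how Flash itself loads the file.

```python
import os
from pathlib import Path


def find_api_key(env_file: str = ".env", name: str = "RUNPOD_API_KEY"):
    """Return "environment", "env_file", or None depending on where the key is set."""
    if os.environ.get(name):
        return "environment"
    path = Path(env_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip comments and lines that are not KEY=value pairs.
            if line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() == name and value.strip().strip("'\""):
                return "env_file"
    return None
```

A result of None means the key was found in neither place.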
Next steps
- Deploy to production when your app is ready.
- Clean up endpoints after testing.
- View the flash run reference for all options.