Run Stable Diffusion as an API that scales to zero

Why serverless changes the GPU math completely

The stock SDXL template is fine for demos but a poor fit if you want your own art style. Lesson 3 loads a custom model.

On-demand pods are easy but expensive — you pay for every second even when idle. Serverless is the opposite: you pay only when a request is running, a...

In the RunPod console, click **Serverless** → **+ New Endpoint**. Under templates, search `Stable Diffusion XL` and pick the official RunPod SDXL temp...

Set these values:

- **GPU Type**: A100 80GB (needed for SDXL at 1024x1024)
- **Min Workers**: 0 (scale to zero when idle; this is the whole point)
- ...
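Once an endpoint with settings like these is deployed, calling it is an ordinary HTTPS request. The sketch below builds, but does not send, a request against RunPod's `/runsync` route. The endpoint ID and API key are placeholders, and the `width` and `height` input fields are assumptions: the exact input schema depends on the template's handler, so check the template's documentation before relying on them.

```python
import json
import urllib.request

RUNPOD_API_BASE = "https://api.runpod.ai/v2"


def build_runsync_request(endpoint_id, api_key, prompt, width=1024, height=1024):
    """Build a POST request for a serverless endpoint's /runsync route.

    The "width" and "height" keys are assumed, not confirmed, parts of the
    worker's input schema; "prompt" is the only field taken for granted here.
    """
    payload = {"input": {"prompt": prompt, "width": width, "height": height}}
    return urllib.request.Request(
        url=f"{RUNPOD_API_BASE}/{endpoint_id}/runsync",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder credentials: substitute your own endpoint ID and API key.
    req = build_runsync_request("YOUR_ENDPOINT_ID", "YOUR_API_KEY", "a lighthouse at dusk")
    print(req.full_url)
    # To actually send it (GPU time is billed while the request runs):
    # with urllib.request.urlopen(req, timeout=300) as resp:
    #     result = json.load(resp)
```

With Min Workers at 0, the first request after an idle period has to wait for a worker to start, so a generous client timeout is a reasonable default.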

A single H100 on AWS costs $4.50/hour and bills around the clock whether you use it or not. On RunPod serverless you pay $0.00076 per second, and only while a request is actually running. Most side projects come in under $15/month.
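To make that comparison concrete, here is a small back-of-the-envelope calculation using the two prices quoted above. The 30-day month and the sample workload (1,000 images at 15 seconds each) are illustrative assumptions, not measurements.

```python
HOURS_PER_MONTH = 24 * 30  # assumes a 30-day month

ALWAYS_ON_HOURLY = 4.50          # $/hour, always-on figure quoted above
SERVERLESS_PER_SECOND = 0.00076  # $/second, serverless figure quoted above


def always_on_monthly(hourly_rate):
    """Cost of an instance that bills every hour of the month."""
    return hourly_rate * HOURS_PER_MONTH


def serverless_monthly(requests, seconds_per_request, per_second_rate):
    """Cost when only the seconds spent serving requests are billed."""
    return requests * seconds_per_request * per_second_rate


def billed_seconds_for_budget(budget, per_second_rate):
    """How many billed seconds a monthly budget buys."""
    return budget / per_second_rate


if __name__ == "__main__":
    print(f"Always-on:  ${always_on_monthly(ALWAYS_ON_HOURLY):,.2f}/month")
    # Hypothetical workload: 1,000 images at 15 seconds each.
    print(f"Serverless: ${serverless_monthly(1000, 15, SERVERLESS_PER_SECOND):,.2f}/month")
    hours = billed_seconds_for_budget(15, SERVERLESS_PER_SECOND) / 3600
    print(f"$15 buys about {hours:.1f} hours of billed GPU time")
```

The always-on instance comes to about $3,240 per month, while a $15 serverless budget covers roughly five and a half hours of actual generation time. The per-second rate is higher than the always-on hourly rate would suggest, so serverless only wins while the endpoint sits idle most of the time.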
