NVIDIA · AI GPU Review

RTX 3090

Our complete AI-workload review of the RTX 3090: VRAM analysis, Flux & ComfyUI throughput, local LLM performance, power efficiency, and workstation fit.

VRAM: 24 GB
Flux 1024: 1.6 it/s
Llama 3 70B Q4: 16 t/s
TDP: 350 W

Overview

The RTX 3090 scores 58/100 on our composite AI workload index, which combines Flux generation throughput, SDXL inference, and quantized local LLM performance. With 24 GB of VRAM and a 350 W TDP, it is positioned for serious AI creators running large workflows locally.

Pros & Cons

Pros

  • 24GB VRAM handles most Flux & SDXL workflows
  • 1.6 it/s on Flux.1 dev FP16
  • Excellent CUDA ecosystem support
  • Strong resale value

Cons

  • 350 W TDP requires a serious PSU
  • Premium pricing
  • Limited stock at MSRP

Performance benchmarks

  • Flux.1 dev FP16 · 1024² · 25 steps: 1.6 it/s
  • SDXL Base · 1024² · 20 steps: 7.4 it/s
  • Llama 3 70B · Q4_K_M · 2k ctx: 16 tok/s
  • Hunyuan Video · 720p · 5s: 0.48 it/s
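
To put the throughput figures above in power-efficiency terms, here is a minimal Python sketch that converts them into energy per image and tokens per watt-hour. It assumes the card draws its full 350 W TDP for the whole run, which is an upper bound rather than a measured figure; the function names are illustrative only. Hunyuan Video is left out because no step count is listed for it.

```python
# Rough energy estimate from the benchmark figures above.
# Assumption: the GPU draws its full TDP for the entire run (upper bound,
# not a measured value). System power (CPU, RAM, fans) is ignored.
TDP_WATTS = 350.0


def run_seconds(steps: int, its_per_sec: float) -> float:
    """Wall-clock time for one generation at a steady iteration rate."""
    return steps / its_per_sec


def energy_wh(seconds: float, watts: float = TDP_WATTS) -> float:
    """Energy in watt-hours for a run of the given length."""
    return watts * seconds / 3600.0


def tokens_per_wh(tok_per_sec: float, watts: float = TDP_WATTS) -> float:
    """Tokens generated per watt-hour at a steady token rate."""
    return tok_per_sec * 3600.0 / watts


flux_wh = energy_wh(run_seconds(25, 1.6))   # Flux.1 dev, 25 steps
sdxl_wh = energy_wh(run_seconds(20, 7.4))   # SDXL Base, 20 steps
llama_tok_wh = tokens_per_wh(16.0)          # Llama 3 70B Q4_K_M

print(f"Flux.1 dev: {flux_wh:.2f} Wh per image")
print(f"SDXL Base:  {sdxl_wh:.2f} Wh per image")
print(f"Llama 3 70B: {llama_tok_wh:.0f} tokens per Wh")
```

Real draw during inference is often below TDP, so treat these numbers as a ceiling on GPU energy use.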

VRAM analysis

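With 24 GB of VRAM, the RTX 3090 can run Flux.1 dev FP16 with all loaders resident and SDXL with multiple LoRAs. Llama 3 70B at Q4 quantization needs roughly 40 GB for its weights alone, so it does not fit entirely in VRAM and runs with part of the model offloaded to system RAM.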
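
A quick way to sanity-check what fits in 24 GB is to estimate the weight footprint as parameters × bits per weight ÷ 8. The sketch below does that in Python. The parameter counts (about 12B for the Flux.1 dev transformer, about 2.6B for the SDXL UNet, 70B for Llama 3 70B) and the roughly 4.8 bits per weight for Q4_K_M are approximate assumptions, and the estimate covers weights only: text encoders, VAE, activations, and KV cache come on top.

```python
# Weights-only VRAM estimate. All model figures below are approximate
# assumptions for illustration, not measurements from this review.
VRAM_GIB = 24.0


def weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Size of the model weights in GiB at the given precision."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


models = {
    # name: (parameters in billions, bits per weight)
    "Flux.1 dev FP16": (12.0, 16.0),
    "SDXL UNet FP16": (2.6, 16.0),
    "Llama 3 70B Q4_K_M": (70.0, 4.8),  # ~4.8 bpw is an approximation
}

for name, (params_b, bits) in models.items():
    size = weight_gib(params_b, bits)
    verdict = "fits" if size <= VRAM_GIB else "needs offload"
    print(f"{name}: {size:.1f} GiB of weights ({verdict} in {VRAM_GIB:.0f} GiB)")
```

A model whose weights exceed VRAM can still run with partial offload to system RAM, at a cost in throughput.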

Flux performance

At 1.6 it/s, a standard 25-step Flux.1 dev generation completes in ~15.6 seconds (25 steps ÷ 1.6 it/s). For batch workflows, expect total time to scale roughly linearly with image count until VRAM limits are reached.
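
The arithmetic behind that figure, extended to batches, can be sketched in a few lines of Python. This assumes a constant iteration rate and ignores model loading, text encoding, and VAE decode time; the function name is illustrative.

```python
def generation_seconds(steps: int, its_per_sec: float, images: int = 1) -> float:
    """Sampling time for `images` generations at a steady iteration rate.

    Assumes linear scaling with image count and no fixed overhead.
    """
    if its_per_sec <= 0:
        raise ValueError("its_per_sec must be positive")
    return images * steps / its_per_sec


single = generation_seconds(steps=25, its_per_sec=1.6)            # 15.625 s
batch = generation_seconds(steps=25, its_per_sec=1.6, images=8)   # 125.0 s

print(f"1 image:  {single:.1f} s")
print(f"8 images: {batch:.1f} s")
```

Under these assumptions, an eight-image queue at the review's Flux settings takes a little over two minutes of sampling time.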

ComfyUI performance

SDXL throughput of 7.4 it/s makes the RTX 3090 viable for iterative ComfyUI workflows. Block-swap and tiled VAE are rarely needed.