Running LTX 2.3 in ComfyUI usually causes an Out of Memory (OOM) error on an 8GB GPU. However, by changing how ComfyUI loads and processes your models, you can safely keep your VRAM under 8GB.
I have tested this exact workflow using a live performance monitor. It allows you to generate a full video in about one minute while keeping your VRAM usage safe.
Here is how you set it up.
Download sources
| File | Source |
|---|---|
| `ltx-2.3-22b-distilled-1.1-Q4_K_M.gguf` | Unsloth LTX 2.3 GGUF, distilled 1.1 Q4_K_M, about 14.2GB |
| `gemma-3-12b-it-qat-UD-Q3_K_XL.gguf` | Unsloth Gemma 3 12B QAT GGUF Q3, about 6.14GB |
| `ltx-2.3_text_projection_bf16.safetensors` | Kijai LTX 2.3 Comfy text projection, about 2.31GB |
| `LTX23_video_vae_bf16.safetensors` | Kijai LTX 2.3 Comfy VAE folder, video VAE, about 1.45GB |
| `LTX23_audio_vae_bf16.safetensors` | Kijai LTX 2.3 Comfy VAE folder, audio VAE, about 365MB |
| `ltx-2.3-spatial-upscaler-x2-1.1.safetensors` | Lightricks LTX 2.3 spatial upscaler x2 v1.1, about 996MB |
Exact folder structure
Use this:
```
ComfyUI/
└── models/
    ├── unet/
    │   └── ltx-2.3-22b-distilled-1.1-Q4_K_M.gguf
    │
    ├── text_encoders/
    │   ├── gemma-3-12b-it-qat-UD-Q3_K_XL.gguf
    │   └── ltx-2.3_text_projection_bf16.safetensors
    │
    ├── vae/
    │   └── ltx/
    │       ├── LTX23_video_vae_bf16.safetensors
    │       └── LTX23_audio_vae_bf16.safetensors
    │
    └── latent_upscale_models/
        └── ltx-2.3-spatial-upscaler-x2-1.1.safetensors
```
The Essential ComfyUI Startup Commands
Before you load any models, you must force ComfyUI into an aggressive memory-saving mode. If you skip this step, an 8GB GPU will crash immediately.
If you use the portable version of ComfyUI, right-click your `run_nvidia_gpu.bat` file and click Edit. You need to paste these exact flags right after `main.py`:
- `--lowvram`: Your most important flag. It tells ComfyUI to manage your limited video memory carefully.
- `--reserve-vram 1` (or `2`): This keeps 1 to 2 gigabytes of VRAM free for Windows, which helps prevent your entire computer from freezing. (Note: If you have a larger card and want to simulate an 8GB limit for testing, you can increase this number. For example, I tested this on a 32GB card using `--reserve-vram 24` to leave exactly 8GB for ComfyUI.)
- `--cache-lru 10`: This caps the node cache to prevent unnecessary memory spikes.
- `--preview-method taesd`: This switches the live preview to a very lightweight version. If you still get an Out of Memory error later on, you can change this to `--preview-method none` to save even more memory.
```
python main.py --lowvram --reserve-vram 1 --cache-lru 10 --preview-method taesd
```
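The reserve value above is simple arithmetic: reserve the card's total VRAM minus the budget you want ComfyUI to see. A tiny sketch of that calculation (a hypothetical helper for picking the number, nothing more):

```python
# Hypothetical helper: pick a --reserve-vram value to simulate a smaller card.
def reserve_for_budget(total_vram_gb, target_budget_gb):
    """GB to reserve so ComfyUI is left with only target_budget_gb of VRAM."""
    if target_budget_gb > total_vram_gb:
        raise ValueError("budget exceeds the card's total VRAM")
    return total_vram_gb - target_budget_gb
```

For example, `reserve_for_budget(32, 8)` gives the `24` used in the 32GB test above.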
The Best Models for a Low VRAM Workflow
You must choose the right model files to save memory. A standard setup will crash an 8GB card, so you need compressed versions.
Use the Distilled Q4_K_M Model
First, use the distilled 1.1 Q4_K_M model. This saves a massive amount of VRAM because you do not need to load an extra LoRA file.
Compress with the Gemma 3 GGUF File
Next, move to the DualCLIPLoader and load the Gemma 3 GGUF file. The GGUF version compresses a massive AI model down to just 6.14 gigabytes. This is exactly how we force the heavy text encoder to fit inside low VRAM.
You will also need the LTX 2.3 text projection file (2.31 gigabytes) and two VAE files.
Required Low VRAM Patches for ComfyUI
Even with compressed models, you need to manage your memory while the video is actively generating.
You should add a “Low VRAM Patches” group to your ComfyUI workflow. Take your loaded model and pass it through these three patches:
- Sage Attention
- Memory-Efficient Attention
- Chunk Feed-Forward
These nodes reduce your peak memory usage. Because we saved so much memory, you should also add the LTX NAG node to stop the AI from generating bad anatomy.
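Chunk Feed-Forward is a good example of how these patches work: it runs the feed-forward layer over small slices of the token sequence instead of all tokens at once, so the output is identical but the peak activation size shrinks. A minimal pure-Python sketch of the idea (toy per-token math, not the real node):

```python
def feed_forward(tokens):
    """Toy stand-in for a transformer feed-forward layer (per-token operation)."""
    return [t * 2 + 1 for t in tokens]

def chunked_feed_forward(tokens, chunk_size):
    """Apply feed_forward over slices so only chunk_size activations are live at once."""
    out = []
    for start in range(0, len(tokens), chunk_size):
        out.extend(feed_forward(tokens[start:start + chunk_size]))
    return out
```

Because the layer acts on each token independently, chunking changes the memory profile but not the result, which is why the patch is safe to apply.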
The Two-Pass Generation Trick (The Secret to 8GB)
This is the most important step. I divide the video generation into two separate passes. If you try to do it all at once, you risk a crash.
Pass 1: Safe Base Generation
Bypass the second pass and the upscaler. Set your first pass to run the first six steps.
When you hit run, the text encoder takes about 3.7GB of VRAM, and the prompt section takes about 5.6GB. During the first pass, your VRAM will spike to exactly 8.1GB. This takes about 44 seconds. After it finishes, your VRAM will drop safely back down to around 3.6GB.
Pass 2: Finishing the Video
Now, enable the second pass and hit run again.
Because you used the correct cache settings, ComfyUI will not start from the beginning. It starts exactly where the first pass ended. The second pass runs two more steps. It will take around 8GB of VRAM again and finish in about 15 seconds.
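The key to the trick is that pass 2 resumes from the cached latent of pass 1 rather than starting over, so splitting 8 steps into 6 + 2 produces the same result as one 8-step run. A toy sketch of that resumption (illustrative numbers, not the real sampler):

```python
# Toy sketch of the two-pass idea: the second pass must resume from the
# cached latent of the first pass, not from a fresh latent.
def denoise_step(latent, step):
    """Toy deterministic stand-in for one denoising step."""
    return latent * 0.9 + step

def run_steps(latent, start, end):
    for step in range(start, end):
        latent = denoise_step(latent, step)
    return latent

full = run_steps(1.0, 0, 8)          # single 8-step run
cached = run_steps(1.0, 0, 6)        # pass 1: base generation, result cached
two_pass = run_steps(cached, 6, 8)   # pass 2: resumes exactly where pass 1 ended
```

If the cache were lost between runs, pass 2 would start from scratch and the split would buy you nothing, which is why the `--cache-lru` setting matters here.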
How to Upscale Safely Without Crashing
If you want a higher resolution, do not use the built-in ComfyUI upscale section. I tested this, and the VRAM spikes to around 14GB. It will not fail outright, but it takes a very long time on an 8GB GPU.
Instead, use this better method:
- Download your generated LTX 2.3 video.
- Load that video back into a fresh ComfyUI workflow.
- Use the ‘RTX Video Super Resolution’ node.
This keeps your system fast and prevents your VRAM from overloading.
Bonus Tip: Use --preview-method none
If you still get an Out of Memory error during your generations, you can apply a simple command line trick.
Open your `run_nvidia_gpu.bat` file, click Edit, and paste `--preview-method none` after your `main.py` command. This stops ComfyUI from rendering live previews, which saves even more video memory.
