So here’s the thing — getting WAN 2.2 running smoothly in ComfyUI was kind of a mess, at least at first. I kept running into OOM errors, especially with anything beyond 480p. But then I found this setup that packs the model, VAE, CLIP — all of it — into a single `.safetensors` file. And surprisingly, it actually worked.
This one:
👉 `wan2.2-i2v-rapid-aio.safetensors` (22GB)
It’s basically a compressed All-In-One model you drop into the ComfyUI checkpoint loader, and that’s it — you’re done. No extra CLIP node, no VAE node, nothing else to link.
Worked for me on 8GB VRAM, and I’ve seen a few others confirm it with cards like the 3060, 4060, and even 3070s. You’ll still need decent system RAM (mine’s 32GB), but if your GPU is struggling with standard WAN 2.2, this setup honestly makes a difference.
What’s Actually in This WAN 2.2 AIO Model?
From what I could dig up, this is a mixture of WAN 2.2 and other related accelerators. Everything’s fused together into one file:
- Image-to-video weights
- CLIP
- VAE
- Text encoders
- FP8 precision for lower memory usage
It’s kind of like a “best of WAN” setup, but with some version-specific quirks:
- Base: Mostly WAN 2.1. Very stable. Good starting point, but nothing special.
- V2: Adds more WAN 2.2. You’ll see slight color shifting and noise at the start of videos. Use `sa_solver` or `euler_a`.
- V3: Mix of SkyReels and WAN 2.2. Better prompt matching. Still some color shifts early on.
- V4: Adds WAN 2.2 Lightning. Color issues mostly gone. Motion feels more dramatic though.
- V5: Slight tweak to V4. Still fast, still AIO, motion slightly toned down.
You don’t need to mess with anything fancy to run these. Just use the standard Load Checkpoint node in ComfyUI and point it at the `.safetensors` file. It pulls in everything it needs.
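If you ever drive ComfyUI through its API instead of the graph editor, that loader is still just a single node. Here’s a minimal sketch of the relevant fragment in the API “prompt” format (this assumes the stock `CheckpointLoaderSimple` node, which is what “Load Checkpoint” maps to; the rest of the image-to-video graph is left out):

```python
import json

# Sketch: the one loader node the AIO file needs, in ComfyUI API "prompt" format.
# The point of the AIO build is that this single node exposes MODEL, CLIP and VAE,
# so there are no separate CLIP/VAE loader nodes to wire up.
checkpoint_fragment = {
    "1": {
        "class_type": "CheckpointLoaderSimple",  # "Load Checkpoint" in the UI
        "inputs": {"ckpt_name": "wan2.2-i2v-rapid-aio.safetensors"},
    },
    # Downstream nodes reference its outputs as [node_id, output_index]:
    #   ["1", 0] -> MODEL, ["1", 1] -> CLIP, ["1", 2] -> VAE
}

print(json.dumps(checkpoint_fragment, indent=2))
```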
Recommended Settings
A few tips that helped me avoid issues:
- Sampler: `euler_a` or `beta` seems to be the sweet spot
- CFG: Leave it at 1
- Steps: Just 4 steps total (not 8), which is weirdly fast
- Resolution: I ran 720×480 and didn’t hit any memory walls
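For reference, here’s roughly how those settings map onto the stock KSampler node. A sketch based on my own runs: “Euler a” is exposed internally as `euler_ancestral`, and the “beta” option shows up in the scheduler dropdown rather than the sampler list (at least on recent ComfyUI builds). The node references at the bottom are placeholders for your own graph.

```python
# Sketch: KSampler inputs matching the tips above (placeholder node references).
ksampler_inputs = {
    "seed": 0,
    "steps": 4,                         # 4 steps total, not 8
    "cfg": 1.0,                         # leave CFG at 1
    "sampler_name": "euler_ancestral",  # "Euler a" in the UI
    "scheduler": "beta",                # or "simple"/"normal" if you stick to euler_a alone
    "denoise": 1.0,
    "model": ["1", 0],                  # MODEL output of the AIO checkpoint loader
    "positive": ["2", 0],               # CLIPTextEncode with your prompt
    "negative": ["3", 0],               # CLIPTextEncode with the negative prompt
    "latent_image": ["4", 0],           # your (video) latent node
}
print(ksampler_inputs)
```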
You can add your LoRAs as usual. WAN 2.2 seems compatible with WAN 2.1 LoRAs too, though you might need to dial the strength up or down depending on the version. I used the lightx2v LoRA for faster gen times — here’s that one:
👉 lightx2v
So… Does It Actually Work on 8GB?
Yeah. I ran it on a 3070 with 8GB VRAM, no crashes, no memory errors. Render times per frame were around 45s with 4 steps. This isn’t the high-noise + low-noise dual pass setup WAN 2.2 usually runs, so quality is a bit softer. But it’s fun and fast and, more importantly, stable.
If you’ve ever had ComfyUI crash the second a video finishes rendering — this avoids that. No separate models, no juggling LoRAs, no manual VAE paths.
This might not be the best-quality setup, but for a lot of people — especially those on limited GPUs — it’s just a relief to have something that works out of the box.
Real-world results, Reddit reactions, and LoRA setups that work
So I started digging into what other people were saying about the WAN 2.2 AIO model on Reddit — and yeah, it’s pretty much the same story. If you’ve been struggling to get WAN 2.2 running on mid-range GPUs, this setup is what finally worked for a lot of folks.
One guy said he ran it on a 3060 (12GB) and got similar speed to Q4 WAN 2.1. Another used it on a 4060 with 32GB of system RAM, no memory crashes at all. Even someone on 8GB VRAM managed to run a full video without hitting out-of-memory errors. Which, if you’ve messed with this stuff before, sounds impossible. But that’s what’s happening.
There’s also a bit of a debate about quality.
Some people say the AIO model looks more like WAN 2.1 — kinda lower fidelity, less sharp motion. But then others are getting surprisingly clean prompt results, especially on V3 and V4 variants. I saw a few posts mention that motion is slightly overdone, especially in the lightning version (V4), but that can be balanced with the right sampler.
Someone using a 3090 mentioned adding a little bit of negative weight to the LightX2V LoRA (like `-0.6`) helped avoid overexposed, high-contrast output — especially when you’re doing straight I2V generations without the high/low noise pass.
Speed tips people are using with this WAN 2.2 AIO setup
These came straight from user comments, and they actually helped:
- Stick with Euler a or Beta sampler — people got the most stable results with those
- 4 steps is really all you need (not 8) — that’s part of what makes this fast
- Use the Lightx2v LoRA with strength around `2.5` for high noise, and around `1` for low
- You can get it running on ComfyUI’s default “Load Checkpoint” node, nothing special
- Don’t skip system RAM — 32GB RAM seems to keep everything smooth for most people
That combo — WAN 2.2 AIO + Kijai’s FP8 quant + lightx2v LoRA — seems to be the setup that worked on most 8–12GB VRAM cards.
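Wiring the LoRA in doesn’t need anything special either: the stock LoraLoader node between the AIO checkpoint loader and the sampler is enough. A sketch in the same API format, assuming the file was saved as `lightx2v.safetensors` under `ComfyUI/models/loras/` (that filename is a placeholder, use whatever you actually downloaded):

```python
# Sketch: stock LoraLoader node in API format. Since the AIO model is a single
# pass (no separate high/low noise models), one strength value is all you set here.
lora_fragment = {
    "5": {
        "class_type": "LoraLoader",
        "inputs": {
            "lora_name": "lightx2v.safetensors",  # placeholder filename under models/loras/
            "strength_model": 1.0,   # ~1 works for the single pass; some users go negative
                                     # (around -0.6) to tame overexposed, high-contrast output
            "strength_clip": 1.0,
            "model": ["1", 0],       # MODEL from the AIO checkpoint loader
            "clip": ["1", 1],        # CLIP from the same loader
        },
    }
}
```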
GGUF or not?
A few people brought up GGUF as an alternative — especially if you’re running low-end hardware or trying to push quantization. There’s a WAN 2.2 GGUF floating around, but honestly, for this particular AIO build, the regular `.safetensors` file just works.
You can always try swapping in GGUF quantized versions if you’re tight on VRAM. Some recommended using Hunyuan GGUF if you’re stuck on HuggingFace-style models, but for WAN, this AIO just simplifies everything.
The catch?
Here’s the part I didn’t expect: this AIO setup actually skips the traditional 2-pass (high + low noise) workflow. So yeah, it’s fast and runs on limited VRAM — but image quality takes a hit. You’re not getting the same crispness or dynamics as full high/low workflows with separate models.
But if your options are:
- 3-hour render times at 540p, versus
- 720p videos in 4 steps that finish in under 3 minutes
…then yeah, it’s an easy trade-off for most people.
A few Reddit highlights worth noting:
“Worked flawlessly with my 3060 12GB. I get the same speed as q4 WAN 2.1.”
— u/mk8933
“If you have something working, then that’s good. This solution is for people who are struggling to make it run.”
— OP
“ComfyUI doesn’t crash after every video render anymore. No more OOM.”
— u/Icy_Restaurant_8900
“Bro you’re doing god’s work 🙌”
— u/mk8933
“This 1 model has everything included — VAE, CLIP — just drag in the JSON and start.”
— u/mk8933 again (dude really tested it thoroughly)
And if you’re trying to run it in WSL2, someone did mention stability issues with high + low setups crashing after every video — switching to this AIO model fixed that for them.
AIO vs. full WAN 2.2 — What you gain and what you lose
Here’s the big question: Why even use WAN 2.2 AIO when the full high/low model gives you better results?
The answer is simple: speed and stability.
If you’re running a mid-range GPU — like a 3060, 4060, or even an 8GB card — the traditional WAN 2.2 setup with two separate models is a VRAM killer. You’ve got two passes (high noise and low noise), bigger samplers, more memory swaps — and it can easily crash halfway through video rendering, especially with batch nodes or long frame counts.
The AIO model solves all that by combining everything:
- No need to load separate CLIP/AutoEncoder nodes
- No need for 2-pass workflows
- No need to juggle VAE, noise schedules, LoRA strengths separately
I made a quick video tutorial showing the Qwen-image workflow inside ComfyUI; you can watch it here.
How to Set Up WAN 2.2 AIO in ComfyUI (The Right Way)
If you want to run WAN 2.2 AIO locally inside ComfyUI, the setup’s actually super simple. All you need is the right model file, the right folder structure, and a few settings in your workflow. Let’s break it down.
Which WAN AIO Model to Use?
There are two main branches:
- 👉 `wan2.2-i2v-rapid-aio.safetensors` – for Image-to-Video (I2V)
- 👉 `wan2.2-t2v-rapid-aio.safetensors` – for Text-to-Video (T2V)
Then you’ll see four versions inside folders:
V2, V3, V4, V5 — each one has slightly different behavior.
Once you choose your version, download one file only — it’s around 22GB.
Where to Put the File
Save your model into:
ComfyUI/models/checkpoints/
Then download this required extra file:
Put that into:
ComfyUI/models/clip_vision/
That’s it for files. No other components are needed — WAN AIO is all-in-one:
- No separate CLIP node
- No VAE
- No dual samplers
It’s all baked into that single `.safetensors` file.
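If you want a quick sanity check before launching ComfyUI, a few lines of Python will confirm the files landed in the right folders. A sketch assuming a local install folder literally named `ComfyUI` and the I2V variant; swap in whichever version and CLIP Vision file you actually grabbed:

```python
from pathlib import Path

# Adjust this to wherever your ComfyUI install lives.
comfy = Path("ComfyUI")

expected = [
    comfy / "models" / "checkpoints" / "wan2.2-i2v-rapid-aio.safetensors",
    comfy / "models" / "clip_vision",  # the extra CLIP Vision file goes in here
]

for path in expected:
    print(("OK      " if path.exists() else "MISSING ") + str(path))
```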
Prompt Test — Cat vs Gorillas in the Rainforest
For this one, I wanted to push the model with a weird, fantasy-style prompt — something no LoRA is trained on. No anime girls, no classic movie scenes. Just raw imagination.
Here’s the exact prompt I used:
A tiny black cat walking through a lush green rainforest.
In the distance, a group of angry gorillas watch from the trees.
The cat walks confidently, tail flicking. Slow-motion.
Light rays pierce through the mist. Everything feels cinematic.
I ran this using Rapid AIO (WAN 2.2) at 640×640, 81 frames, Euler a, `cfg=1`, and FP8 mode.
The first result?
Way better than expected.
- The cat? Clearly visible, walking frame by frame with a confident tail flick.
- The rainforest? Lush, colorful, with deep green tones and real depth.
- The mist and light rays? Absolutely nailed — it had that moody cinematic vibe.
- The gorillas? Sort of there. More like shadowy figures — not perfect, but definitely “watching.”
Final Thoughts on AIO FP8 for Fantasy Prompts
This was one of the cleanest fantasy tests I’ve done on a low-VRAM card.
No LoRA, just one prompt, pure render.
Honestly, I didn’t expect FP8 to pull this off — but it did.
It worked for me without any issues on my RTX 2080 GPU (8 GB, 64 GB RAM). Thank you very much for sharing this information and the other configuration details!