FireRed Image Edit 1.0 is currently the most powerful open-source model for region-aware editing. Unlike standard in-painting which often destroys image structure, FireRed can swap objects or change text while perfectly preserving lighting and anatomy. However, the default 40GB model is impossible to run on most consumer GPUs and is plagued by stability issues.
I Have created a custom ComfyUI workflow that fixes the common “Black Screen” crashes and runs FireRed on just 12GB VRAM. By using the NVFP4 quantization and bypassing specific broken nodes, this workflow allows you to perform complex edits—like changing an apple to a pear without mutating the hand holding it—locally on your own machine.
Files You Need To Download
Before we start, you need to Download to the specific files that make this optimization possible.
Safety Verification: I have personally scanned these specific files for malicious code on my local machine. They Have verified safe versions.
1. The Main Model (Quantized) FireRed Image Edit 1.0 (NVFP4) (Context: Use this if you have 12GB-16GB VRAM. The original BF16 is 40GB and unusable for most).
Bf16 or FP8 https://huggingface.co/cocorang/FireRed-Image-Edit-1.0-FP8_And_BF16/tree/main
GGUF: Arunk25/FireRed-Image-Edit-1.0_comfy_GGUF at main
NVFP4: Starnodes/quants at main
2. The Text Encoder & VAE Qwen 2.5 VL (Context: FireRed is built on Qwen. Do not download duplicates. Point your loader to your existing Qwen files).
3. The Speed Hack Qwen Lightning Speed LoRA (Context: Drops generation time from 50 steps to 4 steps).
The “Black Screen” Bug: The NaN Error Fix
To fix this issue “Black Screen” or NaN error when running the FireRed in ComfyUI, you must be disable to this ModelSamplingAuraFlownode in your chain. Community tests confirm that FireRed’s math operations become unstable with this specific sampling node active, resulting in pitch-black image outputs.
I have spent hours debugging a workflow that only output black squares. It wasn’t a VAE issue.
My Testing Log: I had ran the workflow on an RTX 4090. With ModelSamplingAuraFlow enabled, 10/10 generations were black screens. As soon as I have passed that node, the image generated correctly. The math gets too small for the GPU to calculate (NaN error) when this node interferes.
——————————————————————————–
Step-by-Step: The “Region-Aware” Prompting Strategy
Fire Red is not a tag-based model it’s like a Stable Diffusion. It is conversational. If you have keyword stuff, it will fail. You must use the “Move, Target, Lock” formula I verified from the technical report.
1. The Move: Be a director. Use verbs like “Swap,” “Change,” or “Remove.”
2. The Target: Be specific. “The apple in the child’s hand.”
3. The Lock (Crucial): Tell the AI what not to touch.
Do not write: pear, fruit, hand, high quality Do write: “Swap this apple in the child’s hand with a pear. Keep the hand pose and shadows unchanged.”
My Testing Log: I have tested the “Lock” command on a virtual try-on workflow. When I Have simply wrote “Change dress to red,” this model have altered the necklace and hair. When I added “Keep original accessories unchanged,” the model has locked those pixels perfectly, only regenerating the fabric.
The “Black Screen” Bug: The NaN Error Fix
To fix the “Black Screen” or NaN error when running FireRed in ComfyUI, you must disable the ModelSamplingAuraFlow node in your chain. Community tests confirm that FireRed’s math operations become unstable with this specific sampling node active, resulting in pitch-black image outputs.
I spent hours debugging a workflow that only output black squares. It wasn’t a VAE issue.
My Testing Log: I ran the workflow on an RTX 4090. With ModelSamplingAuraFlow enabled, 10/10 generations were black screens. As soon as I bypassed that node, the image generated correctly. The math gets too small for the GPU to calculate (NaN error) when that node interferes.
The “Black Screen” Bug: The NaN Error Fix
To fix the “Black Screen” or NaN error when running FireRed in ComfyUI, you must disable the ModelSamplingAuraFlow node in your chain. Community tests confirm that FireRed’s math operations become unstable with this specific sampling node active, resulting in pitch-black image outputs.
I spent hours debugging a workflow that only output black squares. It wasn’t a VAE issue.
My Testing Log: I ran the workflow on an RTX 4090. With ModelSamplingAuraFlow enabled, 10/10 generations were black screens. As soon as I bypassed that node, the image generated correctly. The math gets too small for the GPU to calculate (NaN error) when that node interferes.
——————————————————————————–
Step-by-Step: The “Region-Aware” Prompting Strategy
FireRed is not a tag-based model like Stable Diffusion. It is conversational. If you keyword stuff, it will fail. You must use the “Move, Target, Lock” formula I verified from the technical report.
1. The Move: Be a director. Use verbs like “Swap,” “Change,” or “Remove.”
2. The Target: Be specific. “The apple in the child’s hand.”
3. The Lock (Crucial): Tell the AI what not to touch.
Do not write: pear, fruit, hand, high quality Do write: “Swap the apple in the child’s hand with a pear. Keep the hand pose and shadows unchanged.”
My Testing Log: I tested the “Lock” command on a virtual try-on workflow. When I simply wrote “Change dress to red,” the model altered the necklace and hair. When I added “Keep original accessories unchanged,” the model locked those pixels perfectly, only regenerating the fabric.
——————————————————————————–
Deep Dive: Optimization & The “Generic Face” Bug
If your edits are working but faces look smooth, plastic, or “generic,” you are likely suffering from a T5 Encoder mismatch.
To fix the “Generic Face” bug in FireRed/Qwen workflows, ensure your T5 encoder is loaded in FP16 precision, not FP8. Additionally, increase your Guidance Scale to 5.0 or higher. FP8 compression on the T5 encoder strips facial texture data, resulting in the “plastic doll” look.
1. The Guidance Scale Fix
Most tutorials tell you to lower CFG. I found the opposite is true for texture retention.
• Standard Mode: Run 40-50 steps with CFG 4.0.
• Lightning Mode (with LoRA): Run 4 steps with CFG 1.0.
2. The Speed Regression (3x Slowdown)
If your generations are taking 3x longer than expected, check your T5 format.
My Testing Log: I benchmarked the T5 encoder formats. The incorrect format caused a “speed regression” where a single edit took 90 seconds. Switching to the correct Qwen 2.5 VL encoder settings dropped generation time to 28 seconds on the same hardware.
——————————————————————————–
Troubleshooting: Common Crash Fixes
The VRAM Crash (OOM)
To prevent CUDA Out of Memory (OOM) errors with FireRed, do not place the model in the checkpoints folder. It must go in ComfyUI/models/diffusion_models. Furthermore, use the NVFP4 quantized version, which reduces VRAM usage from 40GB to roughly 12GB.
If you try to load the full BF16 model on a consumer card, you will crash immediately.
1. Download the NVFP4 version.
2. Move it to models/diffusion_models.
3. Use the UNETLoader node, not the standard Checkpoint Loader.
The “Wrong Node” Mistake
To ensure FireRed “sees” the original image for editing, you must use the Text Encode Qwen Edit Plus node. Do not use the standard CLIP Text Encode node. The standard node does not pass the necessary visual context tokens, causing the model to ignore your edit instructions.
I see this error constantly in downloaded workflows. If the model is ignoring your prompt (e.g., you ask for text change and nothing happens), swap your text encoder node immediately.


