How to Fix the Generic Face Bug in BitDance 14B & Optimize Speed

Esha Sharma
6 Min Read

If your BitDance 14B generations crash your GPU with Out of Memory (OOM) errors, your load settings are misconfigured. The default 14-billion-parameter model is massive, but I quantized it to FP8 and built a custom ComfyUI node so it runs smoothly on consumer hardware. Here is the exact workflow I use to fix the bugs, stop the crashes, and get near-BF16 quality locally.

Files to Download (Safety First)

Custom Node: ComfyUI-BitDance

    ◦ Context: This is the custom loader node I built with advanced memory logic to prevent OOM crashes on low VRAM GPUs. You must install this for the workflow to function.

    ◦ Download / Install: Clone my open-source code on GitHub here

File Name: BitDance_14B_MainModel_FP8.safetensors

    ◦ Context: The main diffusion model, compressed to FP8 to save VRAM. Place this in ComfyUI/models/diffusion_models/.

    ◦ Safety Check: I have scanned this locally. Safe to use.

    ◦ Download: https://huggingface.co/comfyuiblog/BitDance-14B-64x-fp8-comfyui/blob/main/BitDance_14B_MainModel_FP8.safetensors

File Name: BitDance_TextEncoder_FP8.safetensors

    ◦ Context: The required language backbone for text processing. Place this in ComfyUI/models/text_encoders/.

    ◦ Safety Check: I have scanned this locally. Safe to use.

    ◦ Download: https://huggingface.co/comfyuiblog/BitDance-14B-64x-fp8-comfyui/blob/main/BitDance_TextEncoder_FP8.safetensors

File Name: BitDance_VAE_FP16.safetensors

    ◦ Context: The VAE for decoding the binary tokens. Place this in ComfyUI/models/vae/.

    ◦ Safety Check: I have scanned this locally. Safe to use.

    ◦ Download: https://huggingface.co/comfyuiblog/BitDance-14B-64x-fp8-comfyui/blob/main/BitDance_VAE_FP16.safetensors

Step-by-Step BitDance Workflow Setup

I built advanced logic into the BitDance Loader node to protect your hardware. Follow these exact settings to get it running smoothly:

  1. Configure the BitDance Loader: Select your three downloaded files inside the loader. You must set the Quantization dropdown to fp8_e4m3fn_scaled to use the optimized FP8 format.
  2. Optimize Memory (VRAM): Manage your memory with the load_device, text_encoder_load_device, and vae_load_device settings.

    ◦ My Testing Log: I found that explicitly setting these components to ‘offload’ moves them off the GPU until they are needed, stopping your computer from running out of memory during generation.

  3. Dial in the Sampler Settings: Set num_sampling_steps between 20 and 50, and lock your CFG (classifier-free guidance) at 7.5.
  4. Select the Correct Sampler: You must use the euler_maruyama sampler.

    ◦ Why this matters: I found that standard samplers fail. Because BitDance embeds its massive binary tokens onto a continuous hypercube and is trained with a diffusion velocity-matching objective, its reverse process has to be integrated with an Euler–Maruyama-style solver to decode the tokens correctly.

  5. Write Natural Language Prompts: Skip tag-style prompt keywords. BitDance is a pure text-to-image model built on a massive language backbone, so just use clear, everyday English with specific details (e.g., “A cinematic portrait of a beautiful Chinese woman with wind-blown black hair”).
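The euler_maruyama name refers to the standard Euler–Maruyama scheme for integrating stochastic differential equations. As a generic illustration of what such a solver does per step (not BitDance's actual sampler code; the function name and the toy drift/diffusion are my own), here is a minimal NumPy sketch:

```python
import numpy as np

def euler_maruyama(f, g, x0, t0=0.0, t1=1.0, n_steps=50, rng=None):
    """Integrate dx = f(x, t) dt + g(t) dW with the Euler-Maruyama scheme."""
    rng = np.random.default_rng(0) if rng is None else rng
    x, t = np.asarray(x0, dtype=float), t0
    dt = (t1 - t0) / n_steps
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape) * np.sqrt(dt)  # dW ~ N(0, dt)
        x = x + f(x, t) * dt + g(t) * noise
        t += dt
    return x

# Sanity check: with zero diffusion (g = 0) the scheme reduces to plain Euler
# on the ODE dx/dt = -x, so x(1) should land near x0 * exp(-1).
x1 = euler_maruyama(lambda x, t: -x, lambda t: 0.0, x0=np.ones(4))
```

With 20–50 steps (the range recommended above), the per-step discretization error stays small enough for the decode to stay stable.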

Troubleshooting & Workflow Optimization

How to Fix the “Generic Face” Bug

To fix the BitDance “Generic Face” bug, increase the Guidance Scale to 7.5 and ensure your text encoder is set to FP16. Using FP8 on the text encoder causes a known regression in facial detail, resulting in a smooth, plastic appearance.

Step 1: Locate the BitDance Text Encoder node in your workflow.

Step 2: Swap your loaded model from the FP8 version to the FP16 .safetensors version.

Step 3: Lock your CFG (Guidance Scale) on the sampler to exactly 7.5.

My Testing Log: I ran this on an RTX 3090. Using the FP8 text encoder for detailed portraits stripped away micro-details and film grain. Switching the text encoder to FP16 and locking the CFG to 7.5 instantly restored high-fidelity skin textures and pores.
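The precision gap behind this regression is easy to show numerically. The sketch below is plain NumPy, not the node's actual code: it mimics FP16 (10 mantissa bits) and FP8 e4m3 (3 mantissa bits) by rounding the mantissa, and compares the round-off error each format introduces on small, encoder-scale weights:

```python
import numpy as np

def round_mantissa(x, bits):
    """Keep only `bits` mantissa bits, mimicking a lower-precision float format."""
    m, e = np.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    m = np.round(m * 2**bits) / 2**bits
    return np.ldexp(m, e)

rng = np.random.default_rng(0)
w = rng.standard_normal(4096) * 0.05                 # toy "text encoder" weights
err_fp16 = np.abs(w - round_mantissa(w, 10)).max()   # FP16: 10 mantissa bits
err_fp8  = np.abs(w - round_mantissa(w, 3)).max()    # FP8 e4m3: 3 mantissa bits
```

The FP8 round-off error comes out roughly two orders of magnitude larger, which is consistent with fine conditioning detail (pores, film grain) getting flattened when the text encoder runs in FP8.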

How to Fix the Speed Regression

To fix the BitDance speed regression, change your Attention Mode from ‘Eager’ to ‘SDPA’ or ‘Flash_Attn_2’. A format mismatch in the text encoder or using unoptimized attention forces the GPU into fallback operations, making generations run three times slower.

Step 1: Open the BitDance Loader node settings.

Step 2: Locate the attention_mode dropdown menu.

Step 3: Change the setting from eager (the default) to sdpa (Scaled Dot Product Attention).

My Testing Log: I tested different attention routing on a 24GB card. Leaving the attention mode on default/eager caused a massive speed regression. Switching the node to sdpa fixed the bottleneck and cut generation times down by 60%.
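For intuition, eager attention and PyTorch's scaled_dot_product_attention compute the same math; SDPA just routes it through fused kernels (Flash or memory-efficient attention where available), which is where the speedup comes from. A generic PyTorch illustration, not the node's internals:

```python
import torch
import torch.nn.functional as F

# "Eager" attention materializes the full score matrix in memory.
def eager_attention(q, k, v):
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

q, k, v = (torch.randn(1, 4, 16, 32) for _ in range(3))  # batch, heads, seq, dim

out_eager = eager_attention(q, k, v)
# SDPA: same result, but dispatched to a fused kernel when one is available.
out_sdpa = F.scaled_dot_product_attention(q, k, v)
```

Because the outputs match numerically, switching the dropdown from eager to sdpa is a pure speed win with no quality cost.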

How to Prevent OOM (Out of Memory) Errors

To prevent CUDA Out of Memory errors with BitDance, use the model_to_offload input on the Text Encode node. This automatically moves the massive 14B diffusion model off your GPU before processing text, keeping total usage under the 12GB VRAM limit.

Step 1: Find your main BitDance Loader node on the canvas.

Step 2: Drag a connection from the main model output directly to the model_to_offload pin on your Text Encode node.

Step 3: Ensure your batch_size on the Empty Latent node is strictly set to 1.

My Testing Log: I ran this on an RTX 3090. Attempting to encode text while the 14B model was in memory caused an immediate OOM crash. Connecting the main model to the model_to_offload pin successfully cleared the VRAM just before text encoding, preventing the error entirely. Setting the batch size to 4 also caused an OOM error, but batch size 1 worked flawlessly.
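The model_to_offload idea can be sketched in a few lines of PyTorch. This is a hedged illustration of the pattern, not the node's real API (the function and argument names here are made up): park the big diffusion model on the CPU before text encoding, so the text encoder has the GPU to itself, then bring it back for sampling.

```python
import torch

def encode_with_offload(text_encoder, diffusion_model, tokens, device="cuda"):
    """Run text encoding with the diffusion model temporarily offloaded to CPU."""
    diffusion_model.to("cpu")              # free the VRAM held by the 14B model
    if device == "cuda" and torch.cuda.is_available():
        torch.cuda.empty_cache()           # hand the freed blocks back to the driver
    cond = text_encoder(tokens)            # text encoder now fits comfortably
    # Restore the diffusion model for the sampling stage that follows.
    diffusion_model.to(device if torch.cuda.is_available() else "cpu")
    return cond
```

The sequencing is the whole trick: at no point do the 14B diffusion weights and the text encoder's working memory have to coexist on the GPU.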

Workflow

https://aistudynow.com/wp-content/uploads/2026/02/BitDance_Comfyui_workflow.json

Studied Computer Science. Passionate about AI, ComfyUI workflows, and hands-on learning through trial and error. Creator of AIStudyNow — sharing tested workflows, tutorials, and real-world experiments. Dev.to and GitHub.