How to Use SCAIL-2 in ComfyUI (No Skeleton Workflow)

Esha Sharma

SCAIL-2 is a new motion-transfer tool for ComfyUI. It copies the motion from a video and applies it to a still image.

The biggest advantage is that you do not need a pose skeleton. Because there is no skeleton to fit, you can also animate non-human subjects, such as 2D illustrations and 3D models, and the movement still looks natural.

Download the Required Models

Please click these links to download the files, and place them in the exact folders listed below:

  1. Main Model:
  2. VAE Model:
  3. Text Encoder:
  4. CLIP Vision:
  5. Mask Model:
  6. Optional DPO LoRA (Fixes hands and faces):
  7. Lightx2v

Setting Up the SCAIL-2 Masking System

To get good results, SCAIL-2 depends on a strict masking system.

Download the SAM 3.1 Multiplex File

First, you need the right masking model. Download the SAM 3.1 Multiplex FP16 file. Save this file directly inside your models/checkpoints folder in ComfyUI.
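If you want to confirm the file landed in the right place before opening ComfyUI, a small script can list what is in the checkpoints folder. This is only a sketch: the function name `find_checkpoints`, the `keyword` filter, and the example install path are my own assumptions, and the real file name depends on the download you picked.

```python
from pathlib import Path
from typing import List


def find_checkpoints(comfy_root: str, keyword: str = "sam") -> List[str]:
    """List files in <comfy_root>/models/checkpoints whose name contains keyword.

    The keyword is an assumption for illustration; adjust it to match
    the actual name of the file you downloaded.
    """
    checkpoints_dir = Path(comfy_root) / "models" / "checkpoints"
    if not checkpoints_dir.is_dir():
        # Folder missing: nothing to report.
        return []
    return sorted(
        p.name
        for p in checkpoints_dir.iterdir()
        if p.is_file() and keyword.lower() in p.name.lower()
    )


if __name__ == "__main__":
    # Hypothetical install location; replace with your own ComfyUI folder.
    print(find_checkpoints("ComfyUI"))
```

An empty list means either the folder does not exist at that path or no file name matched the keyword.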

Connect the Colored Mask Nodes

Next, open your ComfyUI workflow. You need two SAM3 Video Track nodes. Connect the first node to your image, and connect the second node to your video.

Then, add the Create SCAIL-2 Colored Mask node. Send the image output into the ref_track_data input. Send the video output into the driving_track_data input. Finally, use a Load Checkpoint node to select the SAM 3.1 masking model you downloaded earlier.

How to Fix Bad Hands with a Distilled LoRA

AI video models often generate malformed hands. You can fix this by adding a distilled LoRA file to your workflow.

I downloaded the rank 128 version for my test. However, if your computer has low VRAM (like an 8GB card), you should use the rank 64 version instead. This will help prevent out-of-memory crashes.
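The rank choice above can be written down as a simple rule. This is a minimal sketch, not part of the workflow itself: the function name `choose_lora_rank` is hypothetical, and the 8 GB cutoff is my assumption based on the example card mentioned above, so treat it as a starting point rather than a tested limit.

```python
def choose_lora_rank(vram_gb: float, low_vram_threshold_gb: float = 8.0) -> int:
    """Pick the LoRA rank to download based on available VRAM.

    Returns 64 for cards at or below the threshold and 128 otherwise.
    The default threshold of 8 GB is an assumption, not a measured limit.
    """
    if vram_gb <= 0:
        raise ValueError("vram_gb must be positive")
    return 64 if vram_gb <= low_vram_threshold_gb else 128


if __name__ == "__main__":
    for vram in (8, 12, 24):
        print(f"{vram} GB -> rank {choose_lora_rank(vram)}")
```

If you still hit out-of-memory crashes with rank 128 on a larger card, dropping to rank 64 is the first thing to try.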

In my test, I used an image of a flamingo to replace a woman in a video. The LoRA worked perfectly. The flamingo even had five fingers, and the AI captured every single movement from the original video.

Generating Longer Videos with For-Loop Nodes

SCAIL-2 works best with short, four-second segments. If you try to generate a longer video in a single pass, you will get a poor result.

To make a longer video, you must add a For Loop Start node and a For Loop End node. These nodes divide your video into small, four-second segments and generate them one after another, so the animation continues safely from one segment to the next.
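To see how many passes the loop will need, you can work out the segment boundaries yourself. The sketch below only does the frame arithmetic. It assumes segments do not overlap, which may differ from what the loop nodes do internally, and the 16 fps in the usage example is an assumed frame rate, so use your own video's value.

```python
from typing import List, Tuple


def split_into_segments(
    total_frames: int, fps: int, segment_seconds: int = 4
) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, with end exclusive.

    Each segment covers at most fps * segment_seconds frames.
    The last segment is shorter when the video length is not a multiple.
    """
    if total_frames < 0 or fps <= 0 or segment_seconds <= 0:
        raise ValueError("total_frames must be >= 0; fps and segment_seconds must be > 0")
    frames_per_segment = fps * segment_seconds
    return [
        (start, min(start + frames_per_segment, total_frames))
        for start in range(0, total_frames, frames_per_segment)
    ]


if __name__ == "__main__":
    # A 10-second clip at an assumed 16 fps has 160 frames.
    print(split_into_segments(total_frames=160, fps=16))
```

For that 160-frame clip, the function returns three segments: two full ones of 64 frames each and a final one of 32 frames.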

Choosing Your Video Background

You might want to use the background from the original video instead of the background from your image. To do this, simply enable the image background remover in your workflow. This gives you exact control over your final scene.

Studied Computer Science. Passionate about AI, ComfyUI workflows, and hands-on learning through trial and error. Creator of AIStudyNow — sharing tested workflows, tutorials, and real-world experiments. Dev.to and GitHub.