What if you could bring your child's LEGO creations to life with just a photo and some AI magic? In this DeepDive, we're combining five powerful open-source tools (WAN 2.1, ATI, CoTracker, SAM2, and VACE) to create a complete animated shot from scratch.
The mission: VFX supervisor notes included
Our VFX supervisor (7 years old) had specific requirements:
- Make the spaceship launch with stylized thruster effects
- Have the mini spacecraft detach and fly away
- Add smoke effects on the table
- Make the characters react to the launch
Challenge accepted! Let's dive into how we achieved each of these using open-source AI tools in ComfyUI.
Stage 1: WAN + ATI trajectory animation
WAN 2.1 serves as our foundation model, while ATI (Any Trajectory Instruction) lets you draw paths and watch objects follow them. The beauty is in its simplicity: sketch where things should go, and the model creates realistic motion.
Key workflow insights:
Multi-spline coordination: The updated spline editor allows multiple trajectories in one node, but they share timing. For independent timing, use separate spline editors and combine with concatenation.
Syntax matters: Single splines use one set of brackets, [{coordinates}], while multiple splines need nested brackets, [[{spline1}], [{spline2}]]. Understanding this prevents errors when combining.
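To make the bracket difference concrete, here is a minimal Python sketch of the two shapes and of combining them. It assumes each coordinate is a JSON object with "x" and "y" keys; those key names and all helper names are illustrative assumptions, not taken from the spline editor itself, so check the string your own node actually outputs.

```python
import json

def single_spline(points):
    """Serialize one trajectory as a flat list: [{...}, {...}]."""
    # "x"/"y" key names are an assumption for illustration.
    return json.dumps([{"x": x, "y": y} for x, y in points])

def as_tracks(coord_string):
    """Parse a coordinate string and always return a list of tracks."""
    data = json.loads(coord_string)
    # A flat list of dicts is a single spline; wrap it so it becomes one track.
    if data and isinstance(data[0], dict):
        return [data]
    return data

def combine_splines(*coord_strings):
    """Merge single- or multi-spline strings into one nested string."""
    tracks = []
    for s in coord_strings:
        tracks.extend(as_tracks(s))
    return json.dumps(tracks)

ship = single_spline([(100, 400), (100, 50)])   # one set of brackets
pod = single_spline([(120, 380), (300, 200)])
combined = combine_splines(ship, pod)           # nested brackets, two tracks
print(combined)
```

Normalizing everything to the nested form before merging is what avoids the classic mistake of gluing two flat lists into one long, single trajectory.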
Motion timing tricks:
- Use "speed" method for non-uniform point distribution
- Cluster control points for slower motion sections
- Adjust points_to_sample for different trajectory durations
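Why does clustering control points slow things down? The sketch below is not the spline editor's own sampling code; it is a simplified stand-in that assumes every segment between two control points gets the same share of frames. Under that assumption, short segments mean small per-frame steps. The function name is hypothetical; points_to_sample is borrowed from the tip above.

```python
def sample_by_control_points(points, points_to_sample):
    """Sample a polyline so each control-point segment takes equal time."""
    if points_to_sample < 2 or len(points) < 2:
        raise ValueError("need at least two control points and two samples")
    segments = len(points) - 1
    samples = []
    for i in range(points_to_sample):
        t = i * segments / (points_to_sample - 1)  # position in segment units
        k = min(int(t), segments - 1)              # index of the active segment
        f = t - k                                  # fraction within that segment
        (x0, y0), (x1, y1) = points[k], points[k + 1]
        samples.append((x0 + (x1 - x0) * f, y0 + (y1 - y0) * f))
    return samples

# Control points clustered near the start: slow lift-off, then a fast climb.
launch = [(100, 400), (100, 390), (100, 380), (100, 100)]
path = sample_by_control_points(launch, 7)
steps = [round(abs(b[1] - a[1]), 1) for a, b in zip(path, path[1:])]
print(steps)  # per-frame vertical distance travelled
```

The first frames move only a few pixels each, while the last two cover most of the distance, which is the slow-then-fast launch feel we were after.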
Pro tip: Prompt engineering helps guide the output. Combine it with the ATI tracks for better results.
Stage 2: CoTracker precision tracking
CoTracker gives us motion tracking without manual keyframing. For our thruster effect, we needed a single tracking point that follows the spaceship.
Implementation details:
Tracking masks: Use input masks to limit tracking to specific regions. This prevents the algorithm from getting distracted by other moving objects.
Parameter optimization:
- grid_size: Start coarse (20x20) for coverage
- confidence_threshold: 0.9 filters unreliable tracks
- min_distance: Prevents point clustering
- max_num_points: Force single-point tracking when needed
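To see how these parameters interact, here is a small Python sketch of the kind of filtering they describe. The parameter names come from the list above, but the logic is an assumption about how such filters commonly work (best-first selection with a distance check), not the CoTracker node's actual code.

```python
import math

def select_tracks(points, confidences, confidence_threshold=0.9,
                  min_distance=10.0, max_num_points=1):
    """Return indices of reliable, well-separated tracks, best first.

    points: list of (x, y) positions, one per track
    confidences: list of per-track confidence scores in [0, 1]
    """
    # Visit candidates from most to least confident.
    order = sorted(range(len(points)), key=lambda i: confidences[i], reverse=True)
    kept = []
    for i in order:
        if confidences[i] < confidence_threshold:
            break  # everything after this is even less reliable
        # Skip points that sit too close to one we already kept.
        if any(math.dist(points[i], points[j]) < min_distance for j in kept):
            continue
        kept.append(i)
        if len(kept) == max_num_points:
            break
    return kept

pts = [(0, 0), (2, 0), (50, 50), (90, 90)]
conf = [0.95, 0.99, 0.92, 0.5]
print(select_tracks(pts, conf))                    # single best point
print(select_tracks(pts, conf, max_num_points=3))  # up to three points
```

With max_num_points=1 you get exactly one anchor for the thruster effect; raising it lets nearby duplicates get pruned by min_distance and weak tracks by the threshold.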
Memory management: Enable force_offload to free VRAM after tracking completes. Essential when chaining multiple heavy operations.
Stage 3: SAM2 intelligent segmentation
Segment Anything 2 creates masks for our characters and spaceship. These masks serve dual purposes: protecting regions during inpainting and creating holdout mattes.
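If "holdout matte" is new to you, the idea fits in a few lines. This is a generic compositing sketch on tiny grayscale frames stored as nested lists, not a ComfyUI node; the function name is hypothetical. Where the matte is 1 the original pixel is held out (kept), and where it is 0 the generated pixel shows through.

```python
def composite_with_holdout(original, generated, matte):
    """Blend per pixel: matte=1 keeps the original, matte=0 takes the generated pixel."""
    return [
        [o * m + g * (1.0 - m) for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, matte)
    ]

original = [[1.0, 1.0, 1.0]]   # e.g. the LEGO character
generated = [[0.0, 0.0, 0.0]]  # e.g. the inpainted smoke
matte = [[1.0, 0.5, 0.0]]      # hard keep, soft edge, fully replaced
result = composite_with_holdout(original, generated, matte)
print(result)
```

Soft matte values between 0 and 1 give a feathered edge instead of a hard cut-out.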
Segmentation strategies:
Model selection matters: SAM2.1 performed better than base SAM2 for our LEGO scene. When one model fails, try alternatives before adding more tracking points.
Precision settings: Switching from BF16 to FP16 resolved single-frame artifacts. Small precision changes can have significant quality impacts.
Interactive refinement:
- Shift+Click: Add positive points
- Shift+Right-Click: Add negative points
- Balance point density: Too many can confuse the model
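Under the hood, point prompts for SAM-style predictors are usually passed as a list of coordinates plus a matching list of labels, with 1 for foreground and 0 for background. The helper below is a hypothetical sketch of that packing step; how the ComfyUI points editor serializes clicks is not covered by the source, so treat the shape as an assumption.

```python
def build_point_prompts(positive, negative):
    """Pack clicks into parallel coordinate and label lists."""
    coords = [list(p) for p in positive] + [list(p) for p in negative]
    # 1 marks "this belongs to the object", 0 marks "this does not".
    labels = [1] * len(positive) + [0] * len(negative)
    return coords, labels

# Two clicks on the spaceship, one on the table to exclude it.
coords, labels = build_point_prompts(
    positive=[(320, 180), (340, 220)],
    negative=[(320, 400)],
)
print(coords, labels)
```

Keeping the lists short makes it easier to see which click changed the mask when you refine.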
Stage 4: VACE dynamic inpainting
VACE brings our static reference artwork to life, transforming 2D drawings into animated fire and smoke effects.
Critical VACE insights:
Mask preparation: VACE requires 50% gray (0.5 value) for inpainting regions. Use SolidMask node to generate proper values.
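Here is one reading of that requirement as a tiny Python sketch: pixels inside the inpainting region are replaced with flat 0.5 gray, and everything else is left alone. It works on a grayscale frame stored as nested lists, and the function name and the 0.5 mask cutoff are assumptions for illustration, not part of VACE or the SolidMask node.

```python
def gray_out_region(frame, mask, gray=0.5):
    """Return a copy of frame with masked pixels replaced by flat gray."""
    return [
        # A mask value of 0.5 or more counts as "inside the region" here.
        [gray if m >= 0.5 else px for px, m in zip(f_row, m_row)]
        for f_row, m_row in zip(frame, mask)
    ]

frame = [[0.1, 0.9], [0.3, 0.7]]
mask = [[0.0, 1.0], [1.0, 0.0]]
prepared = gray_out_region(frame, mask)
print(prepared)
```

If your inpainted region comes out black or washed out, checking that the region really holds 0.5 is a good first debugging step.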
Quality optimization:
- BF16 models over FP8 for final renders
- Sage Attention for faster generation
- Adjust blocks_to_swap based on available VRAM
Reference image alignment: Ensure your drawn effects match the initial frame positioning. VACE uses this as a guide throughout the sequence.
Workflow optimization tips
Memory management:
- Organize each stage into its own group and set the groups you are not running to "never", so finished stages don't re-execute
- Save intermediate results to disk (PNG sequences preserve alpha)
- Use blocks_to_swap with VACE
- Clear model cache between major operations
Debugging strategies:
- Preview nodes at each stage to catch issues early
- Use ImageAndMaskPreview to debug mask output
- Use Preview Any to debug output from nodes
Performance tuning:
- Enable torch.compile where compatible
- Profile VRAM usage and adjust accordingly
Creative considerations
The balance between control and freedom is crucial. ATI gives precise trajectory control, but leave the model some freedom for natural behavior. VACE interprets your reference art, but prompt engineering helps guide the output.
Beyond LEGO
While we focused on toy animation, these techniques apply broadly:
- Product demonstrations with dynamic effects
- Architectural visualizations with moving elements
- Social media content with eye-catching motion...
The workflow's modular nature means you can swap components: use different inputs, masks, and inpainting techniques based on your needs.
Getting started
All workflows and input assets are available on our GitHub.
Key requirements:
- ComfyUI with latest updates
- 16GB+ VRAM recommended
- Patience for experimentation
Remember: it takes multiple attempts to create a great shot. Embrace the iterative process, and soon you'll be bringing any static scene to life with your new super (LEGO) powers.