ComfyUI Video Generation
Automate AI video generation with ComfyUI and the LTX-2.3 model. Ideal for music video (MV) production, multi-scene batch rendering, and AI video content creation.
Requirements
| Item | Spec |
|---|---|
| GPU | ≥24GB VRAM (Turing/Ampere/Ada) |
| ComfyUI | 0.17+ |
| PyTorch | 2.6+cu124 |
| Access | SSH tunnel forwarding port 18188 |
Model Setup
| Model | Size | Path |
|---|---|---|
| LTX-2.3 dev (bf16) | 43GB | models/checkpoints/ltx-2.3-22b-dev.safetensors |
| Gemma 3 12B | 23GB | models/text_encoders/comfy_gemma_3_12B_it.safetensors |
| Distilled LoRA | 7.1GB | models/loras/ltxv/ltx2/ltx-2.3-22b-distilled-lora-384.safetensors |
| Video VAE (bf16) | - | models/vae/LTX23_video_vae_bf16.safetensors |
Turing GPUs (e.g., Quadro RTX 8000) do NOT support fp8_e4m3fn. Use bf16/fp16 models only.
Performance Baseline
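```
Per-step time: ~221 s (constant, independent of frame count)
15 steps: ~57 min
25 steps: ~1 h 45 min
Frame counts: 72 frames = 3 s, 121 frames = 5 s, 480 frames = 20 s (24 fps)
```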
Key insight: in these measurements, frame count did not affect total time. The bottleneck is the model's forward pass, so generation time scales with step count.
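As a rough planning aid, the baseline can be turned into a small estimator. This is only a sketch: it assumes the ~221 s per step and 24 fps figures reported in this document hold on your hardware, and the function names are illustrative. It covers sampling time only, so it comes out slightly below the measured totals, which also include model loading and decoding.

```javascript
// Assumed constants, taken from the baseline figures in this document.
const SECONDS_PER_STEP = 221; // measured sampling time per step
const FPS = 24;               // output frame rate

// Sampling time in minutes; excludes model loading and VAE decode.
function estimateSamplingMinutes(steps) {
  return (steps * SECONDS_PER_STEP) / 60;
}

// Clip duration in seconds for a given frame count.
function clipSeconds(frames) {
  return frames / FPS;
}

console.log(estimateSamplingMinutes(8).toFixed(1)); // 29.5
console.log(clipSeconds(480));                      // 20
```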
Workflow Node Reference
| Node | ID | Purpose |
|---|---|---|
| LoadImage | 2004 | I2V reference input |
| CLIPTextEncode (positive) | 2483 | Positive prompt |
| CLIPTextEncode (negative) | 2612 | Negative prompt |
| EmptyLTXVLatentVideo | 3059 | Empty latent |
| LTXVScheduler | 4966 | Steps/length params |
| LoraLoaderModelOnly | 4922+ | LoRA loader |
| SaveVideo | 4823/4852 | Output mp4 |
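To illustrate how these IDs might be used, here is a minimal sketch that patches a workflow exported in ComfyUI's API format, where the JSON is an object keyed by node ID and each node carries an `inputs` map. The input names (`image`, `text`, `length`, `steps`) and the `applyScene` helper are assumptions for illustration, not necessarily what scripts/batch_scenes.js does, so check them against your own export.

```javascript
// Node IDs from the table above.
const NODE = {
  image: "2004",
  positive: "2483",
  negative: "2612",
  latent: "3059",
  scheduler: "4966",
};

// Returns a patched deep copy; the original workflow object is left untouched.
// Scene fields that are undefined leave the existing input value as it is.
function applyScene(workflow, scene) {
  const wf = JSON.parse(JSON.stringify(workflow));
  const set = (id, key, value) => {
    if (!wf[id]) throw new Error(`node ${id} not found in workflow`);
    if (value !== undefined) wf[id].inputs[key] = value;
  };
  set(NODE.image, "image", scene.image);       // assumed input name
  set(NODE.positive, "text", scene.prompt);    // assumed input name
  set(NODE.negative, "text", scene.negative);  // assumed input name
  set(NODE.latent, "length", scene.frames);    // assumed input name
  set(NODE.scheduler, "steps", scene.steps);   // assumed input name
  return wf;
}
```

Failing loudly on a missing node ID is deliberate: IDs change whenever the workflow is re-exported or edited, and a silent skip would produce a run with stale parameters.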
Quick Start
Generate a Single Video (I2V)
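1. Load the workflow: /workspace/ComfyUI/custom_nodes/ComfyUI-LTXVideo/example_workflows/2.3/LTX-2.3_T2V_I2V_Single_Stage_Distilled_Full.json
2. Set the parameters using scripts/batch_scenes.js
3. Click Run
4. Wait about 1 hour (at 15 steps)
5. Download the result from /workspace/ComfyUI/output/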
Batch Scene Generation
Use scripts/batch_scenes.js for automation:
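```javascript
// Load the script first, then configure each scene:
await comfyui_batch.configureScene({
  name: "scene_01",
  prompt: "A lonely girl running through the rain at night, neon light reflections",
  image: "unified_ref.png",
  steps: 15,
  frames: 72
});
// Click Run, then repeat for the next scene
```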
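Because each generation should finish before the next one starts, a batch driver is essentially a sequential loop. The sketch below assumes a hypothetical `runScene` callback that configures, queues, and waits for a single scene; it is not the actual API of scripts/batch_scenes.js.

```javascript
// Runs scenes strictly one at a time so only one generation occupies VRAM.
// `runScene` is a hypothetical async callback supplied by the caller.
async function runBatch(scenes, runScene) {
  const results = [];
  for (const scene of scenes) {
    try {
      const output = await runScene(scene);
      results.push({ name: scene.name, ok: true, output });
    } catch (err) {
      // Record the failure and continue with the remaining scenes.
      results.push({ name: scene.name, ok: false, error: String(err) });
    }
  }
  return results;
}
```

Collecting failures instead of aborting means one bad scene does not cost the hours already spent on the others; the failed scenes can be re-run afterwards from the returned list.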
Step Count Guide
| Steps | Quality | Time/Scene | Use Case |
|---|---|---|---|
| 8 | Rough | ~30min | Quick preview |
| 15 | Good | ~57min | Recommended sweet spot |
| 25 | Best | ~1h45m | Final quality output |

I2V + LoRA at 15 steps reaches roughly 90% of 25-step quality with 40% fewer steps, which by the table above is about 45% less time per scene (~57min vs ~1h45m).
Troubleshooting
VAEDecode Validation Failed
Error: Exception when validating node: 'VAEDecode'
Cause: VAE load timing or insufficient VRAM
Fix: Reload the entire workflow (fetch + loadGraphData), wait for models to fully load, then run. Never reload during execution.
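The reload sequence named in the fix can be sketched as follows, for use from the browser console on the ComfyUI page. `loadGraphData` is the frontend call mentioned above; the `workflowUrl` argument, and passing `app` and `fetchFn` in as parameters, are assumptions made so the function can also be exercised outside the browser.

```javascript
// Fetches a workflow JSON and loads it into the graph before any run is queued.
// `app` is the ComfyUI frontend app object; `fetchFn` defaults to the global fetch.
async function reloadWorkflow(app, workflowUrl, fetchFn = fetch) {
  const response = await fetchFn(workflowUrl);
  if (!response.ok) {
    throw new Error(`workflow fetch failed: HTTP ${response.status}`);
  }
  const graph = await response.json();
  await app.loadGraphData(graph);
  return graph;
}
```

This covers only the reload itself. Waiting for the models to finish loading before clicking Run remains a manual step.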
Browser Tab Lost
Cause: SSH tunnel disconnected
Fix:
1. Rebuild the tunnel: ssh -f -N -L 18188:localhost:18188 user@host -p port
2. Navigate to ComfyUI
3. Reload the workflow
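To tell a dead tunnel apart from a ComfyUI problem, it can help to probe the forwarded port before reloading the page. Below is a minimal Node.js sketch, assuming the tunnel listens on local port 18188 as listed in the requirements; `isPortOpen` is an illustrative helper name.

```javascript
const net = require("net");

// Resolves true if a TCP connection to host:port succeeds within timeoutMs.
function isPortOpen(port, host = "127.0.0.1", timeoutMs = 2000) {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host });
    const finish = (open) => {
      socket.destroy();
      resolve(open);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}

isPortOpen(18188).then((open) => {
  console.log(open ? "tunnel is up" : "tunnel is down, rebuild it");
});
```

An open port only shows that something is listening locally. If the remote ComfyUI process has stopped, the SSH tunnel may still accept the connection, so treat a positive result as necessary rather than sufficient.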
Inconsistent Characters Across Scenes
Cause: Different reference images per scene
Fix: Use the SAME reference image for all scenes. Extract a clear frame from an existing video if needed. The I2V input image dictates the visual baseline.
Output Video Not Saved
Check: ssh -p PORT root@HOST "ls -lht /workspace/ComfyUI/output/*.mp4"
Fix: Check for VAEDecode errors in log, then re-run.
Monitoring Progress
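```bash
# Current sampling progress (tqdm prints s/it once a step takes longer than 1 s)
ssh -p PORT root@HOST "grep -E 'it/s|s/it' /tmp/comfy.log | tail -1"
# Completion check
ssh -p PORT root@HOST "grep 'Prompt executed' /tmp/comfy.log | tail -1"
# Output files
ssh -p PORT root@HOST "ls -lht /workspace/ComfyUI/output/*.mp4"
```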
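If a numeric progress value is more useful than a raw log line, the sampler line can be parsed. This sketch assumes the ComfyUI log contains tqdm-style progress such as `6/15 [22:06<33:09, 221.00s/it]`; the sample line and the `parseProgress` helper are illustrative, so adjust the pattern to what /tmp/comfy.log actually shows.

```javascript
// Extracts done/total and the per-step rate from a tqdm-style progress line.
// Returns null when the line does not look like a progress line.
function parseProgress(line) {
  const match = line.match(/(\d+)\/(\d+)\s*\[.*?,\s*([\d.]+)(s\/it|it\/s)\]/);
  if (!match) return null;
  const done = Number(match[1]);
  const total = Number(match[2]);
  const rate = Number(match[3]);
  // Normalise to seconds per step regardless of which unit tqdm printed.
  const secondsPerStep = match[4] === "s/it" ? rate : 1 / rate;
  const etaMinutes = ((total - done) * secondsPerStep) / 60;
  return { done, total, secondsPerStep, etaMinutes };
}

// Hypothetical sample line in tqdm's default format.
console.log(parseProgress(" 40%|████      | 6/15 [22:06<33:09, 221.00s/it]"));
```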
Best Practices
1. 15 steps is the sweet spot: I2V converges at 15-20 steps, and 25 steps gives diminishing returns
2. Unified reference image: using the same input image for all scenes keeps characters consistent
3. Reload the workflow before every run: this avoids VAEDecode validation failures
4. Never reload during execution: the current run will fail
5. Frame selection: 72 frames (3 s) for testing, 480 frames (20 s) for final output
6. VRAM management: wait for each generation to complete before starting the next
T2V vs I2V Comparison
| Mode | Steps | Quality | Notes |
|---|---|---|---|
| T2V (no LoRA) | 15 | ❌ Very blurry | Not recommended |
| I2V + LoRA | 25 | ✅ Excellent | Major quality improvement |
| I2V + LoRA | 15 | ✅ Very good | Best time/quality ratio |
Conclusion: I2V + LoRA is the recommended combination.
Resources
- scripts/batch_scenes.js: Batch scene automation
- references/workflow_nodes.md: Full node ID mapping
- references/tips.md: Prompt tips, VRAM management, optimization
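ComfyUI Video Generation
Automate AI video generation with ComfyUI and the LTX-2.3 model. Suited to music video (MV) production, multi-scene batch rendering, and AI video content creation.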
Requirements
| Item | Spec |
|---|---|
| GPU | ≥24GB VRAM (Turing/Ampere/Ada) |
| ComfyUI | 0.17 or later |
| PyTorch | 2.6+cu124 |
| Access | SSH tunnel forwarding port 18188 |
Model Setup
| Model | Size | Path |
|---|---|---|
| LTX-2.3 dev (bf16) | 43GB | models/checkpoints/ltx-2.3-22b-dev.safetensors |
| Gemma 3 12B | 23GB | models/text_encoders/comfy_gemma_3_12B_it.safetensors |
| Distilled LoRA | 7.1GB | models/loras/ltxv/ltx2/ltx-2.3-22b-distilled-lora-384.safetensors |
| Video VAE (bf16) | - | models/vae/LTX23_video_vae_bf16.safetensors |

Turing GPUs (e.g., Quadro RTX 8000) do not support the fp8_e4m3fn format. Use bf16/fp16 models only.
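Performance Baseline
```
Per-step time: ~221 s (constant, independent of frame count)
15 steps: ~57 min
25 steps: ~1 h 45 min
Frame counts: 72 frames = 3 s, 121 frames = 5 s, 480 frames = 20 s (24 fps)
```
Key finding: frame count does not affect total time. The bottleneck is the model's forward pass.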
Workflow Node Reference
| Node | ID | Purpose |
|---|---|---|
| LoadImage | 2004 | I2V reference input |
| CLIPTextEncode (positive) | 2483 | Positive prompt |
| CLIPTextEncode (negative) | 2612 | Negative prompt |
| EmptyLTXVLatentVideo | 3059 | Empty latent |
| LTXVScheduler | 4966 | Steps/length params |
| LoraLoaderModelOnly | 4922+ | LoRA loader |
| SaveVideo | 4823/4852 | Output mp4 file |
Quick Start
Generate a Single Video (I2V)
1. Load the workflow: /workspace/ComfyUI/custom_nodes/ComfyUI-LTXVideo/example_workflows/2.3/LTX-2.3_T2V_I2V_Single_Stage_Distilled_Full.json
2. Set the parameters using scripts/batch_scenes.js
3. Click Run
4. Wait about 1 hour
5. Download the result from /workspace/ComfyUI/output/
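Batch Scene Generation
Use scripts/batch_scenes.js for automation:
```javascript
// Load the script first, then configure each scene:
await comfyui_batch.configureScene({
  name: "scene_01",
  prompt: "A lonely girl running through the rain at night, neon light reflections",
  image: "unified_ref.png",
  steps: 15,
  frames: 72
});
// Click Run, then repeat for the next scene
```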
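Step Count Guide
| Steps | Quality | Time/Scene | Use Case |
|---|---|---|---|
| 8 | Rough | ~30min | Quick preview |
| 15 | Good | ~57min | Recommended sweet spot |
| 25 | Best | ~1h45m | Final quality output |

I2V + LoRA at 15 steps reaches roughly 90% of 25-step quality with 40% fewer steps, which by the table above is about 45% less time per scene (~57min vs ~1h45m).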
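Troubleshooting
VAEDecode Validation Failed
Error: Exception when validating node: 'VAEDecode'
Cause: VAE load timing or insufficient VRAM
Fix: Reload the entire workflow (fetch + loadGraphData), wait for the models to load fully, then run. Never reload during execution.
Browser Tab Lost
Cause: SSH tunnel disconnected
Fix:
1. Rebuild the tunnel: ssh -f -N -L 18188:localhost:18188 user@host -p port
2. Navigate to ComfyUI
3. Reload the workflow
Inconsistent Characters Across Scenes
Cause: A different reference image was used for each scene
Fix: Use the same reference image for all scenes. If needed, extract a clear frame from an existing video. The I2V input image sets the visual baseline.
Output Video Not Saved
Check: ssh -p PORT root@HOST "ls -lht /workspace/ComfyUI/output/*.mp4"
Fix: Check the log for VAEDecode errors, then re-run.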
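Monitoring Progress
```bash
# Current sampling progress (tqdm prints s/it once a step takes longer than 1 s)
ssh -p PORT root@HOST "grep -E 'it/s|s/it' /tmp/comfy.log | tail -1"
# Completion check
ssh -p PORT root@HOST "grep 'Prompt executed' /tmp/comfy.log | tail -1"
# Output files
ssh -p PORT root@HOST "ls -lht /workspace/ComfyUI/output/*.mp4"
```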
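Best Practices
1. 15 steps is the sweet spot: I2V converges at 15-20 steps, and 25 steps gives diminishing returns
2. Unified reference image: using the same input image for all scenes keeps characters consistent
3. Reload the workflow before every run: this avoids VAEDecode validation failures
4. Never reload during execution: the current run will fail
5. Frame selection: 72 frames (3 s) for testing, 480 frames (20 s) for final output
6. VRAM management: wait for each generation to complete before starting the next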
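T2V vs I2V Comparison
| Mode | Steps | Quality | Notes |
|---|---|---|---|
| T2V (no LoRA) | 15 | ❌ Very blurry | Not recommended |
| I2V + LoRA | 25 | ✅ Excellent | Major quality improvement |
| I2V + LoRA | 15 | ✅ Very good | Best time/quality ratio |

Conclusion: I2V + LoRA is the recommended combination.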
Resources
- scripts/batch_scenes.js: Batch scene automation
- references/workflow_nodes.md: Full node ID mapping
- references/tips.md: Prompt tips, VRAM management, optimization suggestions