Generate videos via ZhipuAI CogVideoX-3. Each API call produces ~5s of video.
For longer videos, chain multiple calls using last-frame continuation, then concatenate.
Scripts
All scripts use /opt/anaconda3/bin/python3. Resolve <skill-dir> to this skill's directory.
| … | Extract last frame from a video (for continuation) |
| scripts/concat_videos.py | Concatenate multiple video segments into one |
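
The internals of these scripts are not shown here. As a rough sketch only, assuming they wrap the ffmpeg command-line tool (an assumption; the source does not say so), the two operations could be expressed as the commands built below. The function names are hypothetical and are not part of the skill.

```python
from __future__ import annotations


def last_frame_cmd(video: str, image: str) -> list[str]:
    # Assumption: ffmpeg is installed. Seek to 1 s before the end of the input,
    # then let every decoded frame overwrite the same image file, so the file
    # left on disk is the final frame of the video.
    return ["ffmpeg", "-y", "-sseof", "-1", "-i", video,
            "-update", "1", "-q:v", "2", image]


def concat_list_text(segments: list[str]) -> str:
    # Text of the list file read by ffmpeg's concat demuxer, one entry per segment.
    return "".join(f"file '{path}'\n" for path in segments)


def concat_cmd(list_file: str, output: str) -> list[str]:
    # Stream copy without re-encoding; this only works when all segments
    # share the same codec, resolution, and frame rate.
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", output]
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. The extracted image would then serve as the starting frame for the next segment.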
Workflow
Step 1: Assess Request & Clarify
Clear request → proceed to Step 2. A request is clear when:
- Video content/scene is described with enough detail
- Style or visual tone is specified or implied
- Duration is stated (default: 5s if not specified)
Vague request → propose a plan first:
CODEBLOCK0
Iterate with the user until confirmed.
Step 2: Estimate Time & Notify User
Before starting generation, calculate and report the estimated time:
Time estimation formula:
- Base: 1 minute per second of video (e.g., 20s video ≈ 20 minutes)
- High-definition (4K or 60fps): add 30% to the base time (e.g., 20s 4K video ≈ 26 minutes)
- Additional overhead: ~2 minutes for frame extraction, concatenation, and compression
- Segments: ceil(target_duration / 5)
MUST send this message to the user before starting generation:
CODEBLOCK1
Example for a 30s 1080P video:
- 6 segments, base time = 30 minutes, +2 min overhead → ~32 minutes
- Message: "预计总耗时:约 32 分钟" (in English: "Estimated total time: about 32 minutes")
Example for a 20s 4K video:
- 4 segments, base time = 20 * 1.3 = 26 min, +2 min → ~28 minutes
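
The estimation rules above can be sketched as a small helper. This is a minimal illustration, not the skill's own code: the function name `estimate` and the `high_definition` flag are assumptions, while the constants (5s per segment, 1 minute per second of video, 30% surcharge, 2 minutes of overhead) come from the formula in this step.

```python
import math

SEGMENT_SECONDS = 5      # each API call produces ~5s of video
OVERHEAD_MINUTES = 2     # frame extraction, concatenation, compression
HD_MULTIPLIER = 1.3      # 30% surcharge for 4K or 60fps


def estimate(target_seconds: int, high_definition: bool = False) -> tuple[int, int]:
    """Return (segment count, estimated total minutes)."""
    segments = math.ceil(target_seconds / SEGMENT_SECONDS)
    base_minutes = float(target_seconds)  # 1 minute per second of video
    if high_definition:
        base_minutes *= HD_MULTIPLIER
    return segments, round(base_minutes + OVERHEAD_MINUTES)


print(estimate(30))                        # (6, 32)
print(estimate(20, high_definition=True))  # (4, 28)
```

The two printed results match the 30s 1080P and 20s 4K examples worked out above.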
Step 3: Plan Generation Segments
Each API call produces ~5 seconds. Calculate segments: ceil(target_duration / 5)
For multi-segment videos, plan how the content evolves across segments. Write a prompt for each segment describing what happens in that 5-second window, maintaining visual continuity.
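
One way to keep the plan explicit is to pair each prompt with its 5-second window before any generation starts. The sketch below is illustrative only: `Segment` and `plan_segments` are hypothetical names, and it assumes the caller has already written one prompt per segment.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    index: int    # 1-based position in the final video
    start: int    # window start, in seconds
    end: int      # window end, in seconds
    prompt: str   # what happens during this window


def plan_segments(target_seconds: int, prompts: list[str],
                  segment_seconds: int = 5) -> list[Segment]:
    count = math.ceil(target_seconds / segment_seconds)
    if len(prompts) != count:
        raise ValueError(f"expected {count} prompts, got {len(prompts)}")
    return [
        Segment(i + 1, i * segment_seconds, (i + 1) * segment_seconds, prompt)
        for i, prompt in enumerate(prompts)
    ]
```

Because every call yields a full ~5s clip, a 12s target still plans three windows (0-5, 5-10, 10-15); trimming the final clip, if wanted, would be a separate step.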
Step 4: Execute Generation with Progress Reports
CRITICAL: After each segment completes, IMMEDIATELY send a progress message to the user before starting the next segment. Do not wait until all segments are done.
Progress message format (send via message tool or inline reply after each segment):
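
As an illustration, a message of this kind could be assembled as follows. This is a sketch, assuming the message carries the completion count, a segment description, and elapsed and remaining time. The name `format_progress` and the exact wording are hypothetical, and the remaining time is extrapolated from the average time per finished segment.

```python
def format_progress(done: int, total: int, description: str,
                    elapsed_minutes: float) -> str:
    """Build a one-line progress update after a segment finishes."""
    if not 1 <= done <= total:
        raise ValueError("done must be between 1 and total")
    # Extrapolate: average minutes per finished segment times segments left.
    remaining_minutes = (total - done) * (elapsed_minutes / done)
    return (
        f"Segment {done}/{total} done: {description} | "
        f"elapsed ~{elapsed_minutes:.0f} min, "
        f"remaining ~{remaining_minutes:.0f} min"
    )


print(format_progress(2, 6, "sunrise over the lake", 10.0))
# Segment 2/6 done: sunrise over the lake | elapsed ~10 min, remaining ~20 min
```

Calling it right after each segment returns, and before the next API call starts, matches the requirement to report progress immediately.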
**This update introduces user progress notifications and time estimates for multi-segment video generation.**
- Added required time estimation and user notification before starting video generation, with detailed guidelines.
- Introduced progress messages after each segment completes, including completion count, segment description, elapsed and remaining time.
- Increased default timeout for segment generation from 600s to 900s for improved reliability.
- Included new steps for file size handling: recommend compressing final video if it exceeds messaging platform limits.
- Improved instructions for user communication and error handling throughout the workflow.