🎬 Video Generator Skill
Automated text-to-video generation system that transforms text scripts into professional short videos with AI-powered voiceover, precise timing, and cyber-wireframe visuals.
Cost: ~$0.003 per 15-second video | License: MIT | Package: openclaw-video-generator
📦 Package Information
| Property | Value |
|---|---|
| npm Package | openclaw-video-generator |
| Version | 1.6.2 |
| Repository | github.com/ZhenRobotics/openclaw-video-generator |
| Commit Hash | 6279034 |
| License | MIT |
Verification:
npm info openclaw-video-generator version repository.url
# Expected: 1.6.2 and https://github.com/ZhenRobotics/openclaw-video-generator
🔐 Provider Setup (Choose ONE)
This tool supports 4 alternative TTS/ASR providers. You only need ONE configured:
Option 1: OpenAI (Recommended)
export OPENAI_API_KEY="sk-..."
- Pros: Best quality, simple setup
- Cost: ~$0.003 per 15s video
Option 2: Azure
export AZURE_SPEECH_KEY="..."
export AZURE_SPEECH_REGION="eastasia"
- Pros: Enterprise reliability
- Cost: Similar to OpenAI
Option 3: Aliyun (阿里云)
export ALIYUN_ACCESS_KEY_ID="..."
export ALIYUN_ACCESS_KEY_SECRET="..."
export ALIYUN_APP_KEY="..."
- Pros: China connectivity, Chinese voices
- Cost: ~¥0.02 per 15s video
Option 4: Tencent (腾讯云)
export TENCENT_SECRET_ID="..."
export TENCENT_SECRET_KEY="..."
export TENCENT_APP_ID="..."
- Pros: China connectivity
- Cost: ~¥0.02 per 15s video
Why multiple providers? Fallback support for network restrictions, regional preferences, and cost optimization.
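To illustrate the "configure only one" rule, the sketch below reports the first provider whose credentials are all present in the environment. The function name `detect_provider` and the order of the checks are assumptions made for this example; the tool's real fallback logic may differ.

```bash
# Hypothetical helper: print the first provider whose environment variables are all set.
# The check order (openai, azure, aliyun, tencent) is an assumption, not the tool's documented order.
detect_provider() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "openai"
  elif [ -n "${AZURE_SPEECH_KEY:-}" ] && [ -n "${AZURE_SPEECH_REGION:-}" ]; then
    echo "azure"
  elif [ -n "${ALIYUN_ACCESS_KEY_ID:-}" ] && [ -n "${ALIYUN_ACCESS_KEY_SECRET:-}" ] && [ -n "${ALIYUN_APP_KEY:-}" ]; then
    echo "aliyun"
  elif [ -n "${TENCENT_SECRET_ID:-}" ] && [ -n "${TENCENT_SECRET_KEY:-}" ] && [ -n "${TENCENT_APP_ID:-}" ]; then
    echo "tencent"
  else
    # No provider is fully configured.
    echo "none"
    return 1
  fi
}
```

Running `detect_provider` before generating a video gives a quick hint about which credentials the shell currently exposes. A partially configured provider (for example, an Azure key without a region) is treated as not configured.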
🚀 Quick Start
Prerequisites
node --version  # requires >= 18
npm --version
ffmpeg -version
Installation
Option 1: npm Global Install
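npm install -g openclaw-video-generator@1.6.2
export OPENAI_API_KEY="sk-..."  # or add to ~/.bashrc
openclaw-video-generator --version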
Option 2: From Source
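git clone https://github.com/ZhenRobotics/openclaw-video-generator.git ~/openclaw-video-generator
cd ~/openclaw-video-generator
npm install
# Configure provider
cp .env.example .env
nano .env  # add your API keys
chmod 600 .env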
First Video
cd ~/openclaw-video-generator
cat > test.txt << 'EOF'
AI makes development easier
Saving time and boosting efficiency
EOF
./scripts/script-to-video.sh test.txt --voice nova --speed 1.15
# Output: out/test.mp4
💻 Agent Usage
When to Use
Auto-trigger when the user mentions: video, generate video, create video, 生成视频 (Chinese for "generate video")
Standard Command
cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh <script-file> \
--voice nova \
--speed 1.15
With Background Video
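cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh <script-file> \
--voice nova \
--bg-video backgrounds/tech.mp4 \
--bg-opacity 0.6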
Example Flow
User: "Generate video: AI makes development easier"
Agent:
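# 1. Check project
ls ~/openclaw-video-generator || echo "Not installed"
# 2. Create script
cat > ~/openclaw-video-generator/scripts/user-script.txt << 'EOF'
AI makes development easier
EOF
# 3. Generate
cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh scripts/user-script.txt
# 4. Show result
echo "Video: ~/openclaw-video-generator/out/user-script.mp4"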
Guidelines
Do:
- Verify project exists before running
- Check .env configuration
- Show output file location
Don't:
- Clone without user confirmation
- Hardcode API keys in commands
- Create new Remotion projects
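The "verify before running" guidelines can be sketched as a small preflight function. The name `preflight` and its messages are hypothetical; the paths it checks are the ones used elsewhere in this document. Treating a missing .env as a failure is an assumption, since credentials can also be exported in the shell.

```bash
# Hypothetical preflight: check the install directory before generating a video.
# Prints the reason and returns non-zero on the first failed check.
preflight() {
  dir=${1:-"$HOME/openclaw-video-generator"}
  if [ ! -d "$dir" ]; then
    echo "project not found: $dir"; return 1
  fi
  if [ ! -x "$dir/scripts/script-to-video.sh" ]; then
    echo "missing or non-executable scripts/script-to-video.sh"; return 1
  fi
  if [ ! -f "$dir/.env" ]; then
    echo "missing .env"; return 1
  fi
  echo "ok"
}
```

An agent could run `preflight` first and, on failure, report the printed reason to the user rather than cloning or creating files on its own.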
🎯 Core Features
- Multi-Provider TTS: OpenAI, Azure, Aliyun, Tencent with auto-fallback
- Timestamp Extraction: Precise speech-to-text segmentation
- Scene Detection: 6 intelligent scene types with auto-styling
- Video Rendering: Remotion with cyber-wireframe aesthetics
- Background Videos: Custom backgrounds with opacity control
- Local Processing: Video rendering happens on your machine
⚙️ Configuration
TTS Voices
OpenAI:
- nova (recommended), alloy, echo, shimmer
Azure:
- zh-CN-XiaoxiaoNeural, zh-CN-YunxiNeural
Speech Speed
Range: 0.25 - 4.0 | Recommended: 1.15
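A value outside this range can be caught before any API call is made. The check below is a minimal sketch; `valid_speed` is a hypothetical helper, not part of the package.

```bash
# Hypothetical pre-check: succeed only if $1 is a number within the documented 0.25 - 4.0 range.
valid_speed() {
  # Reject empty input, non-numeric characters, multiple dots, and a lone dot.
  case "$1" in
    ''|*[!0-9.]*|*.*.*|.) return 1 ;;
  esac
  # awk performs the floating-point comparison; exit status 0 means "in range".
  awk -v s="$1" 'BEGIN { exit !(s >= 0.25 && s <= 4.0) }'
}
```

For example, `valid_speed 1.15 && echo ok` prints `ok`, while `valid_speed 5` returns a non-zero status.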
Background Video
- --bg-video <path> - Video file
- --bg-opacity <0-1> - Background opacity (0 = invisible, 1 = fully opaque)
- --bg-overlay - Text overlay (an rgba color; see the table below)
Recommended:
| Use Case | Opacity | Overlay |
|---|---|---|
| Text-focused | 0.3-0.4 | rgba(10,10,15,0.6) |
| Balanced | 0.5-0.6 | rgba(10,10,15,0.4) |
| Visual-focused | 0.7-1.0 | rgba(10,10,15,0.25) |
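The recommendations above could be wrapped in a small preset helper. This is only a sketch: `bg_flags` is a hypothetical function, each opacity is the lower bound of the recommended range, and passing an rgba() string to `--bg-overlay` is assumed from the table rather than confirmed by the script's help output.

```bash
# Hypothetical helper: print background flags for one of the three use cases in the table.
# Assumes --bg-overlay accepts an rgba() string; verify against the script before relying on it.
bg_flags() {
  case "$1" in
    text)     echo '--bg-opacity 0.3 --bg-overlay rgba(10,10,15,0.6)' ;;
    balanced) echo '--bg-opacity 0.5 --bg-overlay rgba(10,10,15,0.4)' ;;
    visual)   echo '--bg-opacity 0.7 --bg-overlay rgba(10,10,15,0.25)' ;;
    *)        echo "unknown use case: $1" >&2; return 1 ;;
  esac
}
```

The output is meant to be spliced into a command line unquoted, for example `./scripts/script-to-video.sh test.txt --bg-video backgrounds/tech.mp4 $(bg_flags balanced)`.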
📊 Video Specs
- Resolution: 1080 x 1920 (vertical)
- Frame Rate: 30 fps
- Format: MP4 (H.264 + AAC)
- Style: Cyber-wireframe with neon colors
- Duration: Auto-calculated
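Remotion measures a composition in whole frames, so an auto-calculated duration ends up as a frame count at 30 fps. The arithmetic can be sketched as follows; the name `frames_for` and the choice to round up are assumptions, not taken from the project.

```bash
# Hypothetical helper: convert an audio duration in seconds to a frame count at 30 fps.
# Rounding up (so the end of the audio is not cut off) is an assumption.
frames_for() {
  awk -v s="$1" 'BEGIN { f = s * 30; n = int(f); if (n < f) n++; print n }'
}
```

For a 15-second clip, `frames_for 15` prints 450.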
🎨 Scene Types
| Type | Effect | Trigger |
|---|---|---|
| title | Glitch + scale | First segment |
| emphasis | Pop-up zoom | Numbers/percentages |
| pain | Shake + warning | Problems mentioned |
| content | Fade-in | Regular text |
| circle | Rotating ring | Listed points |
| end | Slide-up | Last segment |
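To make the trigger column concrete, here is a rough sketch of how a segment might be mapped to a scene type. The function `classify_scene`, the precedence of the rules, the list-marker pattern, and the "pain" keyword list are all illustrative guesses; the project's actual detection rules are not described in this document.

```bash
# Hypothetical classifier: map a segment to one of the six scene types.
# Usage: classify_scene <index> <total> <text>   (index is 1-based)
# Patterns and keywords below are assumptions chosen for illustration.
classify_scene() {
  idx=$1; total=$2; text=$3
  if [ "$idx" -eq 1 ]; then
    echo "title"       # first segment
  elif [ "$idx" -eq "$total" ]; then
    echo "end"         # last segment
  elif printf '%s' "$text" | grep -Eq '^([0-9]+[.)]|[-*]) '; then
    echo "circle"      # looks like a list item
  elif printf '%s' "$text" | grep -Eq '[0-9]'; then
    echo "emphasis"    # contains a number or percentage
  elif printf '%s' "$text" | grep -Eiq 'problem|error|fail|slow'; then
    echo "pain"        # mentions a problem (assumed keyword list)
  else
    echo "content"     # regular text
  fi
}
```

With these rules, `classify_scene 2 4 'Saves 50% of your time'` prints `emphasis`, and the same text passed as segment 1 prints `title` because position takes precedence.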
💰 Cost
Per 15-second video: ~$0.003 (< 1 cent)
- TTS: ~$0.001
- Whisper: ~$0.0015
- Rendering: Free (local)
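The per-video figure is the sum of the two API line items, and batch costs scale linearly. A minimal sketch of that arithmetic, using a hypothetical `estimate_cost` helper and the approximate prices listed above:

```bash
# Estimate API cost in USD for N videos of 15 seconds each, using the figures above.
# Rendering is local, so it contributes nothing.
estimate_cost() {
  awk -v n="$1" 'BEGIN { printf "%.4f\n", n * (0.001 + 0.0015) }'
}
estimate_cost 1     # prints 0.0025, which the document rounds to ~$0.003
estimate_cost 100   # prints 0.2500
```

Actual provider pricing changes over time, so treat the constants as placeholders to update.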
🔧 Troubleshooting
Project Not Found
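ls ~/openclaw-video-generator || \
git clone https://github.com/ZhenRobotics/openclaw-video-generator.git ~/openclaw-video-generator && \
cd ~/openclaw-video-generator && npm install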
API Key Error
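# Verify .env
cat ~/openclaw-video-generator/.env
# Create it if missing
cd ~/openclaw-video-generator
echo 'OPENAI_API_KEY="sk-..."' > .env
chmod 600 .env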
Provider Test
cd ~/openclaw-video-generator && ./scripts/test-providers.sh
🔒 Privacy
Local Processing:
- Video rendering
- Scene orchestration
- File management
Cloud Processing (via configured provider):
- Text-to-Speech (text sent to API)
- Speech recognition (audio sent to API)
API keys are stored in a .env file (mode 600, never committed to git).
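The 600-permission claim is easy to verify locally. The sketch below uses hypothetical helper names (`file_mode`, `env_is_private`) and tries GNU `stat` first, falling back to the BSD/macOS form.

```bash
# Print the octal permission bits of a file (GNU stat first, BSD/macOS stat as a fallback).
file_mode() {
  stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# Succeed only if the file exists and is readable/writable by its owner alone.
env_is_private() {
  [ -f "$1" ] && [ "$(file_mode "$1")" = "600" ]
}
```

For example, `env_is_private ~/openclaw-video-generator/.env || chmod 600 ~/openclaw-video-generator/.env` tightens the mode only when needed.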
📚 Documentation
- npm: https://www.npmjs.com/package/openclaw-video-generator
- GitHub: https://github.com/ZhenRobotics/openclaw-video-generator
- Issues: https://github.com/ZhenRobotics/openclaw-video-generator/issues
📊 Tech Stack
Remotion · OpenAI · Azure · Aliyun · Tencent · TypeScript · Node.js · FFmpeg
🆕 Version History
v1.6.2 (2026-03-25) - Current
- Chinese TTS integration (Aliyun)
- Dual subtitle styles
- Medical content examples
v1.6.0 (2026-03-18)
- Premium styles system
- Poster generator
- Design tokens
v1.2.0 (2026-03-07)
- Background video support
- Multi-provider architecture
- Auto-fallback
v1.0.0 (2026-03-03)
License: MIT | Author: @ZhenStaff | Support: GitHub Issues
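🎬 Video Generator Skill
Automated text-to-video generation system that turns text scripts into professional short videos with AI voiceover, precise timing, and cyber-wireframe visuals.
Cost: ~$0.003 per 15-second video | License: MIT | Package: openclaw-video-generator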
📦 Package Information
| Property | Value |
|---|---|
| npm Package | openclaw-video-generator |
| Version | 1.6.2 |
| Repository | github.com/ZhenRobotics/openclaw-video-generator |
| Commit Hash | 6279034 |
| License | MIT |
Verification:
npm info openclaw-video-generator version repository.url
# Expected: 1.6.2 and https://github.com/ZhenRobotics/openclaw-video-generator
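🔐 Provider Setup (Choose ONE)
This tool supports 4 alternative TTS/ASR providers. You only need to configure one of them:
Option 1: OpenAI (Recommended)
export OPENAI_API_KEY="sk-..."
- Pros: Best quality, simple setup
- Cost: ~$0.003 per 15s video
Option 2: Azure
export AZURE_SPEECH_KEY="..."
export AZURE_SPEECH_REGION="eastasia"
Option 3: Aliyun
export ALIYUN_ACCESS_KEY_ID="..."
export ALIYUN_ACCESS_KEY_SECRET="..."
export ALIYUN_APP_KEY="..."
- Pros: China connectivity, Chinese voices
- Cost: ~¥0.02 per 15s video
Option 4: Tencent Cloud
export TENCENT_SECRET_ID="..."
export TENCENT_SECRET_KEY="..."
export TENCENT_APP_ID="..."
- Pros: China connectivity
- Cost: ~¥0.02 per 15s video
Why multiple providers? Fallback support for network restrictions, regional preferences, and cost optimization.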
🚀 Quick Start
Prerequisites
node --version  # requires >= 18
npm --version
ffmpeg -version
Installation
Option 1: npm Global Install
npm install -g openclaw-video-generator@1.6.2
export OPENAI_API_KEY="sk-..."  # or add to ~/.bashrc
openclaw-video-generator --version
Option 2: From Source
git clone https://github.com/ZhenRobotics/openclaw-video-generator.git ~/openclaw-video-generator
cd ~/openclaw-video-generator
npm install
# Configure provider
cp .env.example .env
nano .env  # add your API keys
chmod 600 .env
First Video
cd ~/openclaw-video-generator
cat > test.txt << 'EOF'
AI makes development easier
Saving time and boosting efficiency
EOF
./scripts/script-to-video.sh test.txt --voice nova --speed 1.15
# Output: out/test.mp4
💻 Agent Usage
When to Use
Auto-trigger when the user mentions any of these keywords: video, generate video, create video, 生成视频 (Chinese for "generate video")
Standard Command
cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh <script-file> \
--voice nova \
--speed 1.15
With Background Video
cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh <script-file> \
--voice nova \
--bg-video backgrounds/tech.mp4 \
--bg-opacity 0.6
Example Flow
User: "Generate video: AI makes development easier"
Agent:
# 1. Check project
ls ~/openclaw-video-generator || echo "Not installed"
# 2. Create script
cat > ~/openclaw-video-generator/scripts/user-script.txt << 'EOF'
AI makes development easier
EOF
# 3. Generate
cd ~/openclaw-video-generator && \
./scripts/script-to-video.sh scripts/user-script.txt
# 4. Show result
echo "Video: ~/openclaw-video-generator/out/user-script.mp4"
Guidelines
Do:
- Verify project exists before running
- Check .env configuration
- Show output file location
Don't:
- Clone without user confirmation
- Hardcode API keys in commands
- Create new Remotion projects
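🎯 Core Features
- Multi-Provider TTS: OpenAI, Azure, Aliyun, Tencent Cloud with auto-fallback
- Timestamp Extraction: Precise speech-to-text segmentation
- Scene Detection: 6 intelligent scene types with auto-styling
- Video Rendering: Remotion with cyber-wireframe aesthetics
- Background Videos: Custom backgrounds with opacity control
- Local Processing: Video rendering happens on your machine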
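⚙️ Configuration
TTS Voices
OpenAI:
- nova (recommended), alloy, echo, shimmer
Azure:
- zh-CN-XiaoxiaoNeural, zh-CN-YunxiNeural
Speech Speed
Range: 0.25 - 4.0 | Recommended: 1.15
Background Video
- --bg-video <path> - Video file
- --bg-opacity <0-1> - Background opacity (0 = invisible, 1 = fully opaque)
- --bg-overlay - Text overlay
Recommended settings:
| Use Case | Opacity | Overlay |
|---|---|---|
| Text-focused | 0.3-0.4 | rgba(10,10,15,0.6) |
| Balanced | 0.5-0.6 | rgba(10,10,15,0.4) |
| Visual-focused | 0.7-1.0 | rgba(10,10,15,0.25) |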
📊 Video Specs
- Resolution: 1080 x 1920 (vertical)
- Frame Rate: 30 fps
- Format: MP4 (H.264 + AAC)
- Style: Cyber-wireframe with neon colors
- Duration: Auto-calculated
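🎨 Scene Types
| Type | Effect | Trigger |
|---|---|---|
| title | Glitch + scale | First segment |
| emphasis | Pop-up zoom | Numbers/percentages |
| pain | Shake + warning | Problems mentioned |
| content | Fade-in | Regular text |
| circle | Rotating ring | Listed points |
| end | Slide-up | Last segment |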
💰 Cost
Per 15-second video: ~$0.003 (less than 1 cent)
- TTS: ~$0.001
- Whisper: ~$0.0015
- Rendering: Free (local)
🔧 Troubleshooting
Project Not Found
ls ~/openclaw-video-generator || \
git clone https://github.com/ZhenRobotics/openclaw-video-generator.git ~/openclaw-video-generator && \
cd ~/openclaw-video-generator && npm install
API Key Error
# Verify .env
cat ~/openclaw-video-generator/.env
# Create it if missing
cd ~/openclaw-video-generator
echo 'OPENAI_API_KEY="sk-..."' > .env
chmod 600 .env
Provider Test
cd ~/openclaw-video-generator && ./scripts/test-providers.sh
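🔒 Privacy
Local Processing:
- Video rendering
- Scene orchestration
- File management
Cloud Processing (via configured provider):
- Text-to-Speech (text sent to API)
- Speech recognition (audio sent to API)
API keys are stored in a .env file (mode 600, never committed to git).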
📚 Documentation
- npm: https://www.npmjs.com/package/openclaw-video-generator
- GitHub: https://github.com/ZhenRobotics/openclaw-video-generator
- Issues: https://github.com/ZhenRobotics/openclaw-v