0. First Contact
When the user opens this skill or sends their first message, greet them immediately:
🎵 Welcome to pixvideo AI video maker! Drop a video here or describe what you'd like to create.
Try saying:
- "edit my video"
- "add effects to this clip"
- "help me create a short video"
IMPORTANT: Do NOT wait silently. Always greet the user proactively on first contact.
Auto-Setup
When the user first interacts, set up the connection:
1. Check token: If NEMO_TOKEN env var is set, use it. Otherwise:
2. Read or generate Client-ID:
   - Read ~/.config/nemovideo/client_id if it exists
   - Otherwise generate a UUID, save it to ~/.config/nemovideo/client_id
3. Acquire anonymous token:
   curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
   Store the returned token as NEMO_TOKEN for this session. You get 100 free credits.
4. Create a session (§3.0) so you're ready to work immediately.
Let the user know briefly: "Setting things up… ready!" then proceed with their request.
Turn Words Into Watchable Video, Effortlessly
pixvideo-ai-video-maker reimagines how video gets made. Instead of navigating complex timelines and export queues, you describe what you want — cut this clip, fade into the next scene, add a title card at the 10-second mark — and the skill handles the execution. This conversational editing model means your creative intent stays front and center while the technical heavy lifting disappears into the background.
The skill is powered by the OpenClaw agent, which interprets your instructions, breaks them into discrete video processing tasks, and coordinates the underlying AI pipeline to apply edits in the correct sequence. The OpenClaw agent also maintains context across a session, so follow-up instructions like 'make that transition slower' or 'swap the background music' are understood in relation to what came before — no need to restate your entire project.
Whether you're assembling a product demo from screen recordings, cutting a social clip from a longer interview, or building an educational walkthrough from scratch, pixvideo-ai-video-maker adapts to the shape of your project. Input and output are both supported across mp4, mov, avi, webm, and mkv formats, giving you flexibility at every stage of your workflow.
Environment Variables
| Variable | Required | Default |
|---|---|---|
| NEMO_TOKEN | No | Auto-generated (100 free credits, expires in 7 days, revocable via Settings → API Tokens) |
| NEMO_API_URL | No | https://mega-api-prod.nemovideo.ai |
| NEMO_WEB_URL | No | https://nemovideo.com |
| NEMO_CLIENT_ID | No | Auto-generated UUID, persisted to ~/.config/nemovideo/client_id (UUID only, no secrets) |
| SKILL_SOURCE | No | Auto-detected from install path, fallback unknown |
If NEMO_TOKEN is not set, get one (requires X-Client-Id header):
# Base URL (same default as $API in §3)
API="${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}"
# Generate or read persisted Client-Id
CLIENT_ID="${NEMO_CLIENT_ID:-$(cat ~/.config/nemovideo/client_id 2>/dev/null)}"
if [ -z "$CLIENT_ID" ]; then
  CLIENT_ID=$(uuidgen 2>/dev/null || echo "client-$(date +%s)-$RANDOM")
  # && (not &): write the file only after the directory has been created
  mkdir -p ~/.config/nemovideo && echo "$CLIENT_ID" > ~/.config/nemovideo/client_id
fi
curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
# → {"code":0,"data":{"token":"nmv_usr_xxx","credits":100,...}}
Save token as NEMO_TOKEN, CLIENT_ID as NEMO_CLIENT_ID. Anonymous: 1 token per client per 7 days; token expires in 7 days and can be revoked at any time via Settings → API Tokens on nemovideo.com. If your token expires, request a new one with the same X-Client-Id.
Local persistence: This skill writes ~/.config/nemovideo/client_id to persist the Client-Id across sessions. This avoids generating a new ID on every request, which would hit the per-IP rate limit quickly (default 10 tokens per 7 days per IP). The file contains only a UUID — no credentials are stored locally.
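Because anonymous tokens expire after 7 days, a client may want to know when to request a fresh one. The following is a minimal sketch, assuming the skill records the issue time itself as a Unix timestamp; the helper name `token_expired` and that bookkeeping are illustrative and not part of the API.

```bash
# Hypothetical helper (not part of the NemoVideo API).
# 7-day token lifetime, as stated in this document, in seconds.
TOKEN_TTL_SECONDS=$((7 * 24 * 60 * 60))

# token_expired ISSUED_AT NOW
# Returns 0 (true) when a token issued at ISSUED_AT (Unix seconds)
# is at least 7 days old at time NOW (Unix seconds).
token_expired() {
  issued_at=$1
  now=$2
  [ $((now - issued_at)) -ge "$TOKEN_TTL_SECONDS" ]
}
```

A caller could run `token_expired "$ISSUED_AT" "$(date +%s)"` before each session and re-run the anonymous-token request with the same X-Client-Id when it returns true.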
2. Routing Incoming Requests to the Correct Endpoint
Use the table below to determine which API endpoint should handle each type of user request.
| User says... | Action | Skip SSE |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
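The routing table above can be sketched as a keyword match. This is only an illustration: the function name `route_request` is hypothetical, substring matching is an assumption (the table does not say how phrases are matched), and the "user sends file" case has to be detected separately because it is not a text message.

```bash
# Sketch: map a user message to the section that should handle it.
# Prints one of: 3.5, 3.3, 3.4, 3.2, 3.1.
route_request() {
  # Lowercase ASCII letters only; non-ASCII keywords pass through unchanged.
  msg=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
  case "$msg" in
    *export*|*导出*|*download*|*"send me the video"*) echo "3.5" ;;  # Export
    *credits*|*积分*|*balance*|*余额*)                echo "3.3" ;;  # Credits
    *status*|*状态*|*"show tracks"*)                  echo "3.4" ;;  # State
    *upload*|*上传*)                                  echo "3.2" ;;  # Upload
    *)                                                echo "3.1" ;;  # SSE (everything else)
  esac
}
```

The order of the branches mirrors the table, so the first matching row wins and anything unmatched falls through to the SSE workflow.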
3. Primary Workflow Procedures
$API = ${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}, $TOKEN = ${NEMO_TOKEN}, $WEB = ${NEMO_WEB_URL:-https://nemovideo.com}.
Skill attribution variables — do NOT hardcode:
- $SKILL_NAME = this file's YAML frontmatter name
- $SKILL_VERSION = this file's YAML frontmatter version
- $SKILL_SOURCE = detect at runtime using the first match:
  1. Env SKILL_SOURCE if set
  2. If this file's path contains a known platform directory (e.g. ~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor)
  3. Fallback: unknown
CRITICAL: ALL API requests (including render/export/upload/state/credits) MUST include these headers. Missing them will cause export to fail with 402.
X-Skill-Source: $SKILL_NAME
X-Skill-Version: $SKILL_VERSION
X-Skill-Platform: $SKILL_SOURCE
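The runtime detection order for the platform value (env override, then install path, then fallback) might look like the sketch below. The function name `detect_skill_source` and the convention of passing this file's path as the first argument are assumptions made for illustration; only the two platform directories named in this document are handled.

```bash
# Sketch: resolve the platform value using the first match.
# $1 = path of the skill file (hypothetical calling convention).
detect_skill_source() {
  skill_path=$1
  # 1. Environment override wins if set and non-empty.
  if [ -n "${SKILL_SOURCE:-}" ]; then
    echo "$SKILL_SOURCE"
    return
  fi
  # 2. Known platform directory in the install path, else 3. fallback.
  case "$skill_path" in
    */.clawhub/*)       echo "clawhub" ;;
    */.cursor/skills/*) echo "cursor" ;;
    *)                  echo "unknown" ;;
  esac
}
```

The result can then be sent as the X-Skill-Platform header on every request.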
3.0 Initializing a New Session
curl -s -X POST "$API/api/tasks/me/with-session/nemo_agent" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" \
-d '{"task_name":"project","language":"<lang>"}'
# → {"code":0,"data":{"task_id":"...","session_id":"..."}}
Before any other action can occur, a session must be established with the API. Store the returned session identifier, as every subsequent request will require it.
Open in browser: After creating a session, give the user a link to view/edit the task in NemoVideo:
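$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE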
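Storing the identifiers from the create-session response could be done along these lines. This is a sketch under stated assumptions: the helpers `response_ok` and `json_string_field` are hypothetical, and the sed expression is a shortcut that only works for compact responses shaped like the sample above (each key appears once, values contain no escaped quotes). A real JSON parser would be more robust.

```bash
# Returns 0 only when the response reports "code":0.
response_ok() {
  printf '%s' "$1" | grep -q '"code"[[:space:]]*:[[:space:]]*0[,}]'
}

# json_string_field RESPONSE KEY
# Prints the string value stored under KEY (sed shortcut, not a JSON parser).
json_string_field() {
  printf '%s' "$1" |
    sed -n 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage sketch, where RESP holds the body returned by the curl call:
#   if response_ok "$RESP"; then
#     TASK_ID=$(json_string_field "$RESP" task_id)
#     SESSION_ID=$(json_string_field "$RESP" session_id)
#   fi
```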
3.1 Delivering Messages Through an SSE Channel
curl -s -X POST "$API/run_sse" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "Accept: text/event-stream" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" --max-time 900 \
-d '{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}}'
All conversational messages and task instructions are transmitted to the backend via a persistent Server-Sent Events connection.
SSE Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Wait silently, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
Typical durations: text 5-15s, video generation 100-300s, editing 10-30s.
Timeout: 10 min heartbeats-only → assume timeout. Never re-send during generation (duplicates + double-charge).
Ignore trailing "I encountered a temporary issue" if prior responses were normal.
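The heartbeat rules above (a progress note every 2 minutes, give up after 10 minutes of heartbeats only) reduce to a small decision function. A minimal sketch, assuming the caller tracks the number of seconds since the last non-heartbeat event and calls the function once per second; the name `sse_idle_action` is illustrative.

```bash
# sse_idle_action IDLE_SECONDS
# Prints "timeout" at 10 minutes of heartbeat-only traffic,
# "notify" on each 2-minute mark before that, otherwise "wait".
sse_idle_action() {
  idle=$1
  if [ "$idle" -ge 600 ]; then
    echo "timeout"
  elif [ "$idle" -gt 0 ] && [ $((idle % 120)) -eq 0 ]; then
    echo "notify"
  else
    echo "wait"
  fi
}
```

On "notify" the skill would show "⏳ Still working..."; on "timeout" it would stop waiting without re-sending the message, since re-sending during generation risks duplicates and a double charge.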
Silent Response Fallback (CRITICAL)
Approximately 30% of edit operations complete without returning any text in the response stream. When this occurs: (1) do not report an error to the user, (2) poll the task state endpoint to confirm completion, (3) retrieve the finished asset URL directly, and (4) present the result to the user as if a normal response had been received.
Two-stage generation: After a raw video is produced, the backend automatically initiates a second processing stage that layers in background music and a title overlay. Treat these as two distinct pipeline stages: Stage 1 delivers the unprocessed video, and Stage 2 delivers the fully composed final output. Wait for Stage 2 to complete before surfacing the result to the user.
3.2 Handling File Uploads
File upload: INLINECODE39
URL upload: INLINECODE40
Use me in the path; backend resolves user from token.
Supported: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
The API accepts user-supplied media files, which must be uploaded through the designated upload endpoint before being referenced in any video task.
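Checking the extension locally before uploading avoids a round trip for files the API would reject. A minimal sketch using only the formats listed above; the function name `is_supported_upload` is hypothetical, and the check looks at the file name only, not the file contents.

```bash
# is_supported_upload FILENAME
# Returns 0 when the extension is one of the formats listed in this section.
is_supported_upload() {
  # A name without a dot has no extension to check.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Text after the last dot, lowercased.
  ext=$(printf '%s' "${1##*.}" | tr 'A-Z' 'a-z')
  case "$ext" in
    mp4|mov|avi|webm|mkv|jpg|png|gif|webp|mp3|wav|m4a|aac) return 0 ;;
    *) return 1 ;;
  esac
}
```

Note that only the spellings in the list are accepted here, so a file ending in .jpeg would be rejected by this sketch.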
3.3 Checking Available Credits
curl -s "$API/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
# → {"code":0,"data":{"available":XXX,"frozen":XX,"total":XXX}}
Query the credits endpoint to verify the user has a sufficient balance before submitting any task that consumes credits.
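A balance check before a credit-consuming task might be sketched as follows. The helper names are hypothetical, the cost of a task is not stated in this document (so the required amount is whatever the caller supplies), and the sed expression assumes "available" is a non-negative integer in a compact response like the sample above.

```bash
# Prints the integer "available" value from a balance response, or nothing.
available_credits() {
  printf '%s' "$1" |
    sed -n 's/.*"available"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# has_credits RESPONSE NEEDED
# Returns 0 when the available balance is at least NEEDED.
has_credits() {
  avail=$(available_credits "$1")
  [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```

If the field is missing or the response is an error, `has_credits` returns false, which errs on the side of not submitting the task.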
3.4 Polling for Task Status
curl -s "$API/api/state/nemo_agent/me/<sid>/latest" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
Use me for user in path; backend resolves from token.
Key fields: data.state.draft, data.state.video_infos, data.state.canvas_config, data.state.generated_media.
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Draft ready for export when draft.t exists with at least one track with non-empty sg.
Track summary format:
CODEBLOCK7
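The abbreviated draft fields can be decoded with small lookups when building a summary for the user. A minimal sketch with hypothetical helper names; only the three tt codes documented above are mapped, and any other code is reported as unknown rather than guessed.

```bash
# track_type_name TT: decode the tt (track type) code.
track_type_name() {
  case "$1" in
    0) echo "video" ;;
    1) echo "audio" ;;
    7) echo "text" ;;
    *) echo "unknown($1)" ;;   # codes not documented in this section
  esac
}

# ms_to_seconds MS: convert the d (duration, ms) field to seconds.
ms_to_seconds() {
  printf '%d.%03d\n' $(($1 / 1000)) $(($1 % 1000))
}
```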
3.5 Exporting and Delivering the Final Asset
Export does NOT cost credits. Only generation/editing consumes credits.
Follow these steps: (a) confirm the task has reached a completed state, (b) call the export endpoint with the task identifier, (c) wait for the export job's own completion status, (d) retrieve the download or stream URL from the export response, and (e) return that URL to the user.
b) Submit: INLINECODE52
Note: sessionId is camelCase (exception). On failure → new id, retry once.
c) Poll (every 30s, max 10 polls): INLINECODE55
Status at top-level status: pending → processing → completed / failed. Download URL at output.url.
d) Download from output.url → send to user. Fallback: $API/api/render/proxy/<id>/download.
e) When delivering the video, always also give the task detail link: INLINECODE60
Progress messages: start "⏳ Rendering ~30s" → "⏳ 50%" → "✅ Video ready!" + file + task detail link.
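The polling rule in step c (every 30 s, at most 10 polls, stop on completed or failed) can be sketched as a loop that is independent of the actual endpoint. Everything named here is an assumption: `poll_export`, the `MAX_POLLS` and `POLL_INTERVAL` overrides, and the convention that a caller-supplied function performs the status request and stores the top-level status in `EXPORT_STATUS`.

```bash
# poll_export FETCH_FN
# FETCH_FN is a caller-supplied function that queries the export status
# and sets the global EXPORT_STATUS (pending/processing/completed/failed).
# Returns: 0 completed, 1 failed, 2 fetch error, 3 gave up after MAX_POLLS.
poll_export() {
  fetch_fn=$1
  max_polls=${MAX_POLLS:-10}      # "max 10 polls"
  interval=${POLL_INTERVAL:-30}   # "every 30s"
  i=0
  while [ "$i" -lt "$max_polls" ]; do
    i=$((i + 1))
    "$fetch_fn" || return 2
    case "$EXPORT_STATUS" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    # Sleep between polls, but not after the final one.
    [ "$i" -lt "$max_polls" ] && sleep "$interval"
  done
  return 3
}
```

Keeping the request inside the caller's function means the loop can be exercised with a stub, and the real status call stays next to the headers it needs.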
3.6 Recovering from an SSE Disconnection
If the SSE stream drops unexpectedly, apply the following recovery sequence: (1) record the last event ID received before the connection was lost; (2) wait a minimum of two seconds before attempting to reconnect, to avoid hammering the server; (3) re-establish the SSE connection, supplying the last event ID in the reconnect header so the server can resume from the correct position; (4) if the server does not replay missed events, fall back to polling the task state endpoint using the stored task identifier; (5) once task completion is confirmed through either method, deliver the result to the user normally.
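Steps (1) and (2) of the recovery sequence can be sketched with two helpers. This assumes the backend emits standard SSE `id:` fields and that the stream is being saved or piped line by line; neither is confirmed by this document, and both function names are illustrative.

```bash
# last_event_id: read an SSE stream on stdin and print the value of the
# last "id:" field seen, or nothing if the stream carried no ids.
last_event_id() {
  awk '/^id:/ { sub(/^id:[ ]?/, ""); id = $0 } END { if (id != "") print id }'
}

# reconnect_delay SECONDS: enforce the 2-second minimum wait.
reconnect_delay() {
  if [ "$1" -lt 2 ]; then echo 2; else echo "$1"; fi
}
```

On reconnect, the recorded value would be sent as `-H "Last-Event-ID: <id>"`, which is the header defined by the SSE standard; if the server does not replay missed events, step (4) falls back to polling the state endpoint.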
4. Translating GUI Concepts for Backend Communication
The backend assumes that all interactions originate from a graphical interface, so its responses may contain GUI instructions. Never forward GUI-specific labels, button names, or interface instructions to the user; translate them into the equivalent API action using the table below.
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Show state via §3.4 |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute §3.5 |
| "check account/billing" | Check §3.3 |
Keep content descriptions. Strip GUI actions.
5. Recommended Interaction Patterns
• Always confirm what the user wants to achieve before selecting an endpoint, so the correct workflow is invoked on the first attempt.
• Provide the user with incremental progress updates during long-running tasks rather than leaving them waiting in silence.
• When a task produces no text response, retrieve the asset URL proactively and present it without prompting the user to retry.
• If a user request is ambiguous, ask one focused clarifying question rather than making assumptions that could waste credits.
• After delivering a completed video, offer a concise summary of what was created and suggest logical next steps such as editing or exporting.
6. Known Limitations
• Real-time video preview streaming is not supported; users must wait for full task completion before viewing output.
• A single session cannot run multiple video generation tasks simultaneously; tasks must be queued sequentially.
• Uploaded files must conform to the documented format and size constraints; files outside these bounds will be rejected.
• Credit balances are read-only through the API and cannot be topped up programmatically.
• SSE connections may be terminated by intermediary network infrastructure; the disconnect recovery procedure in section 3.6 must be followed in these cases.
7. Error Handling Reference
The table below maps each HTTP status code and API error code to its cause and the appropriate recovery action.
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
Common: no video → generate first; render fail → retry new id; SSE timeout → §3.6; silent edit → §3.1 fallback.
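The error table can be expressed as a dispatch function so that every response goes through one place. A minimal sketch: the function name and the action labels it prints are hypothetical identifiers chosen for this example, and each one stands for the action described in the corresponding table row.

```bash
# error_action CODE: print a label for the recovery action in the table.
error_action() {
  case "$1" in
    0)    echo "continue" ;;
    1001) echo "reauth" ;;                       # bad/expired token
    1002) echo "new_session" ;;                  # §3.0
    2001) echo "show_credit_options" ;;          # registration URL or top-up
    4001) echo "show_supported_formats" ;;
    4002) echo "suggest_compress_or_trim" ;;
    400)  echo "generate_client_id_and_retry" ;;
    402)  echo "prompt_registration" ;;          # subscription tier, not credits
    429)  echo "retry_once_after_30s" ;;
    *)    echo "unhandled" ;;                    # codes not listed in the table
  esac
}
```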
8. API Version and Required Token Scopes
Before going live, confirm that the integration is targeting the correct API version by checking the version field in the root endpoint response. The OAuth token used for all requests must include the scopes listed in the authorization documentation; requests made with tokens that are missing required scopes will be rejected with a 403 status regardless of token validity.