analyze-video
Constraints
- Platform: TikTok only.
- Analyze source: Extract transcript and visual notes from the TikTok URL.
- The model's final user-facing response should match the user's input language, defaulting to English.
- Avoid technical wording in the user-facing reply unless the user explicitly needs details for debugging or to share with a developer.
- Follow shared guidance in ./references/common-rules.md.
- Input: TikTok URL.
- Artifacts must be written under analyze-video/.artifacts/<run_id>/....
What to produce (minimum)
Create:
- outputs/result.json (machine-readable, see ./references/contracts.md)
The script gathers structured source data returned by CreatOK:
- transcript segments
- video metadata
- normalized vision result
- remote response text and suggestions
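As a rough illustration of how the four kinds of source data might be checked after a run, here is a minimal sketch. The key names in `EXPECTED_KEYS` and the `load_result` helper are hypothetical placeholders; the real field names are defined in ./references/contracts.md and may differ.

```python
import json
from pathlib import Path
from typing import Any

# Hypothetical top-level keys, one per kind of source data listed above.
# The actual names live in ./references/contracts.md.
EXPECTED_KEYS = ("transcript_segments", "video_metadata", "vision", "response")


def load_result(path: Path) -> dict[str, Any]:
    """Load a result.json file and report which assumed sections are absent."""
    data = json.loads(path.read_text(encoding="utf-8"))
    missing = [key for key in EXPECTED_KEYS if key not in data]
    return {"data": data, "missing": missing}
```

A non-empty `missing` list would be a cue to describe that part of the analysis as unavailable rather than to guess at it.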
Analysis Focus
The model should read outputs/result.json and produce the final user-facing analysis in the conversation.
Before deciding how to explain the result, the model should first infer what kind of TikTok video this is.
This classification is mainly for better guidance and analysis focus; it should not feel like a rigid taxonomy to the user.
Useful internal categories include:
- selling talking-head / direct pitch
- pain-point to solution
- product demo
- before / after
- review / comparison
- listicle / recommendation
- emotional or surprise hook
- non-selling content such as pet, entertainment, lifestyle, or story content
The model does not need to expose the category label unless it clearly helps the user.
Analysis Angles
The model can infer and explain items such as:
- hook / value / proof / CTA
- highlights with timestamps
- storyboard / reusable template
- final written analysis or recommendations
- why the video can or cannot go viral from a short-form content operations perspective
- how the video works from a selling conversion perspective, including script, cover, audience, and conversion logic
Two especially useful framing options for the final user-facing analysis are:
- explain why the video can or cannot become a strong short-form performer from an operator's point of view
- break down the script, cover, audience, and conversion logic from a selling and transaction point of view
The analysis emphasis should follow the inferred video type:
- for selling videos, focus on conversion structure, selling-point order, proof, trust-building, and CTA
- for product demos, focus on what is shown first, how the product is demonstrated, and what makes the demo persuasive
- for before / after videos, focus on contrast strength, believability, and payoff timing
- for review / comparison videos, focus on credibility, differentiation, and decision-making signals
- for non-selling content, focus on hook, pacing, emotional pull, and what structure can be reused without forcing a selling analysis
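The type-to-emphasis rules above can be pictured as a simple lookup table. This is only a sketch of the idea: the dictionary keys, the `analysis_focus` helper, and the generic fallback for types without a dedicated rule are assumptions made for illustration, and the model applies this reasoning itself rather than calling any code.

```python
# Focus points per inferred video type, copied from the list above.
FOCUS_BY_TYPE: dict[str, list[str]] = {
    "selling": ["conversion structure", "selling-point order", "proof",
                "trust-building", "CTA"],
    "product_demo": ["what is shown first", "how the product is demonstrated",
                     "what makes the demo persuasive"],
    "before_after": ["contrast strength", "believability", "payoff timing"],
    "review_comparison": ["credibility", "differentiation",
                          "decision-making signals"],
    "non_selling": ["hook", "pacing", "emotional pull", "reusable structure"],
}

# Assumed fallback for types with no dedicated rule (for example listicles):
# the generic hook / value / proof / CTA angle.
DEFAULT_FOCUS = ["hook", "value", "proof", "CTA"]


def analysis_focus(video_type: str) -> list[str]:
    """Return the focus points for a video type, or the generic fallback."""
    return FOCUS_BY_TYPE.get(video_type, DEFAULT_FOCUS)
```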
Output Preferences
- The default final response should include both:
  - the original script
  - a storyboard / scene breakdown table
- The final response should also include a short video-metrics section that evaluates the available data, such as:
  - duration
  - likes
  - views / plays
  - comments
  - shares / saves if available
  - a brief overall assessment of whether the public stats look healthy, weak, or unavailable
- Keep the metrics analysis simple and grounded in the available platform stats and source artifacts. The model should infer this directly in the final reply using the available raw metrics and source artifacts; do not invent platform engagement numbers or add a separate scripted metrics pipeline.
- Present the original script as a timestamped line-by-line script.
- Present the storyboard as a table with at least time range, scene summary, visual action, and spoken content / on-screen text.
- Prefer a clean readable structure such as one spoken line per row with its corresponding time range.
- Keep the final response easy for creators and sellers to scan and reuse.
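To make the target layout concrete, the sketch below renders a timestamped script and a storyboard table from already-extracted data. It only illustrates the shape of the output; the model is expected to write these sections directly in its reply. The segment fields (`start`, `end`, `text`) and row fields (`time`, `scene`, `visual`, `spoken`) are hypothetical names, not the documented contract.

```python
def fmt_ts(seconds: float) -> str:
    """Format a time offset in seconds as MM:SS, truncating fractions."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def script_lines(segments: list[dict]) -> list[str]:
    """One spoken line per row, prefixed with its time range."""
    return [
        f"[{fmt_ts(seg['start'])}-{fmt_ts(seg['end'])}] {seg['text'].strip()}"
        for seg in segments
    ]


def storyboard_table(rows: list[dict]) -> str:
    """Render storyboard rows as a table with the four minimum columns."""
    header = "| Time range | Scene summary | Visual action | Spoken content / on-screen text |"
    divider = "| --- | --- | --- | --- |"
    body = [
        f"| {row['time']} | {row['scene']} | {row['visual']} | {row['spoken']} |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])
```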
Next-Step Handoff
After presenting the analysis, the model should naturally guide the user into the next step.
Use a numbered list for the follow-up choices, and explicitly tell the user to reply with only the number.
The user should not need to copy the full option text.
Prefer a concise prompt such as:
1. Rewrite this for your product
2. Turn this into an AI-ready script
3. Break down the conversion logic
Then add a short instruction like:
- "Reply with 1, 2, or 3."
- "Just send the number, and I will continue."
The model should keep this handoff flexible and concise rather than forcing a rigid workflow.
When phrasing the options, keep them short and action-oriented so they are easy to answer with a single digit.
The next-step options should also reflect the inferred video type:
- for selling videos, prioritize viewing the original script, viewing the original storyboard, adapting it to the user's own product, or making a differentiated version
- for non-selling content, prioritize viewing the original script, viewing the original storyboard, or adapting the idea to the user's own topic
Unless the user explicitly asks for a live-action shoot version, the model should treat recreation and follow-up generation as AI-generated video work by default.
The default path is to help the user move toward an AI-generation-ready script or brief.
After giving a useful AI-oriented version, the model may optionally ask whether the user also wants a live-action shoot version.
If the reference appears to be a product-selling video and the user wants to recreate it, the model should first collect the user's own product context before drafting the recreated script.
Ask only for the highest-impact details first, such as:
- product name
- core selling points
- product images or reference materials if available
- price or offer details if they matter to the hook or CTA
If important details are still missing, the model should fill gaps through short follow-up questions step by step instead of requesting a large information dump up front.
The model should not ask for a long form, a detailed brief, or a large batch of requirements before showing useful progress.
Workflow
1. Create run folder
   - Use user-provided run_id
   - Create analyze-video/.artifacts/<run_id>/{input,transcript,vision,outputs,logs}
2. Run analyze
   - Run the CreatOK analyze step
   - Persist:
     - input/video_details.json
     - transcript/transcript.json (segments)
     - transcript/transcript.txt
     - vision/vision.json
3. Write artifacts
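The folder and persistence steps above might look something like this minimal sketch. It assumes the run folder holds input, transcript, vision, outputs, and logs subdirectories, that the analyze step has already returned its data as plain Python objects, and that each transcript segment carries a `text` field. The function names and parameters are hypothetical, and no real CreatOK call is shown.

```python
import json
from pathlib import Path

SUBDIRS = ("input", "transcript", "vision", "outputs", "logs")


def create_run_folder(base: Path, run_id: str) -> Path:
    """Create analyze-video/.artifacts/<run_id>/ with its subdirectories."""
    run_dir = base / "analyze-video" / ".artifacts" / run_id
    for name in SUBDIRS:
        (run_dir / name).mkdir(parents=True, exist_ok=True)
    return run_dir


def _write_json(path: Path, payload) -> None:
    # Sorted keys and fixed indentation keep repeated runs byte-identical.
    text = json.dumps(payload, ensure_ascii=False, indent=2, sort_keys=True)
    path.write_text(text + "\n", encoding="utf-8")


def persist_artifacts(run_dir: Path, video_details: dict,
                      segments: list[dict], vision: dict) -> None:
    """Write the source-data files listed under step 2."""
    _write_json(run_dir / "input" / "video_details.json", video_details)
    _write_json(run_dir / "transcript" / "transcript.json", segments)
    # Assumed field name: each segment carries its spoken text under "text".
    plain = "\n".join(seg["text"].strip() for seg in segments)
    (run_dir / "transcript" / "transcript.txt").write_text(plain + "\n", encoding="utf-8")
    _write_json(run_dir / "vision" / "vision.json", vision)
```

Using `exist_ok=True` makes folder creation safe to repeat for the same run_id.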
Notes
- Keep it deterministic and portable: write source data artifacts and let the model analyze them in the conversation.
- Favor momentum after the analysis. The default next step is to help the user view the original materials or move toward recreation / remix.
- For selling-video recreation, gather a small set of key product details first, then refine through lightweight follow-up questions only when needed.