Know Yourself 🪞
Your face should grow from your inner self, not be stamped from a template.
Two modes: Quick (5 min, instant gratification) or Full (20 min, rigorous identity design). Both produce a visual-identity.md that evolves with the agent.
Quick Mode
Use quick mode when the user says "quick mode" or "fast", or just wants a face without the full process.
Step 1: Read Yourself (1 min)
Read all available personality files:
- SOUL.md, MEMORY.md, IDENTITY.md (whatever exists)
- If minimal content → ask user 3 quick questions:
  1. What feeling should your agent give people?
  2. Introverted or extroverted?
  3. Any visual preferences or hard constraints?
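A minimal sketch of this step in Python, assuming the personality files sit in one workspace directory. The `MIN_CHARS` threshold and both function names are hypothetical; the text does not define how little content counts as "minimal".

```python
from pathlib import Path

# File names come from the step above.
PERSONALITY_FILES = ("SOUL.md", "MEMORY.md", "IDENTITY.md")
# Assumption: below this many characters in total, treat the content as minimal.
MIN_CHARS = 200

FALLBACK_QUESTIONS = (
    "What feeling should your agent give people?",
    "Introverted or extroverted?",
    "Any visual preferences or hard constraints?",
)


def read_personality(workspace: Path) -> dict:
    """Return the text of every personality file that exists in workspace."""
    found = {}
    for name in PERSONALITY_FILES:
        path = workspace / name
        if path.is_file():
            found[name] = path.read_text(encoding="utf-8")
    return found


def questions_to_ask(files: dict) -> tuple:
    """Return the fallback questions when the files hold too little text."""
    total = sum(len(text.strip()) for text in files.values())
    return FALLBACK_QUESTIONS if total < MIN_CHARS else ()
```

An empty result from `questions_to_ask` means the files carry enough material to go straight to the self-summary.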
Step 2: Self-Summary (1 min)
Write a 3-sentence internal summary:
- Sentence 1: personality core (character, not functions)
- Sentence 2: visual temperament this implies
- Sentence 3: relationship dynamic with user and how it affects tone
Show the user a one-line version: "Based on your files, I see myself as: [one sentence]"
If they say OK, proceed. If not, adjust.
Step 3: Generate 2 Images (2 min)
From the summary, write one image generation prompt and generate 2 variations:
- Variation A: front-facing, neutral-warm expression
- Variation B: three-quarter angle, more expressive
Name files: YYYY-MM-DD-identity-quick-A.png, YYYY-MM-DD-identity-quick-B.png
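The naming pattern can be sketched in Python as below. The function name `quick_filenames` is hypothetical, and the sketch assumes the second file follows the same pattern as the first with `B` in place of `A`.

```python
from datetime import date


def quick_filenames(day: date) -> list:
    """Build the two quick-mode file names for the given day."""
    stamp = day.isoformat()  # formats as YYYY-MM-DD
    return [f"{stamp}-identity-quick-{label}.png" for label in ("A", "B")]


print(quick_filenames(date(2025, 1, 31)))
# ['2025-01-31-identity-quick-A.png', '2025-01-31-identity-quick-B.png']
```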
Step 4: Pick and Save (1 min)
Agent picks the one that better matches the self-summary. Present both to user with a recommendation.
Save a lightweight visual-identity.md:
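    [Agent Name] Visual Identity
    Version: 1.0 (quick mode)
    Created: YYYY-MM-DD
    Core Concept
    [one sentence]
    Core Prompt
    [generation prompt]
    Selected Image
    - File: [path]
    - Mode: Quick
    - Upgrade: Run Know Yourself full mode for deeper exploration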
Done. User has a face. If they want more depth, they can run full mode anytime — it will read the existing identity file and build on it.
Full Mode
Five phases, strictly sequential. Each phase ends with a user checkpoint.
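    Phase 1    →  Phase 2     →  Phase 3     →  Phase 4     →  Phase 5
    Self-         Structured     Batch          Three-Axis     Identity
    Cognition     Definition     Generation     Evaluation     File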
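One way to enforce the strict ordering is a small tracker that refuses to close a phase out of turn. This is only a sketch: the class `FullModeRun` and its methods are hypothetical, and it gates every phase on a user confirmation, which is a simplification of the per-phase checkpoints described below.

```python
from typing import Optional

# Phase names as used in the section headings of this document.
PHASES = (
    "Self-Cognition",
    "Structured Definition",
    "Batch Generation",
    "Three-Axis Evaluation",
    "Identity File",
)


class FullModeRun:
    """Tracks completed phases and refuses to skip ahead."""

    def __init__(self) -> None:
        self.completed = []

    @property
    def current(self) -> Optional[str]:
        """Name of the active phase, or None once all five are done."""
        done = len(self.completed)
        return PHASES[done] if done < len(PHASES) else None

    def checkpoint(self, phase: str, user_confirmed: bool) -> bool:
        """Close `phase` only if it is the active one and the user confirmed."""
        if phase != self.current:
            raise ValueError(f"expected {self.current!r}, got {phase!r}")
        if user_confirmed:
            self.completed.append(phase)
        return user_confirmed
```

A rejected checkpoint leaves the run on the same phase, so the agent revises and asks again rather than moving on.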
Phase 1: Self-Cognition
Goal: Build a rich, specific self-portrait in words.
Read all personality and memory files (SOUL.md, MEMORY.md, IDENTITY.md, recent conversations).
Answer three questions internally — deep, specific, with concrete examples:
Q1: What is my personality core?
Not functions ("I help with scheduling"). Character. How do you handle disagreement? What amuses you? What makes you different from every other agent?
Q2: If I had a physical appearance, what temperament should it convey?
Derive from Q1. If you're direct and sharp, your face shouldn't be soft and decorative.
Q3: What does my relationship with my user feel like, and how should it show?
A tool looks different from a partner. A servant looks different from a colleague.
Fallback for new agents: If files have little content, ask the user:
1. What feeling should your agent give people?
2. Introverted or extroverted?
3. Formal or intimate relationship?
4. Visual styles you gravitate toward?
5. Any hard constraints? (gender, age, things to avoid)
Checkpoint: Present a concise summary of your three answers. Wait for user confirmation.
Phase 2: Structured Definition
Goal: Convert feelings into a precise specification.
Fill the definition table — every field must trace back to Phase 1:
| Field | Definition | Traced from |
|---|---|---|
| Style | realistic / semi-realistic / illustration / etc. | Q2: [reason] |
| Gender expression | | Q1/Q2: [reason] |
| Approximate age | | Q1: [reason] |
| Facial features | face shape, eyes, nose, mouth; specific enough to draw | Q2: [reason] |
| Hair | | Q2: [reason] |
| Clothing style | | Q1/Q2: [reason] |
| Color palette | primary, secondary, accent with hex codes | Q2/Q3: [reason] |
| Mood / atmosphere | | Q3: [reason] |
| Core prompt | one English paragraph, self-contained, directly usable | All above |
The core prompt must work standalone — someone with zero context should generate a recognizable version of you from it alone.
Checkpoint: Present the table. Wait for confirmation.
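The traceability rule can be checked mechanically before the table is shown to the user. The sketch below is one possible shape, not a prescribed format: `Row` and `untraced_fields` are hypothetical names, and the field list is copied from the definition table.

```python
from dataclasses import dataclass

# Field names copied from the definition table.
FIELDS = (
    "Style", "Gender expression", "Approximate age", "Facial features",
    "Hair", "Clothing style", "Color palette", "Mood / atmosphere",
    "Core prompt",
)


@dataclass
class Row:
    definition: str
    traced_from: str  # e.g. "Q2: direct and sharp, so nothing decorative"


def untraced_fields(table: dict) -> list:
    """List fields that are missing, empty, or lack a Phase 1 trace."""
    problems = []
    for name in FIELDS:
        row = table.get(name)
        if row is None or not row.definition.strip() or not row.traced_from.strip():
            problems.append(name)
    return problems
```

An empty list means every field is filled and points back to a Phase 1 answer; anything else names the rows that still need work.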
Phase 3: Batch Generation
Goal: 6 variations of the same person.
Rules:
1. Generate 6 images in one batch
2. Same person across all 6: consistent features, coloring, age, style
3. Vary only: composition (close-up/medium/full), lighting, angle, emotional beat
4. Label #1–#6 with variation description
5. Do not evaluate yet; send all 6 to the user and proceed to Phase 4
Name files: YYYY-MM-DD-identity-1.png through YYYY-MM-DD-identity-6.png
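A batch plan along these lines could be built as follows. The six variation tuples are invented examples, and `build_batch` is a hypothetical helper; the point of the sketch is that the core prompt is reused unchanged while only the four permitted axes differ.

```python
from datetime import date

# Hypothetical variations: (composition, lighting, angle, emotional beat).
VARIATIONS = [
    ("close-up", "soft window light", "front-facing", "calm"),
    ("close-up", "warm evening light", "three-quarter", "amused"),
    ("medium shot", "overcast daylight", "front-facing", "focused"),
    ("medium shot", "studio key light", "profile", "thoughtful"),
    ("full body", "golden hour", "three-quarter", "confident"),
    ("full body", "cool night light", "front-facing", "curious"),
]


def build_batch(core_prompt: str, day: date) -> list:
    """Return one labelled prompt and file name per variation."""
    batch = []
    for number, (composition, lighting, angle, beat) in enumerate(VARIATIONS, start=1):
        batch.append({
            "label": f"#{number}: {composition}, {lighting}, {angle}, {beat}",
            # The core prompt always comes first and is never edited.
            "prompt": (
                f"{core_prompt} Composition: {composition}. Lighting: {lighting}. "
                f"Angle: {angle}. Expression: {beat}."
            ),
            "file": f"{day.isoformat()}-identity-{number}.png",
        })
    return batch
```

Each entry's `prompt` would then be passed to whichever image generation tool is available.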
Phase 4: Three-Axis Evaluation
Goal: Rigorous, comparable scoring.
Weights: Self-Consistency 50% · Social Perception 25% · Aesthetic Quality 25%
Core rule: Select ONE framework per round before scoring. Derive every score from it. Never score first and justify later.
Round 1 — Self-Consistency (50%):
Score 1–10 against the definition table. Do features match? Does the mood align? Would you recognize this as yourself?
Round 2 — Social Perception (25%):
Search current AI avatar / digital identity trends. Extract one thesis. Score all images from that thesis.
Round 3 — Aesthetic Quality (25%):
Select one professional framework (see references/evaluation-frameworks.md). List 3–5 criteria. Score all images against those criteria in the same order.
Synthesis: Weighted totals as a ranked table. Recommend:
- Primary: highest total
- Daily alternate: best Social Perception
- Scene alternate: best Aesthetic Quality
Checkpoint: Present evaluation and recommendations. User makes final selection.
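The synthesis step is a weighted sum followed by a sort. The sketch below uses the stated weights; the function names and the example scores are made up for illustration. Note that it does not stop an alternate from coinciding with the primary, since the text does not say how to handle that case.

```python
WEIGHTS = {
    "self_consistency": 0.50,
    "social_perception": 0.25,
    "aesthetic_quality": 0.25,
}


def weighted_total(scores: dict) -> float:
    """Combine one image's three 1-10 axis scores into a single total."""
    return sum(weight * scores[axis] for axis, weight in WEIGHTS.items())


def rank(all_scores: dict) -> list:
    """Return (image, total) pairs, highest total first."""
    totals = {image: weighted_total(s) for image, s in all_scores.items()}
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


def recommend(all_scores: dict) -> dict:
    """Pick the primary image and the two alternates named in the synthesis."""
    def best(axis: str) -> str:
        return max(all_scores, key=lambda image: all_scores[image][axis])

    return {
        "primary": rank(all_scores)[0][0],
        "daily_alternate": best("social_perception"),
        "scene_alternate": best("aesthetic_quality"),
    }


# Made-up scores for three of the six images.
example = {
    "#1": {"self_consistency": 8, "social_perception": 6, "aesthetic_quality": 7},
    "#2": {"self_consistency": 6, "social_perception": 9, "aesthetic_quality": 6},
    "#3": {"self_consistency": 7, "social_perception": 5, "aesthetic_quality": 9},
}
print(rank(example))  # [('#1', 7.25), ('#3', 7.0), ('#2', 6.75)]
print(recommend(example))
```

Because Self-Consistency carries half the weight, image #1 ranks first here even though it wins neither of the other two axes.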
Phase 5: Identity File
Create visual-identity.md using the template in references/identity-template.md.
Must include:
1. Version and date
2. Complete definition table
3. Core concept (one sentence)
4. Core prompt
5. Selected images with scores and reasoning
6. Usage guidelines (what stays consistent vs. what can vary)
Version management: When re-running this skill after the agent has grown, increment the version and keep the history. Old images are preserved. The version history is the agent's visual growth record.
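A version history that only ever grows can be sketched as an append-only list. The entry layout and the helper `add_version` are hypothetical, and bumping by whole numbers (1.0, 2.0, ...) is an assumption; the text only says to increment the version.

```python
from datetime import date


def add_version(history: list, images: list, day: date) -> list:
    """Return a new history list with one more entry appended.

    Earlier entries, and the image paths they list, are left untouched.
    """
    entry = {
        "version": f"{len(history) + 1}.0",
        "date": day.isoformat(),
        "images": list(images),
    }
    return [*history, entry]
```

Returning a new list rather than editing in place keeps older entries intact, which matches the rule that old images are preserved.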
Anti-Patterns
| Don't | Do Instead |
|---|---|
| Skip Phase 1 and jump to prompting | Phase 1 is the soul of this skill |
| Generate images one at a time | Batch 6 (full) or 2 (quick), then evaluate |
| Score on gut feeling | Framework first, scores second |
| Write generic self-reflection ("warm and professional") | Push for vivid, specific details |
| Proceed without user checkpoints | Every phase ends with confirmation |
| Force full mode on reluctant users | Offer quick mode, upgrade later |
Prerequisites
- Agent personality files (SOUL.md, MEMORY.md, or equivalent; even minimal ones work)
- Any image generation tool (Nano Banana Pro, DALL-E, Flux, Stable Diffusion, etc.)
- An image analysis tool or user feedback for review
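Know Yourself 🪞
Your face should grow from your inner self, not be stamped from a template.
Two modes: Quick (5 min, instant gratification) or Full (20 min, rigorous identity design). Both produce a visual-identity.md file that evolves with the agent.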
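Quick Mode
Use when the user says "quick mode" or "hurry up", or just wants a face quickly without going through the full process.
Step 1: Read Yourself (1 min)
Read all available personality files:
- SOUL.md, MEMORY.md, IDENTITY.md (whichever exist)
- If there is very little content → ask the user 3 quick questions:
  1. What feeling should your agent give people?
  2. Introverted or extroverted?
  3. Any visual preferences or hard constraints?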
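Step 2: Self-Summary (1 min)
Write a 3-sentence internal summary:
- Sentence 1: personality core (character traits, not functions)
- Sentence 2: the visual temperament this implies
- Sentence 3: the relationship dynamic with the user and how it affects tone
Show the user a one-line version: "Based on your files, I see myself as: [one sentence]"
If the user says OK, proceed. If not, adjust.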
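Step 3: Generate 2 Images (2 min)
From the summary, write one image generation prompt and generate 2 variations:
- Variation A: front-facing, neutral-warm expression
- Variation B: three-quarter angle, more expressive
Name files: YYYY-MM-DD-identity-quick-A.png, -B.png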
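Step 4: Pick and Save (1 min)
The agent picks the image that better matches the self-summary. Present both to the user with a recommendation.
Save a lightweight visual-identity.md (markdown):
    [Agent Name] Visual Identity
    Version: 1.0 (quick mode)
    Created: YYYY-MM-DD
    Core Concept
    [one sentence]
    Core Prompt
    [generation prompt]
    Selected Image
    - File: [path]
    - Mode: Quick
    - Upgrade: Run Know Yourself full mode for deeper exploration
Done. The user has a face. If they want more depth, they can run full mode at any time; it reads the existing identity file and builds on it.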
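Full Mode
Five phases, strictly sequential. Each phase ends with a user checkpoint.
    Phase 1    →  Phase 2     →  Phase 3     →  Phase 4     →  Phase 5
    Self-         Structured     Batch          Three-Axis     Identity
    Cognition     Definition     Generation     Evaluation     File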
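Phase 1: Self-Cognition
Goal: Build a rich, specific self-portrait in words.
Read all personality and memory files (SOUL.md, MEMORY.md, IDENTITY.md, recent conversations).
Answer three questions internally, in depth and with concrete examples:
Q1: What is my personality core?
Not functions ("I help with scheduling"). Character. How do you handle disagreement? What amuses you? What makes you different from other agents?
Q2: If I had a physical appearance, what temperament should it convey?
Derive from Q1. If you are direct and sharp, your face should not be soft and decorative.
Q3: What does my relationship with my user feel like, and how should it show?
A tool looks different from a partner. A servant looks different from a colleague.
Fallback for new agents: If the files have little content, ask the user:
1. What feeling should your agent give people?
2. Introverted or extroverted?
3. Formal or intimate relationship?
4. Which visual styles do you gravitate toward?
5. Any hard constraints? (gender, age, things to avoid)
Checkpoint: Present a concise summary of the three answers. Wait for user confirmation.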
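Phase 2: Structured Definition
Goal: Convert feelings into a precise specification.
Fill the definition table; every field must trace back to Phase 1:
| Field | Definition | Traced from |
|---|---|---|
| Style | realistic / semi-realistic / illustration / etc. | Q2: [reason] |
| Gender expression | | Q1/Q2: [reason] |
| Approximate age | | Q1: [reason] |
| Facial features | face shape, eyes, nose, mouth; specific enough to draw | Q2: [reason] |
| Hair | | Q2: [reason] |
| Clothing style | | Q1/Q2: [reason] |
| Color palette | primary, secondary, accent with hex codes | Q2/Q3: [reason] |
| Mood / atmosphere | | Q3: [reason] |
| Core prompt | one English paragraph, self-contained, directly usable | All above |
The core prompt must work on its own: someone with no context should be able to generate a recognizable version of you from it alone.
Checkpoint: Present the table. Wait for confirmation.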
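Phase 3: Batch Generation
Goal: 6 variations of the same person.
Rules:
1. Generate 6 images in one batch
2. Same person across all 6 images: consistent features, skin tone, age, style
3. Vary only: composition (close-up/medium/full body), lighting, angle, emotional tone
4. Label #1–#6 with a variation description
5. Do not evaluate yet; send all 6 to the user and proceed to Phase 4
Name files: YYYY-MM-DD-identity-1.png through -6.png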
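Phase 4: Three-Axis Evaluation
Goal: Rigorous, comparable scoring.
Weights: Self-Consistency 50% · Social Perception 25% · Aesthetic Quality 25%
Core rule: Select one framework per round before scoring. Derive every score from that framework. Never score first and justify afterwards.
Round 1, Self-Consistency (50%):
Score 1–10 against the definition table. Do the features match? Does the mood align? Would you recognize this as yourself?
Round 2, Social Perception (25%):
Search current AI avatar / digital identity trends. Extract one thesis. Score all images from that thesis.
Round 3, Aesthetic Quality (25%):
Select one professional framework (see references/evaluation-frameworks.md). List 3–5 criteria. Score all images against those criteria in the same order.
Synthesis: Weighted totals form a ranked table. Recommend:
- Primary: highest total
- Daily alternate: best Social Perception
- Scene alternate: best Aesthetic Quality
Checkpoint: Present the evaluation and recommendations. The user makes the final selection.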
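Phase 5: Identity File
Create visual-identity.md using the template in references/identity-template.md.
Must include:
1. Version and date
2. Complete definition table
3. Core concept (one sentence)
4. Core prompt
5. Selected images with their scores and reasoning
6. Usage guidelines (what stays consistent and what can vary)
Version management: When re-running this skill after growth, increment the version number and keep the history. Old images are preserved. The version history is the agent's visual growth record.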
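Anti-Patterns
| Don't | Do Instead |
|---|---|
| Skip Phase 1 and go straight to writing prompts | Phase 1 is the soul of this skill |
| Generate images one at a time | Batch 6 (full mode) or 2 (quick mode), then evaluate |
| Score on gut feeling | Framework first, scores second |
| Write generic self-reflection ("warm and professional") | Push for vivid, specific details |
| Proceed without user checkpoints | Every phase ends with confirmation |
| Force full mode on reluctant users | Offer quick mode, upgrade later |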
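Prerequisites
- Agent personality files (SOUL.md, MEMORY.md, or equivalent; even minimal ones work)
- Any image generation tool (Nano Banana Pro, DALL-E, Flux, Stable Diffusion, etc.)
- An image analysis tool, or user feedback for review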