Voice Memo Sync 🎙️

Intelligent voice/video transcription and organization system.
智能语音/视频转录与整理系统。

Quick Start / 快速开始

    # Run installation script / 运行安装脚本
    cd ~/.openclaw/workspace/skills/voice-memo-sync
    ./scripts/install.sh

What it does / 安装内容:

1. Creates data directory memory/voice-memos/ / 创建数据目录
2. Creates config file config/voice-memo-sync.yaml / 创建配置文件
3. Creates Apple Notes folder "Voice Memos" / 创建 Apple Notes 文件夹
4. Checks dependencies and prompts installation / 检查依赖并提示安装

When to Use / 何时使用

✅ USE this skill when user:

- Sends voice/audio/video files / 发送语音/音频/视频文件
- Sends YouTube/Bilibili URLs / 发送 YouTube/B站链接
- Sends transcript text files / 发送转录文本文件
- Says "sync voice memos", "process recording", "organize this video" / 说「同步语音备忘录」「处理录音」「整理这个视频」

❌ DO NOT use when:

- User just wants to play audio/video / 用户只想播放音视频
- User asks about music/podcasts without transcription needs / 询问音乐/播客但不需要转录

Supported Formats / 支持格式

⚡ Metal GPU Acceleration (NEW)

On Apple Silicon, whisper-cpp provides 15-20x faster transcription:

| Audio | CPU (openai-whisper) | Metal GPU (whisper-cpp) |
|---|---|---|
| 5 min | ~5 min | ~20 sec |
| 30 min | ~30 min | ~2 min |
| 60 min | ~60 min | ~4 min |

    # Install for Metal acceleration (recommended)
    brew install whisper-cpp

The skill auto-detects and uses Metal when available.
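The detection logic itself is not reproduced in this document. As a rough sketch of how such a check could work, the function below prefers whisper-cpp only on Apple Silicon macOS when a matching executable is on the PATH. The function name, the returned labels, and the candidate executable names are assumptions made for illustration, not the skill's actual code.

```python
import platform
import shutil
from typing import Callable, Optional


def pick_whisper_backend(
    which: Callable[[str], Optional[str]] = shutil.which,
    system: Optional[str] = None,
    machine: Optional[str] = None,
) -> str:
    """Return "whisper-cpp" when Metal acceleration looks usable, else "openai-whisper"."""
    system = system or platform.system()
    machine = machine or platform.machine()
    # Apple Silicon Macs report system "Darwin" and machine "arm64".
    on_apple_silicon = system == "Darwin" and machine == "arm64"
    # ASSUMPTION: the executable names below are guesses; check what your
    # whisper-cpp installation actually puts on the PATH.
    has_whisper_cpp = any(which(name) for name in ("whisper-cli", "whisper-cpp"))
    if on_apple_silicon and has_whisper_cpp:
        return "whisper-cpp"
    return "openai-whisper"
```

The `which`, `system`, and `machine` parameters exist only so the decision can be exercised without the real hardware; calling `pick_whisper_backend()` with no arguments inspects the current machine.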

| Type / 类型 | Formats / 格式 | Processing / 处理方式 |
|---|---|---|
| Voice Memos | .qta, .m4a | Apple native (QTA metadata) → Whisper fallback |
| Audio | | |

Processing Pipeline / 处理流程

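    Input (File/URL/Text)
                      │
                      ▼
    ┌─────────────────────────────────────┐
    │ 1. Source Detection                 │
    │    来源识别                         │
    │    Voice Memo / URL / File / Text   │
    └─────────────────┬───────────────────┘
                      │
                      ▼
    ┌─────────────────────────────────────┐
    │ 2. Save Source Metadata             │
    │    保存源信息                       │
    │    → memory/voice-memos/sources/    │
    └─────────────────┬───────────────────┘
                      │
                      ▼
    ┌─────────────────────────────────────┐
    │ 3. Transcription                    │
    │    转录提取                         │
    │    Priority: Apple > Text >         │
    │    summarize > Whisper-local > API  │
    └─────────────────┬───────────────────┘
                      │
                      ▼
    ┌─────────────────────────────────────┐
    │ 4. Save Raw Transcript              │
    │    保存原始转录                     │
    │    → memory/voice-memos/transcripts/│
    └─────────────────┬───────────────────┘
                      │
                      ▼
    ┌─────────────────────────────────────┐
    │ 5. LLM Deep Processing              │
    │    LLM深度整理                      │
    │    • Read USER.md & MEMORY.md       │
    │    • Clean up spoken language       │
    │    • Extract key points & insights  │
    │    • Identify TODOs & connections   │
    └─────────────────┬───────────────────┘
                      │
                      ▼
    ┌─────────────────────────────────────┐
    │ 6. Save Processed Result            │
    │    保存处理结果                     │
    │    → memory/voice-memos/processed/  │
    └─────────────────┬───────────────────┘
                      │
             ┌────────┴──────────┐
             ▼                   ▼
    ┌─────────────────┐ ┌─────────────────┐
    │ 7a. Apple Notes │ │ 7b. Reminders   │
    │ Structured note │ │ Create TODOs    │
    │ with #hashtags  │ │ 创建提醒        │
    └────────┬────────┘ └────────┬────────┘
             │                   │
             └────────┬──────────┘
                      ▼
    ┌─────────────────────────────────────┐
    │ 8. Update Index                     │
    │    更新索引                         │
    │    → memory/voice-memos/INDEX.md    │
    └─────────────────────────────────────┘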

Data Structure / 数据结构

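    memory/voice-memos/              # All data, searchable via memory_search
    ├── INDEX.md                     # Processing records index / 处理记录索引
    ├── sources/                     # Original file metadata / 原始文件元数据
    │   └── YYYY-MM-DD_xxx.json
    ├── transcripts/                 # Raw transcripts / 原始转录文本
    │   └── YYYY-MM-DD_source_title.md
    ├── processed/                   # LLM processed content / LLM处理后内容
    │   └── YYYY-MM-DD_source_title.md
    └── synced/                      # Sync records / 同步记录
        └── YYYY-MM-DD_source_title.json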
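For readers who want to recreate this layout by hand, here is a minimal sketch that builds the memory/voice-memos/ tree with its sources/, transcripts/, processed/, and synced/ subdirectories. The function name and the placeholder header written to INDEX.md are assumptions for illustration; the install script may do this differently.

```python
from pathlib import Path

# Subdirectory names taken from the data layout of memory/voice-memos/.
SUBDIRS = ("sources", "transcripts", "processed", "synced")


def ensure_layout(workspace: Path) -> Path:
    """Create memory/voice-memos/ and its subdirectories under workspace."""
    base = workspace / "memory" / "voice-memos"
    for name in SUBDIRS:
        (base / name).mkdir(parents=True, exist_ok=True)
    index = base / "INDEX.md"
    if not index.exists():
        # ASSUMPTION: the header text is a placeholder, not the skill's format.
        index.write_text("# Voice Memo Index\n", encoding="utf-8")
    return base
```

The function is idempotent: running it again leaves existing files and the index untouched.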

Apple Notes Output Format / 输出格式

The skill reads USER.md, SOUL.md, and MEMORY.md to provide personalized analysis:

- Deep insights tailored to user's research/work focus
- Connections to active projects and ongoing interests
- Actionable recommendations based on user's decision style
- Critical thinking that challenges assumptions

处理时会读取 USER.md、SOUL.md 和 MEMORY.md 提供个性化分析：

- 结合用户研究/工作重点的深度洞察
- 与活跃项目和持续关注领域的关联
- 基于用户决策风格的行动建议
- 挑战假设的批判性思考

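    🎙️ [Auto-generated Title / 智能生成的标题]

    📅 Date | ⏱️ Duration | 👤 Source
    🏷️ #tag1 #tag2 #tag3

    ━━━━━━━━━━━━━━━━━━━━━━

    📌 Summary / 核心摘要
    [One paragraph summarizing the content]

    🎯 Key Points / 关键要点
    • Point 1
    • Point 2
    • Point 3

    💡 Deep Analysis & Reflection (For User) / 深度分析与反思
    [Personalized analysis connecting to the user's:
    - Current research directions (from MEMORY.md)
    - Active projects and interests (from USER.md)
    - Decision-making style and preferences
    - Critical counter-arguments and blind spots]

    📋 Action Items / 行动建议
    ☐ Research: [specific to user's academic work]
    ☐ Business: [relevant to startup/investment focus]
    ☐ Content: [ideas for courses/articles]

    🔗 Related Connections / 相关联系
    • Connection to [project/memory]
    • Recommended reading/research

    💬 Notable Quotes / 金句摘录
    • Quote 1
    • Quote 2

    ━━━━━━━━━━━━━━━━━━━━━━

    📝 Original Transcript (Cleaned) / 原始转录（已整理）
    [Full transcript text, cleaned up from spoken language / 完整转录，已整理口语表达]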

QTA File Format / QTA文件格式 (Technical Reference)

Apple Voice Memos on iOS/macOS 14+ uses .qta (QuickTime Audio) files that embed native transcription directly in the file metadata.

Structure

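    QTA File
    ├── ftyp (file type marker: "qt  ")
    ├── wide (extended marker)
    ├── mdat (audio data, typically 90%+ of file size)
    └── moov (metadata container)
        ├── mvhd (movie header)
        └── trak (one or more tracks)
            ├── tkhd (track header)
            ├── mdia (media data)
            └── meta (metadata - TRANSCRIPTION HERE!)
                ├── hdlr (handler: mdta)
                ├── keys (key list: com.apple.VoiceMemos.tsrp)
                └── ilst (data list)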
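To make the nesting concrete, here is a small sketch of how sibling atoms in a QuickTime-style file can be enumerated. It relies on the general QuickTime atom header (a 32-bit big-endian size followed by a four-character type, with size 1 meaning a 64-bit size follows and size 0 meaning "to the end of the container"). The function name is hypothetical, and this is not the code used by extract-apple-transcript.py.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_atoms(
    data: bytes, start: int = 0, end: Optional[int] = None
) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_offset, payload_size) for sibling atoms in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit extended size follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # The atom runs to the end of the enclosing container.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated atom: stop rather than guess
        yield kind.decode("latin-1"), pos + header, size - header
        pos += size
```

To descend into a container such as moov or trak, call `iter_atoms(data, payload_offset, payload_offset + payload_size)` on the tuple yielded for that container. Because mdat holds most of the file, memory-mapping the file instead of reading it whole is worth considering for large recordings.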

Transcription JSON Format

CODEBLOCK6

Key Points:

- The runs array alternates: INLINECODE11
- INLINECODE12 provides timestamps for each character
- JSON is embedded raw in the ilst/data atom
- Use extract-apple-transcript.py for reliable extraction
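Because the JSON sits raw inside the metadata, one simple way to recover it is to scan for the first decodable JSON object after the com.apple.VoiceMemos.tsrp key. The sketch below illustrates that idea only; the function name is hypothetical, the assumption that the payload follows the key name in the byte stream is mine, and extract-apple-transcript.py may detect boundaries differently.

```python
import json
from typing import Optional

TSRP_KEY = b"com.apple.VoiceMemos.tsrp"


def extract_transcript_json(blob: bytes, key: bytes = TSRP_KEY) -> Optional[dict]:
    """Return the first JSON object found after the transcript key, or None."""
    marker = blob.find(key)
    if marker == -1:
        return None
    # ASSUMPTION: the JSON payload appears somewhere after the key name.
    # Undecodable bytes become U+FFFD so binary atom headers do not abort decoding.
    text = blob[marker + len(key):].decode("utf-8", errors="replace")
    decoder = json.JSONDecoder()
    idx = text.find("{")
    while idx != -1:
        try:
            obj, _ = decoder.raw_decode(text, idx)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        idx = text.find("{", idx + 1)
    return None
```

`raw_decode` stops at the end of the first complete object, so trailing binary data after the JSON does not need to be trimmed beforehand. A `None` result corresponds to the situations where falling back to Whisper is the sensible next step.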

Extraction Script

CODEBLOCK7

Common Issues

| Issue | Cause | Solution |
|---|---|---|
| "未找到转录数据" (no transcription data found) | Recording still processing | Wait 1-2 min, or use Whisper |
| "转录标记存在但数据不完整" (transcription marker present but data incomplete) | Partial transcription | Use Whisper fallback |
| JSON parse error | Corrupted file | Try Whisper transcription |

Location / 位置: INLINECODE15

CODEBLOCK8

Scripts / 脚本

| Script | Purpose / 用途 | Usage / 用法 |
|---|---|---|
| INLINECODE16 | Initialize setup | INLINECODE17 |
| INLINECODE18 | | |

Agent Processing Guide / Agent处理指南

When user sends audio/video or URL, follow these steps:
当用户发送音视频或URL时，按以下步骤处理：

Step 1: Detect Input Type / 识别输入类型

CODEBLOCK9
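As an illustration of this step, the following sketch classifies an input string as a video URL, a generic URL, a Voice Memos recording, a text document, or some other media file. The category labels, the host list, and the .txt/.md extensions are assumptions; .qta/.m4a and the .doc/.docx/.json/.csv formats come from this document.

```python
from pathlib import Path
from urllib.parse import urlparse

# ASSUMPTION: host list and category labels are illustrative only.
VIDEO_HOSTS = ("youtube.com", "youtu.be", "bilibili.com", "b23.tv")
VOICE_MEMO_EXTS = {".qta", ".m4a"}
TEXT_EXTS = {".txt", ".md", ".json", ".csv", ".doc", ".docx"}


def detect_input_type(value: str) -> str:
    """Classify a user-supplied path or URL into a coarse input category."""
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.hostname:
        host = parsed.hostname.lower()
        if any(host == h or host.endswith("." + h) for h in VIDEO_HOSTS):
            return "video_url"
        return "url"
    ext = Path(value).suffix.lower()
    if ext in VOICE_MEMO_EXTS:
        return "voice_memo"
    if ext in TEXT_EXTS:
        return "text"
    # Anything else is treated as generic audio/video to transcribe.
    return "media_file"
```

Matching on the parsed hostname rather than on a substring avoids misclassifying URLs that merely mention a video site in their path or query string.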

Step 2: Save Source Info / 保存源信息

CODEBLOCK10

Step 3: Get/Save Transcript / 获取保存转录

CODEBLOCK11
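The pipeline's stated transcription priority is Apple > Text > summarize > Whisper-local > API. One way to express such a chain is to try each extractor in order and keep the first non-empty result, as in this sketch. The function name and the extractor callables are hypothetical stand-ins for whatever the skill really invokes.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# An extractor takes a source path/URL and returns transcript text or None.
Extractor = Callable[[str], Optional[str]]


def first_transcript(
    source: str, extractors: Iterable[Tuple[str, Extractor]]
) -> Tuple[str, str]:
    """Try extractors in priority order; return (method_name, transcript)."""
    errors: List[str] = []
    for name, extract in extractors:
        try:
            text = extract(source)
        except Exception as exc:  # a failing method should not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty result")
    raise RuntimeError("no transcript produced; " + "; ".join(errors))
```

Returning the method name alongside the text makes it easy to record in the source metadata which method produced the transcript.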

Step 4: LLM Deep Processing / LLM深度整理

CODEBLOCK12

Step 5: Save Processed Result / 保存处理结果

CODEBLOCK13

Step 6: Sync to Apple Notes (MANDATORY) / 同步到Apple Notes（必须执行）

⚠️ CRITICAL: This step is MANDATORY. Never skip it.
⚠️ 关键：此步骤必须执行，不可跳过。

⚠️ Apple Notes requires HTML format, NOT Markdown!
⚠️ Apple Notes 需要 HTML 格式，不能直接用 Markdown！

Correct workflow / 正确流程:
CODEBLOCK14

Common mistakes to avoid / 常见错误:

- ❌ Writing raw Markdown to Apple Notes → garbled text or broken formatting
- ❌ Using memo notes -a interactively → cannot be automated
- ❌ Skipping this step entirely → the note is not visible on other devices
- ✅ Always convert MD → HTML via pandoc first
- ✅ Always verify the note was created successfully
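The skill's own AppleScript template is not reproduced in this document, so the following is only a stand-in sketch of the Markdown → HTML → Apple Notes flow. It builds a pandoc command and an AppleScript snippet, then runs both with subprocess. All function names are hypothetical, folder creation is not covered, and the script has not been verified against every Notes version.

```python
import subprocess
from pathlib import Path
from typing import List


def pandoc_command(md_path: Path) -> List[str]:
    """Command line that converts a Markdown file to an HTML fragment on stdout."""
    return ["pandoc", "--from", "markdown", "--to", "html", str(md_path)]


def applescript_string(value: str) -> str:
    """Quote a Python string as an AppleScript string literal."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def note_script(title: str, html: str, folder: str = "Voice Memos") -> str:
    """AppleScript that creates a note with an HTML body in an existing folder."""
    return (
        'tell application "Notes"\n'
        f"    make new note at folder {applescript_string(folder)} "
        f"with properties {{name:{applescript_string(title)}, "
        f"body:{applescript_string(html)}}}\n"
        "end tell"
    )


def sync_note(md_path: Path, title: str) -> None:
    """Convert the processed Markdown and push it to Apple Notes (macOS only)."""
    html = subprocess.run(
        pandoc_command(md_path), check=True, capture_output=True, text=True
    ).stdout
    # check=True raises if osascript fails, so a silent skip is not possible.
    subprocess.run(["osascript", "-e", note_script(title, html)], check=True)
```

Escaping backslashes and double quotes matters because the HTML produced by pandoc routinely contains attribute quotes that would otherwise terminate the AppleScript string early.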

Step 7: Create Reminders / 创建提醒

CODEBLOCK15

Step 8: Update INDEX.md / 更新索引

Append a record for the processed item to memory/voice-memos/INDEX.md.
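The record format of INDEX.md is not specified here. Assuming a simple one-line-per-item layout (date, title, source type, processed file), appending could look like the sketch below; the function name and the field order are illustrative assumptions.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_index_record(
    index_path: Path,
    title: str,
    source: str,
    processed_file: str,
    day: Optional[date] = None,
) -> str:
    """Append one record line to INDEX.md and return the line written."""
    day = day or date.today()
    # ASSUMPTION: a pipe-separated bullet line; adapt to the real index format.
    safe_title = title.replace("|", "/").strip()
    line = f"- {day.isoformat()} | {safe_title} | {source} | {processed_file}\n"
    index_path.parent.mkdir(parents=True, exist_ok=True)
    with index_path.open("a", encoding="utf-8") as fh:
        fh.write(line)
    return line
```

Opening the file in append mode keeps earlier records intact, which matters because the index is the searchable history of everything processed so far.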

Privacy / 隐私说明

⚠️ Privacy-First Design:

- All transcription runs locally by default / 所有转录默认在本地完成
- Apple native transcripts extracted from local files / Apple原生转录从本地文件提取
- Whisper runs locally / Whisper在本地运行
- No data sent to external servers (unless user explicitly configures external API)
- User data stored only in local memory directory

Troubleshooting / 故障排除

Whisper not found

CODEBLOCK17

yt-dlp download fails

CODEBLOCK18

Apple Notes folder not created

CODEBLOCK19

Transcription quality issues

Use a larger model for better accuracy: edit the config file and set whisper_model to "medium" or "large".

Changelog / 更新日志

v1.6.1 (2026-03-09)

- CRITICAL FIX: Apple Notes sync step marked as MANDATORY (must not be skipped).
- FORMAT FIX: Explicit requirement to convert Markdown → HTML via pandoc before syncing.
- Added complete AppleScript template with folder creation.
- Added a common-mistakes checklist to prevent format issues.

v1.6.0 (2026-03-09)

- QTA Format Documentation: Added detailed technical reference for Apple's QTA file format.
- Enhanced extract-apple-transcript.py v1.1: Improved JSON boundary detection, better error diagnostics, timestamp extraction support.
- Added --with-timestamps option for detailed time-aligned output.
- Better handling of large files (>100MB).

v1.5.0 (2026-03-09)

- Added Mode C: Lecture/Talk (single speaker, argument structure extraction).
- Added Mode D: Lecture + Q&A (hybrid processing).
- Added Mode E: Long-form No-Speaker-Label (> 90 min, topic-based chunking).
- Introduced Two-Pass Processing for content > 60 min.
- Added Output Density Levels (Executive / Structured / Full Annotated).

v1.4.0 (2026-03-09)

- Introduced "Deep Meeting Mode" for content > 15 min or multi-speaker.
- Preserves information density for critical discussions/interviews.
- New structure: Executive Summary + Chronological Detail + Debate Flow + Decision Matrix.
- Explicit attribution of quotes and arguments.

v1.2.0 (2026-03-08)

- Added unified processing script process.sh / 新增统一处理脚本
- Added installation script install.sh / 新增安装脚本
- Unified data storage to memory/voice-memos/ / 统一数据存储
- Added .doc/.docx/.json/.csv support / 新增文档格式支持
- Bilingual SKILL.md / 中英双语SKILL.md
- Improved INDEX.md auto-update / 完善索引自动更新

v1.1.0 (2026-03-08)

- Added iCloud directory sync / 新增iCloud目录同步
- Added YouTube/Bilibili support / 新增YouTube/B站支持
- Added text file processing / 新增文本文件处理

v1.0.0 (2026-03-08)

- Initial release / 初始版本
- Apple Voice Memos transcription / Apple语音备忘录转录
- Apple Notes sync / Apple Notes同步
