Feishu Voice Loop

Provide a reusable three-step voice loop for OpenClaw:

1. accept text or voice input
2. generate speech with OpenAI TTS
3. return the audio to Feishu or a web player

When the input is voice, transcribe it to text first, then continue through the same output pipeline.

Quick start

Prerequisites:

- OPENAI_API_KEY is set for TTS
- Feishu app credentials exist in ~/.openclaw/openclaw.json under channels.feishu.appId/appSecret, or are passed explicitly
- ffmpeg and ffprobe are installed and available on PATH
- local audio transcription is configured in ~/.openclaw/openclaw.json under tools.media.audio.models
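
The prerequisites above can be checked before any script runs. The following is a minimal sketch, not part of the skill itself: the helper names (`lookup`, `missing_prerequisites`, `load_config`) are hypothetical, and it assumes ~/.openclaw/openclaw.json is plain JSON. It flags missing Feishu credentials in the config even though they may instead be passed explicitly.

```python
import json
import os
import shutil
from pathlib import Path


def lookup(config, dotted_key):
    """Walk a nested dict along a dotted key; return None if any part is missing."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def missing_prerequisites(config, env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    for key in ("channels.feishu.appId", "channels.feishu.appSecret"):
        if not lookup(config, key):
            problems.append(f"{key} is missing from the config")
    for tool in ("ffmpeg", "ffprobe"):
        if which(tool) is None:
            problems.append(f"{tool} is not on PATH")
    if not lookup(config, "tools.media.audio.models"):
        problems.append("tools.media.audio.models is missing from the config")
    return problems


def load_config(path=Path.home() / ".openclaw" / "openclaw.json"):
    """Load the OpenClaw config (assumed plain JSON), or {} if the file is absent."""
    return json.loads(path.read_text()) if path.exists() else {}
```

Calling `missing_prerequisites(load_config())` returns an empty list when everything is in place; otherwise each entry names one thing to fix.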

Main scripts:

- scripts/openaittsfeishu.py
- scripts/transcribe_audio.py

Tasks

1. Transcribe voice input

Use this when you have a local .ogg, .opus, .wav, or similar file and want text.

```bash
python3 scripts/transcribe_audio.py /path/to/input.ogg
```

This script reuses the existing Whisper CLI configuration from ~/.openclaw/openclaw.json.
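
If the transcript is needed inside another Python program, the script can be wrapped in a subprocess call. This is a small sketch under one assumption that the source does not confirm: that scripts/transcribe_audio.py prints the transcript to standard output. The `transcribe` helper is hypothetical.

```python
import subprocess
import sys


def transcribe(audio_path, script="scripts/transcribe_audio.py", run=subprocess.run):
    """Run the transcription script and return its stdout with whitespace trimmed.

    Assumes the script writes the transcript to stdout (not confirmed by the docs).
    """
    result = run(
        [sys.executable, script, str(audio_path)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

The `run` parameter exists only so the call can be replaced with a stub in tests.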

2. Generate and send voice output

Use this when you already have text and want to send a Feishu voice message.

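```bash
python3 scripts/openaittsfeishu.py \
  --to "<open_id>" \
  --text "This is a voice test." \
  --voice alloy \
  --model gpt-4o-mini-tts
```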

The script will:

1. call OpenAI audio/speech
2. save WAV audio temporarily
3. convert to Feishu-friendly Opus via ffmpeg
4. upload the file to Feishu
5. send an audio message to the target open_id
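
To make the five steps concrete, the sketch below builds the request bodies and commands such a pipeline might use, without performing any network or process calls. It is not taken from the script: the endpoints and field names follow the public OpenAI and Feishu Open Platform APIs as best understood here, the mono 16 kHz Opus settings are an assumption, and using ffprobe to measure duration for the upload is a guess at why ffprobe is a prerequisite. All function names are hypothetical.

```python
import json

# Public endpoints, assumed rather than read from the script.
OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"
FEISHU_UPLOAD_URL = "https://open.feishu.cn/open-apis/im/v1/files"
FEISHU_MESSAGE_URL = (
    "https://open.feishu.cn/open-apis/im/v1/messages?receive_id_type=open_id"
)


def speech_payload(text, model="gpt-4o-mini-tts", voice="alloy"):
    """Step 1: JSON body for the speech request, asking for WAV output."""
    return {"model": model, "voice": voice, "input": text, "response_format": "wav"}


def opus_command(wav_path, opus_path):
    """Step 3: ffmpeg arguments for a mono 16 kHz Opus file (assumed settings)."""
    return ["ffmpeg", "-y", "-i", wav_path,
            "-c:a", "libopus", "-ac", "1", "-ar", "16000", opus_path]


def duration_command(audio_path):
    """ffprobe arguments that print the container duration in seconds."""
    return ["ffprobe", "-v", "error", "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1", audio_path]


def upload_fields(file_name, duration_seconds):
    """Step 4: form fields sent with the file; duration is in milliseconds."""
    return {
        "file_type": "opus",
        "file_name": file_name,
        "duration": str(round(duration_seconds * 1000)),
    }


def audio_message(open_id, file_key):
    """Step 5: message body; content is a JSON-encoded string holding the file key."""
    return {
        "receive_id": open_id,
        "msg_type": "audio",
        "content": json.dumps({"file_key": file_key}),
    }
```

Keeping the builders free of I/O makes each step easy to inspect or test before wiring in an HTTP client and `subprocess`.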

3. Run the full voice loop

Use this skill when the goal is a reusable voice interaction pipeline:

1. transcribe input audio to text
2. decide or generate the reply text
3. synthesize reply audio with OpenAI TTS
4. send the reply back to Feishu

Read references/input-output-workflow.md when building or explaining the end-to-end loop.
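
One way to picture the loop is as four pluggable callables run in order. The sketch below is illustrative only: `VoiceLoop` and its field names are hypothetical, and the real skill composes its scripts however references/input-output-workflow.md describes. Text input skips the transcription step, matching the rule that voice input is transcribed first and then follows the same output path.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoiceLoop:
    transcribe: Callable[[str], str]   # audio path -> text
    reply: Callable[[str], str]        # input text -> reply text
    synthesize: Callable[[str], str]   # reply text -> path of the reply audio
    send: Callable[[str, str], None]   # (open_id, audio path) -> None

    def run(self, open_id: str, text: Optional[str] = None,
            audio_path: Optional[str] = None) -> str:
        """Run one turn of the loop and return the reply text."""
        if text is None:
            if audio_path is None:
                raise ValueError("provide either text or audio_path")
            text = self.transcribe(audio_path)  # voice input becomes text first
        reply_text = self.reply(text)
        self.send(open_id, self.synthesize(reply_text))
        return reply_text
```

In practice `transcribe` and `synthesize`/`send` would wrap the two scripts, while `reply` is whatever produces the answer text.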

Default output style

Default preset is stored in references/presets.md.

Unless the user asks otherwise, use:

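- model: gpt-4o-mini-tts
- voice: alloy
- default style: a young, Japanese-style male voice, gentle with a slightly flirtatious edge, intimate like a private whisper in the ear, natural, not announcer-like (original preset text: 年轻日系男声感、温柔里带一点撩、贴耳边私聊感、自然、不播音腔)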

When the user asks for a different flavor, either:

- pass a custom --instructions
- or adapt one of the presets in references/presets.md
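
The choice between the default style, a named preset, and custom instructions can be sketched as a simple precedence rule. This is an assumption about how the options might combine, not the script's documented behavior: custom instructions win, then a named preset, then the default. The `presets` dictionary stands in for whatever references/presets.md contains, and the `instructions` field is assumed to be how `--instructions` reaches the OpenAI speech request.

```python
DEFAULT_MODEL = "gpt-4o-mini-tts"
DEFAULT_VOICE = "alloy"
# Default style text as stored in the preset (young Japanese-style male voice,
# gentle, slightly flirtatious, intimate, natural, not announcer-like).
DEFAULT_INSTRUCTIONS = "年轻日系男声感、温柔里带一点撩、贴耳边私聊感、自然、不播音腔"


def resolve_instructions(custom=None, preset=None, presets=None):
    """Pick the style text: custom instructions, then a named preset, then the default."""
    if custom:
        return custom
    if preset is not None:
        return (presets or {})[preset]  # raises KeyError for an unknown preset name
    return DEFAULT_INSTRUCTIONS


def speech_request(text, custom=None, preset=None, presets=None,
                   model=DEFAULT_MODEL, voice=DEFAULT_VOICE):
    """Build a speech request body that carries the resolved style."""
    return {
        "model": model,
        "voice": voice,
        "input": text,
        "instructions": resolve_instructions(custom, preset, presets),
    }
```

For example, `speech_request("hello", custom="calm, slow, reassuring")` keeps the default model and voice but replaces the style text.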

Handle failures

Common failure cases:

- missing OPENAI_API_KEY → ask the user for an API key or environment setup
- HTTP 429 from OpenAI → billing or quota issue
- missing Feishu app credentials → configure channels.feishu.appId/appSecret
- missing ffmpeg or ffprobe → install locally before retrying
- missing transcription model config → configure tools.media.audio.models

When OpenAI billing is not enabled, say so directly instead of pretending the voice was generated.
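
The failure cases above map naturally onto a small lookup that produces an honest status message. The sketch below is a hypothetical illustration (the `Failure` enum, `classify_tts_attempt`, and `summarize` are not part of the skill); it only claims success when no failure was detected.

```python
from enum import Enum
from typing import Optional


class Failure(Enum):
    MISSING_API_KEY = "OPENAI_API_KEY is not set; provide a key or set the environment variable."
    QUOTA = "OpenAI returned HTTP 429; check billing and quota."
    MISSING_FEISHU_CREDENTIALS = "Feishu credentials are missing; configure channels.feishu.appId/appSecret."
    MISSING_FFMPEG = "ffmpeg or ffprobe is missing; install both locally, then retry."
    MISSING_TRANSCRIPTION_CONFIG = "No transcription model is configured; set tools.media.audio.models."


def classify_tts_attempt(api_key: Optional[str],
                         status_code: Optional[int]) -> Optional[Failure]:
    """Map the outcome of a TTS call to a Failure, or None when nothing matched."""
    if not api_key:
        return Failure.MISSING_API_KEY
    if status_code == 429:
        return Failure.QUOTA
    return None


def summarize(failure: Optional[Failure]) -> str:
    """Report a failure plainly; claim success only when nothing failed."""
    if failure is None:
        return "Voice message generated and sent."
    return f"Voice message was NOT generated: {failure.value}"
```

The remaining enum members would be set by the corresponding prerequisite checks rather than by the HTTP status.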

Packaging and sharing

Package with:

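```bash
python3 /Users/zoepeng/.openclaw/lib/node_modules/openclaw/skills/skill-creator/scripts/package_skill.py \
  /Users/zoepeng/.openclaw/workspace/skills/openai-feishu-voice
```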

The resulting .skill file can be shared or uploaded wherever the user distributes skills.

Resources

scripts/openaittsfeishu.py

Use for deterministic TTS generation and Feishu delivery.

scripts/transcribe_audio.py

Use for deterministic local audio transcription via the configured Whisper CLI.

references/presets.md

Read when the user asks for a different voice direction or wants named presets.

references/input-output-workflow.md

Read when packaging or explaining the complete voice-in / voice-out solution.

