Deepgram Voice Workflow

Overview

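Use this skill to run a complete voice workflow:

1. Transcribe audio to text with Deepgram STT
2. Optionally synthesize a spoken reply with Deepgram TTS
3. Return structured output that can be fed into a chat or agent pipeline

This skill is the right choice when the task goes beyond plain transcription and needs an audio-in to audio-out pipeline.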

Quick Start

Transcribe only

bash
{baseDir}/scripts/deepgram-transcribe.sh /path/to/audio.ogg

Generate speech from text

bash
{baseDir}/scripts/deepgram-tts.sh "你好，我是 Neko。"

Run the full pipeline

bash
{baseDir}/scripts/neko-voice-pipeline.sh /path/to/audio.ogg --reply "收到啦，这是语音回复测试。"

Environment

Set DEEPGRAM_API_KEY before use.

If the variable is not set, the bundled scripts fall back to reading it from:

- /root/.openclaw/.env
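
The lookup order described above might be sketched as follows. This is an illustration, not the bundled scripts' actual code: the helper name `load_deepgram_key` is hypothetical, the variable is assumed to be spelled `DEEPGRAM_API_KEY`, and the `.env` file is assumed to hold plain `KEY=value` lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve the Deepgram API key.
# Assumption: the .env file contains a line such as DEEPGRAM_API_KEY=xxxx.
load_deepgram_key() {
  local env_file="${1:-/root/.openclaw/.env}"
  local line value

  # 1. Prefer a value that is already set in the environment.
  if [ -n "${DEEPGRAM_API_KEY:-}" ]; then
    printf '%s\n' "$DEEPGRAM_API_KEY"
    return 0
  fi

  # 2. Otherwise fall back to the .env file, if it is readable.
  if [ -r "$env_file" ]; then
    line="$(grep -E '^(export[[:space:]]+)?DEEPGRAM_API_KEY=' "$env_file" | tail -n 1)"
    value=${line#*=}
    # Strip one pair of optional surrounding quotes.
    value=${value%\"}; value=${value#\"}
    value=${value%\'}; value=${value#\'}
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  fi

  echo "DEEPGRAM_API_KEY is not set and no fallback value was found" >&2
  return 1
}
```

A caller could then run `DEEPGRAM_API_KEY="$(load_deepgram_key)" || exit 1` before making any request.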

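Workflow Decision

Use deepgram-transcribe.sh when

- Only a text transcript is needed
- A downstream system will generate the reply itself
- The task is speech-to-text only

Use deepgram-tts.sh when

- The text already exists
- Only an MP3 voice reply is needed
- The workflow is text-to-speech only

Use neko-voice-pipeline.sh when

- The task starts from an audio file
- A transcript is needed
- An optional voice reply should be generated in the same flow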
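
As a rough illustration, the rules above can be written as a small dispatcher. The function `choose_script` and its two arguments are hypothetical, and the sketch simplifies one point: it routes audio input to the pipeline only when a spoken reply is wanted, although the pipeline also treats the reply as optional.

```shell
#!/usr/bin/env bash
# Hypothetical dispatcher mirroring the decision rules.
# $1 = input kind ("audio" or "text"), $2 = "yes" if a spoken reply is wanted.
choose_script() {
  local input_kind="$1" want_reply="${2:-no}"
  case "$input_kind" in
    text)
      # The text already exists, so only synthesis is needed.
      echo "deepgram-tts.sh"
      ;;
    audio)
      if [ "$want_reply" = "yes" ]; then
        # Audio in, transcript plus spoken reply out.
        echo "neko-voice-pipeline.sh"
      else
        # Audio in, transcript only.
        echo "deepgram-transcribe.sh"
      fi
      ;;
    *)
      echo "unknown input kind: $input_kind" >&2
      return 1
      ;;
  esac
}
```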

Outputs

STT output

deepgram-transcribe.sh writes:

- A transcript text file
- The raw API JSON file (in the same directory)
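
For orientation, a request of this kind can be made against Deepgram's pre-recorded `/v1/listen` endpoint with curl, and the transcript extracted with jq. The sketch below shows what a script like deepgram-transcribe.sh might do, not its actual contents: the function names and the output file names `response.json` and `transcript.txt` are assumptions, as is the `audio/ogg` content type (chosen to match the `.ogg` example).

```shell
#!/usr/bin/env bash
# Build the request URL for Deepgram's pre-recorded /v1/listen endpoint.
listen_url() {
  local model="${1:-nova-2}" language="${2:-zh}"
  printf 'https://api.deepgram.com/v1/listen?model=%s&language=%s\n' "$model" "$language"
}

# Hypothetical wrapper: transcribe one file and keep the raw JSON next to the text.
# The output file names are illustrative, not those of the bundled script.
transcribe_file() {
  local audio="$1" out_dir="${2:-.}"
  mkdir -p "$out_dir"
  curl -sS -X POST "$(listen_url)" \
    -H "Authorization: Token ${DEEPGRAM_API_KEY:?set DEEPGRAM_API_KEY first}" \
    -H "Content-Type: audio/ogg" \
    --data-binary @"$audio" \
    -o "$out_dir/response.json" || return 1
  # The transcript sits under the first channel's first alternative.
  jq -r '.results.channels[0].alternatives[0].transcript' \
    "$out_dir/response.json" > "$out_dir/transcript.txt"
}
```

Calling `transcribe_file /path/to/audio.ogg ./out` would then leave both files in `./out`.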

TTS output

deepgram-tts.sh writes:

- An MP3 output file
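
A comparable sketch for synthesis uses Deepgram's `/v1/speak` endpoint. Again, this is an assumption about how such a script could work rather than the code of deepgram-tts.sh; the function names and the default file name `reply.mp3` are hypothetical.

```shell
#!/usr/bin/env bash
# Build the request URL for Deepgram's /v1/speak endpoint, asking for MP3 audio.
speak_url() {
  local model="${1:-aura-2-luna-en}"
  printf 'https://api.deepgram.com/v1/speak?model=%s&encoding=mp3\n' "$model"
}

# Hypothetical wrapper: synthesize text into an MP3 file.
synthesize() {
  local text="$1" out_file="${2:-reply.mp3}" model="${3:-aura-2-luna-en}"
  # jq builds the JSON body so that quotes and newlines in the text are escaped.
  jq -n --arg text "$text" '{text: $text}' |
    curl -sS -X POST "$(speak_url "$model")" \
      -H "Authorization: Token ${DEEPGRAM_API_KEY:?set DEEPGRAM_API_KEY first}" \
      -H "Content-Type: application/json" \
      --data-binary @- \
      -o "$out_file"
}
```

Passing a third argument to `synthesize` selects a different voice model.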

Pipeline output

neko-voice-pipeline.sh prints JSON containing:

- outdir
- transcriptpath
- transcript
- replyaudiopath

This makes it easy to plug into scripts or adapters.
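
One way an adapter might read that JSON is with jq. The helper `pipeline_field` below is hypothetical, and the key names are taken as written in this document; whether a missing reply is reported as `null` or omitted is not specified here, so the helper treats both the same way.

```shell
#!/usr/bin/env bash
# Hypothetical consumer of the pipeline's JSON output (requires jq).
# Reads one JSON document on stdin and prints the value of the given key.
pipeline_field() {
  local key="$1"
  # `// empty` prints nothing when the key is missing or null.
  jq -r --arg key "$key" '.[$key] // empty'
}

# Example use, with the script path left as the document's placeholder:
#   result="$("{baseDir}/scripts/neko-voice-pipeline.sh" /path/to/audio.ogg)"
#   transcript="$(printf '%s' "$result" | pipeline_field transcript)"
```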

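Typical Uses

Prefer this skill for:

- Transcribing Telegram/QQ/OneBot voice messages
- Generating MP3 replies for short voice prompts
- Building bot-side voice input/output automation
- Testing a voice pipeline from the shell without pulling in a full SDK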

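Notes

- The defaults are tuned for lightweight practical use, not maximum configurability.
- deepgram-transcribe.sh defaults to model=nova-2 and language=zh.
- deepgram-tts.sh defaults to model=aura-2-luna-en; override the model if you need a different voice.
- When debugging recognition quality or API errors, inspect the raw JSON transcription response.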
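
When inspecting the raw JSON, a quick jq summary can help. This sketch assumes that error bodies carry `err_code` and `err_msg` fields and that successful responses expose a confidence score on the first alternative; the helper name `inspect_response` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize a saved Deepgram response (requires jq).
# Prints the API error if one is present, otherwise the top alternative's confidence.
inspect_response() {
  local json_file="$1"
  jq -r '
    if has("err_code") then
      "error: \(.err_code): \(.err_msg)"
    else
      "confidence: \(.results.channels[0].alternatives[0].confidence)"
    end
  ' "$json_file"
}
```

A low confidence value usually points to a recognition problem such as noise or a wrong language setting, while an error line points to the request itself.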

References

Read these files when needed:

- references/stt-notes.md for transcription details
- references/tts-notes.md for speech synthesis details
- references/pipeline-notes.md for end-to-end pipeline behavior
