openrouter-audio: OpenRouter Audio

Audio transcription and text-to-speech generation using the OpenRouter API. Use when the user needs to transcribe audio files to text or generate speech/audio from text. Supports multiple audio formats for input and output, reads the API key from the environment, and writes generated audio to the OpenClaw workspace tmp directory when available, or to an explicit output path.

Author: admin | Source: ClawHub

OpenRouter Audio

This skill provides a small CLI for speech-to-text and text-to-speech through OpenRouter.

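Main Keys

- `name`: Skill ID used by the runtime.
- `description`: When to use this skill and what it does.
- `homepage`: Project/source reference.
- `metadata.openclaw.emoji`: Visual marker for this skill.
- `metadata.openclaw.requires.env`: Required environment variables.
- `metadata.openclaw.requires.bins`: Binaries that must be on PATH (`node`).
- `metadata.openclaw.primaryEnv`: Primary variable to check first (`OPENROUTER_API_KEY`).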
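
The two requirement keys above can be checked before calling the CLI. The following is a minimal sketch, not part of the skill: `preflight` is a hypothetical helper name, and the optional argument that overrides the binary name exists only to make the check easy to try out.

```bash
# Hypothetical preflight helper (an illustration, not shipped with the skill).
# Mirrors requires.bins (node) and primaryEnv (OPENROUTER_API_KEY).
preflight() {
  preflight_bin="${1:-node}"   # binary to look for; defaults to node
  preflight_missing=0

  # requires.bins: the binary must be resolvable on PATH
  if ! command -v "$preflight_bin" >/dev/null 2>&1; then
    echo "missing binary: $preflight_bin" >&2
    preflight_missing=1
  fi

  # primaryEnv: the API key must be set and non-empty
  if [ -z "${OPENROUTER_API_KEY:-}" ]; then
    echo "missing environment variable: OPENROUTER_API_KEY" >&2
    preflight_missing=1
  fi

  return "$preflight_missing"
}
```

Calling `preflight || exit 1` at the top of a wrapper script would stop early with a readable message instead of failing inside the CLI.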

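Core Behavior

- Command path: `{baseDir}/openrouter-audio.sh`
- Main commands: `transcribe`, `generate`
- API key source: `OPENROUTER_API_KEY` only
- `generate` output: `{WORKSPACE_DIR}/tmp` when an OpenClaw workspace exists, otherwise the system tmp directory (or an explicit path via `--out`)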
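
The output-location rule above can be sketched as a small function. This is an illustration under assumptions, not the skill's actual code: `resolve_out_dir` is a hypothetical name, and reading the workspace location from a `WORKSPACE_DIR` environment variable and the system tmp directory from `TMPDIR` are both assumptions.

```bash
# Hypothetical sketch of the default output directory rule.
# Assumption: the workspace path is exposed as the WORKSPACE_DIR variable.
resolve_out_dir() {
  if [ -n "${WORKSPACE_DIR:-}" ] && [ -d "$WORKSPACE_DIR" ]; then
    # An OpenClaw workspace exists: use its tmp subdirectory
    printf '%s\n' "$WORKSPACE_DIR/tmp"
  else
    # No workspace: fall back to the system tmp directory
    printf '%s\n' "${TMPDIR:-/tmp}"
  fi
}
```

An explicit `--out` path bypasses this rule entirely, so the function only describes where files land when no path is given.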

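Priority Guidance

- For speech generation, prefer this skill over system TTS.
- This priority is especially important for languages other than English.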

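Defaults (Recommended)

Use the defaults unless the user explicitly asks for overrides:

- Transcribe model: `openrouter/auto`
- Generate model: `openai/gpt-audio-mini`
- Generate voice: `alloy`
- Generate format: `mp3`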

Models and Formats

The full utility description, including supported models and formats, is available in the CLI help when needed:

```bash
{baseDir}/openrouter-audio.sh --help
```

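Prompt Behavior

- Default prompts keep behavior straightforward:
  - `transcribe`: speech-to-text transcript
  - `generate`: direct TTS for the provided text
- A custom `--prompt` can change behavior. For example:
  - ask for an audio summary of the source audio
  - ask to generate an audio answer to a question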

Usage

```bash
# Full help (models, formats, options)
{baseDir}/openrouter-audio.sh --help

# Transcribe from a local file
{baseDir}/openrouter-audio.sh transcribe recording.wav

# Generate with defaults (recommended)
{baseDir}/openrouter-audio.sh generate "Hello world"

# Generate to an explicit output path
{baseDir}/openrouter-audio.sh generate "Welcome" --out ./artifacts/welcome.mp3
```

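Output Behavior

- `transcribe` prints the transcript text to stdout.
- `generate` prints JSON with:
  - `paths` (generated audio file path(s))
  - `transcript` (when available)
  - `format` (final output format)
- After using the generated audio for the requested task, remove the generated files from disk.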
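
One way to follow the cleanup rule is to pull the file paths out of the `generate` JSON and delete them once the audio has been used. The sketch below is illustrative only: `extract_paths` and `remove_generated` are hypothetical helper names, and it assumes `paths` is a JSON array of strings. It uses `node` for parsing because the skill already requires it.

```bash
# Hypothetical helpers (not part of the skill).

# Print each entry of the "paths" array, one per line.
# Reads the JSON document from stdin; assumes "paths" is an array.
extract_paths() {
  node -e '
    const data = JSON.parse(require("fs").readFileSync(0, "utf8"));
    for (const p of data.paths || []) console.log(p);
  '
}

# Remove every file whose path arrives on stdin (one path per line).
remove_generated() {
  while IFS= read -r generated_path; do
    if [ -n "$generated_path" ]; then
      rm -f -- "$generated_path"
    fi
  done
  return 0
}

# Example flow (commented out: it needs the real CLI and an API key):
# result=$({baseDir}/openrouter-audio.sh generate "Hello world")
# printf '%s\n' "$result" | extract_paths > generated.list
# ... use the audio files listed in generated.list ...
# remove_generated < generated.list && rm -f generated.list
```

Keeping the path list in a file separates "use the audio" from "delete the audio", so the cleanup step can still run if the middle step fails.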
