SenseAudio Language Tutor
Create interactive language-learning audio with official SenseAudio TTS endpoints and parameters.
What This Skill Does
- Generate pronunciation examples in supported voices
- Create bilingual vocabulary and sentence practice audio
- Produce slowed-speed listening drills for learners
- Build short dialogue exercises with repetition pauses
- Export lesson audio files and companion study notes
Credential and Dependency Rules
- Read the API key from `SENSEAUDIO_API_KEY`.
- Send auth only as `Authorization: Bearer <API_KEY>`.
- Do not place API keys in query parameters, logs, or saved examples.
- If Python helpers are used, this skill expects `python3`, `requests`, and `pydub`.
- `pydub` may also require a local audio backend such as `ffmpeg`; if it is unavailable, write individual audio files instead of merging them.
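The credential rules above can be sketched as a small helper. This is a minimal illustration, not part of the official SDK: the function names `build_auth_headers` and `redact` are hypothetical, and the optional `env` argument exists only so the helper can be exercised without touching the real environment.

```python
import os


def build_auth_headers(env=None):
    """Build request headers from SENSEAUDIO_API_KEY (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("SENSEAUDIO_API_KEY", "").strip()
    if not api_key:
        # Fail early; the message never echoes any secret value.
        raise RuntimeError("SENSEAUDIO_API_KEY is not set")
    return {
        # The key travels only in the Authorization header, never in the URL.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def redact(headers):
    """Return a copy of the headers that is safe to print or log."""
    safe = dict(headers)
    if "Authorization" in safe:
        safe["Authorization"] = "Bearer ***"
    return safe
```

Logging `redact(headers)` rather than `headers` keeps the key out of logs and saved examples.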
Official TTS Constraints
Use the official SenseAudio TTS rules summarized below:
- HTTP endpoint: `POST https://api.senseaudio.cn/v1/t2av2`
- Model: `SenseAudio-TTS-1.0`
- Max text length: `10000` characters
- `voice_setting.voice_id` is required
- `voice_setting.speed` range: `0.5-2.0`
- Optional audio format values: `mp3`, `wav`, `pcm`, `flac`
- Optional sample rates: `8000`, `16000`, `22050`, `24000`, `32000`, `44100`
- Optional MP3 bitrates: `32000`, `64000`, `128000`, `256000`
- Optional channels: `1` or `2`
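A client-side check can catch constraint violations before a request is sent. The sketch below encodes the limits listed above; `validate_request` is a hypothetical name, the `audio_setting` keys (`format`, `sample_rate`, `bitrate`, `channel`) follow the helper shown in this document, and counting the text limit with Python's `len()` is an assumption about how the service counts characters.

```python
MAX_TEXT_CHARS = 10000
SPEED_RANGE = (0.5, 2.0)
AUDIO_FORMATS = {"mp3", "wav", "pcm", "flac"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100}
MP3_BITRATES = {32000, 64000, 128000, 256000}
CHANNELS = {1, 2}


def validate_request(text, voice_id, speed=1.0, audio_setting=None):
    """Return a list of problems; an empty list means no violation was found."""
    problems = []
    if not text:
        problems.append("text is empty")
    elif len(text) > MAX_TEXT_CHARS:
        # Assumption: the service counts characters the way len() does.
        problems.append(f"text has {len(text)} characters (max {MAX_TEXT_CHARS})")
    if not voice_id:
        problems.append("voice_setting.voice_id is required")
    if not SPEED_RANGE[0] <= speed <= SPEED_RANGE[1]:
        problems.append(f"speed {speed} is outside {SPEED_RANGE[0]}-{SPEED_RANGE[1]}")

    # Every audio_setting field is optional; check only the ones provided.
    allowed = {
        "format": AUDIO_FORMATS,
        "sample_rate": SAMPLE_RATES,
        "bitrate": MP3_BITRATES,
        "channel": CHANNELS,
    }
    for key, value in (audio_setting or {}).items():
        if key in allowed and value not in allowed[key]:
            problems.append(f"audio_setting.{key}={value!r} is not supported")
    return problems
```

Returning a list instead of raising on the first problem lets a lesson builder report every issue in a chunk at once.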
Recommended Workflow
1. Prepare lesson content:
   - Split vocabulary, example sentences, and dialogues into short chunks.
   - Keep each API call comfortably below the `10000` character limit.
2. Build minimal TTS requests:
   - Send `model`, `text`, `stream`, and `voice_setting.voice_id`.
   - Add `speed`, `pitch`, `vol`, and `audio_setting` only when needed.
3. Decode and save audio safely:
   - HTTP responses return hex-encoded audio in `data.audio`; decode before saving.
   - Keep filenames deterministic and avoid exposing secrets in paths or logs.
4. Compose lessons carefully:
   - If `pydub` and an audio backend are available, merge clips and insert silence.
   - Otherwise, emit per-word or per-sentence clips and a manifest/Markdown study guide.
5. Handle failures and traceability:
   - Check HTTP status and provider error payloads before decoding audio.
   - Record `trace_id` only for troubleshooting and avoid showing it unless needed.
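Steps 1, 3, and 5 can be sketched with the standard library alone. The helper names, the `max_chars=2000` default, and the `clip_NNN_<hash>` filename scheme are illustrative assumptions. The error handling covers only a missing `data.audio` field, because the provider's error payload shape is not described here. The splitter breaks on whitespace after `.`, `!`, or `?`, so it suits space-delimited languages.

```python
import binascii
import hashlib
import re


def split_into_chunks(text, max_chars=2000):
    """Group sentences into chunks of at most max_chars characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in filter(None, sentences):
        # Hard-split any single sentence that exceeds the limit on its own.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def decode_audio(response_json):
    """Return (audio_bytes, trace_id); raise ValueError if audio is missing."""
    trace_id = response_json.get("trace_id")
    audio_hex = (response_json.get("data") or {}).get("audio")
    if not audio_hex:
        # trace_id appears only in this troubleshooting message.
        raise ValueError(f"no audio in response (trace_id={trace_id})")
    return binascii.unhexlify(audio_hex), trace_id


def clip_filename(index, text, ext="mp3"):
    """Derive a stable filename from the clip position and its text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]
    return f"clip_{index:03d}_{digest}.{ext}"
```

Because the filename depends only on the index and the text, regenerating a clip overwrites the earlier file instead of creating a stray copy.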
Minimal Helper
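```python
import binascii
import os

import requests

# Read the key from the environment; never hardcode it.
API_KEY = os.environ["SENSEAUDIO_API_KEY"]
API_URL = "https://api.senseaudio.cn/v1/t2av2"


def generate_tts(text, voice_id="male0004a", speed=1.0, stream=False):
    """Request speech for `text` and return (audio_bytes, trace_id)."""
    payload = {
        "model": "SenseAudio-TTS-1.0",
        "text": text,
        "stream": stream,
        "voice_setting": {
            "voice_id": voice_id,
            "speed": speed,
        },
        "audio_setting": {
            "format": "mp3",
            "sample_rate": 32000,
            "bitrate": 128000,
            "channel": 2,
        },
    }
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json=payload,
        timeout=60,
    )
    # Stop on HTTP errors before touching the body.
    response.raise_for_status()
    data = response.json()
    # The HTTP response carries hex-encoded audio in data.audio.
    audio_hex = data["data"]["audio"]
    return binascii.unhexlify(audio_hex), data.get("trace_id")
```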
Patterns
Vocabulary Drill
- Generate one clip for the target word
- Generate one clip for an example sentence
- Optionally generate a slower clip at `speed=0.8`
- Save clips separately or merge with pauses
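The drill above maps to three request payloads. The sketch below only builds the payloads and does not call the API; `build_payload` and `vocabulary_drill` are hypothetical names, and the voice ID is supplied by the caller. `speed` is included only when set, in line with keeping requests minimal.

```python
def build_payload(text, voice_id, speed=None):
    """Build a minimal TTS payload; speed is added only when requested."""
    voice_setting = {"voice_id": voice_id}
    if speed is not None:
        voice_setting["speed"] = speed
    return {
        "model": "SenseAudio-TTS-1.0",
        "text": text,
        "stream": False,
        "voice_setting": voice_setting,
    }


def vocabulary_drill(word, sentence, voice_id, slow_speed=0.8):
    """Return (label, payload) pairs for one vocabulary item."""
    return [
        ("word", build_payload(word, voice_id)),
        ("sentence", build_payload(sentence, voice_id)),
        # Slower repeat of the sentence for listening practice.
        ("sentence_slow", build_payload(sentence, voice_id, speed=slow_speed)),
    ]
```

Each pair can then be sent as its own request and saved under its label, which keeps a failed clip cheap to regenerate.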
Bilingual Lesson
- Alternate source phrase and translated phrase
- Use short pauses (`1000-2000ms`) between clips
- Consider different `voice_id` values for source and translation when helpful
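When `pydub` or `ffmpeg` is not available, clips requested in `wav` format can still be joined with pauses using Python's standard `wave` module. This swaps in a different technique from the `pydub` merge this skill otherwise describes. It assumes the service returns a standard RIFF/WAV container for `format: wav`, that all clips share the same channel count, sample width, and sample rate, and that samples are 16-bit (zero bytes are silence only for signed PCM). `merge_wav_clips` is a hypothetical name.

```python
import io
import wave


def merge_wav_clips(clips, pause_ms=1500):
    """Concatenate WAV byte strings with pause_ms of silence between them."""
    if not clips:
        raise ValueError("at least one clip is required")
    out = io.BytesIO()
    params = None
    with wave.open(out, "wb") as writer:
        for index, clip in enumerate(clips):
            with wave.open(io.BytesIO(clip), "rb") as reader:
                current = (
                    reader.getnchannels(),
                    reader.getsampwidth(),
                    reader.getframerate(),
                )
                if params is None:
                    # The first clip defines the output format.
                    params = current
                    writer.setnchannels(current[0])
                    writer.setsampwidth(current[1])
                    writer.setframerate(current[2])
                elif current != params:
                    raise ValueError("clips must share channels, width, and rate")
                frames = reader.readframes(reader.getnframes())
            if index > 0:
                # Silence = zero-valued samples for the requested duration.
                silence_frames = params[2] * pause_ms // 1000
                writer.writeframes(b"\x00" * (silence_frames * params[0] * params[1]))
            writer.writeframes(frames)
    return out.getvalue()
```

For a bilingual lesson, pass the clips in alternating order (source, translation, source, ...) with `pause_ms` between 1000 and 2000.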
Dialogue Practice
- Create one clip per line of dialogue
- Insert repetition pauses after each line
- Prefer shorter turns for easier debugging and regeneration
Output Options
- Individual MP3 clips for words, sentences, or dialogue turns
- Merged lesson audio if local audio tooling is available
- Markdown study guide with transcript, translation, and file manifest
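The study guide can be produced as a single Markdown table that doubles as the file manifest. The entry keys (`text`, `translation`, `file`) and the function name `render_study_guide` are assumptions chosen for this sketch.

```python
def render_study_guide(title, entries):
    """Render a Markdown guide listing transcript, translation, and clip file.

    entries: list of dicts with 'text' and 'file', plus optional 'translation'.
    """
    lines = [
        f"# {title}",
        "",
        "| # | Text | Translation | File |",
        "|---|---|---|---|",
    ]
    for number, entry in enumerate(entries, start=1):
        cells = [
            str(number),
            entry["text"],
            entry.get("translation", ""),
            f"`{entry['file']}`",
        ]
        # Escape pipes so lesson text cannot break the table layout.
        row = " | ".join(cell.replace("|", "\\|") for cell in cells)
        lines.append(f"| {row} |")
    return "\n".join(lines) + "\n"
```

Writing the result next to the clips, for example as `lesson.md`, gives learners the transcript and the clip order in one place.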
Safety Notes
- Do not hardcode credentials.
- Do not claim unsupported language-selection parameters for TTS unless the official docs add them.
- Avoid assuming raw bytes can be passed directly to `pydub.AudioSegment`; decode and load through a supported container format.
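SenseAudio Language Tutor
Create interactive language-learning audio with official SenseAudio TTS endpoints and parameters.
What This Skill Does
- Generate pronunciation examples in supported voices
- Create bilingual vocabulary and sentence practice audio
- Produce slowed-speed listening drills for learners
- Build short dialogue exercises with repetition pauses
- Export lesson audio files and companion study notes
Credential and Dependency Rules
- Read the API key from `SENSEAUDIO_API_KEY`
- Send auth only as `Authorization: Bearer <API_KEY>`
- Do not place API keys in query parameters, logs, or saved examples
- If Python helpers are used, this skill expects `python3`, `requests`, and `pydub`
- `pydub` may also require a local audio backend such as `ffmpeg`; if it is unavailable, write individual audio files instead of merging them
Official TTS Constraints
Use the official SenseAudio TTS rules summarized below:
- HTTP endpoint: `POST https://api.senseaudio.cn/v1/t2av2`
- Model: `SenseAudio-TTS-1.0`
- Max text length: `10000` characters
- `voice_setting.voice_id` is required
- `voice_setting.speed` range: `0.5-2.0`
- Optional audio format values: `mp3`, `wav`, `pcm`, `flac`
- Optional sample rates: `8000`, `16000`, `22050`, `24000`, `32000`, `44100`
- Optional MP3 bitrates: `32000`, `64000`, `128000`, `256000`
- Optional channels: `1` or `2`
Recommended Workflow
1. Prepare lesson content:
   - Split vocabulary, example sentences, and dialogues into short chunks
   - Keep each API call comfortably below the `10000` character limit
2. Build minimal TTS requests:
   - Send `model`, `text`, `stream`, and `voice_setting.voice_id`
   - Add `speed`, `pitch`, `vol`, and `audio_setting` only when needed
3. Decode and save audio safely:
   - HTTP responses return hex-encoded audio in `data.audio`; decode before saving
   - Keep filenames deterministic and avoid exposing secrets in paths or logs
4. Compose lessons carefully:
   - If `pydub` and an audio backend are available, merge clips and insert silence
   - Otherwise, emit per-word or per-sentence clips and a manifest/Markdown study guide
5. Handle failures and traceability:
   - Check HTTP status and provider error payloads before decoding audio
   - Record `trace_id` only for troubleshooting and avoid showing it unless needed
Minimal Helper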
```python
import binascii
import os

import requests

# Read the key from the environment; never hardcode it.
API_KEY = os.environ["SENSEAUDIO_API_KEY"]
API_URL = "https://api.senseaudio.cn/v1/t2av2"


def generate_tts(text, voice_id="male0004a", speed=1.0, stream=False):
    """Request speech for `text` and return (audio_bytes, trace_id)."""
    payload = {
        "model": "SenseAudio-TTS-1.0",
        "text": text,
        "stream": stream,
        "voice_setting": {
            "voice_id": voice_id,
            "speed": speed,
        },
        "audio_setting": {
            "format": "mp3",
            "sample_rate": 32000,
            "bitrate": 128000,
            "channel": 2,
        },
    }
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json=payload,
        timeout=60,
    )
    # Stop on HTTP errors before touching the body.
    response.raise_for_status()
    data = response.json()
    # The HTTP response carries hex-encoded audio in data.audio.
    audio_hex = data["data"]["audio"]
    return binascii.unhexlify(audio_hex), data.get("trace_id")
```
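Patterns
Vocabulary Drill
- Generate one clip for the target word
- Generate one clip for an example sentence
- Optionally generate a slower clip at `speed=0.8`
- Save clips separately or merge with pauses
Bilingual Lesson
- Alternate source phrase and translated phrase
- Use short pauses (`1000-2000ms`) between clips
- Consider different `voice_id` values for source and translation when helpful
Dialogue Practice
- Create one clip per line of dialogue
- Insert repetition pauses after each line
- Prefer shorter turns for easier debugging and regeneration
Output Options
- Individual MP3 clips for words, sentences, or dialogue turns
- Merged lesson audio if local audio tooling is available
- Markdown study guide with transcript, translation, and file manifest
Safety Notes
- Do not hardcode credentials
- Do not claim unsupported language-selection parameters for TTS unless the official docs add them
- Avoid assuming raw bytes can be passed directly to `pydub.AudioSegment`; decode and load through a supported container format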