Tencent TTS Podcast Generator

Convert text content to podcast audio files using Tencent Cloud TTS service.

Capabilities

What This Skill Can Do

- Short & Long Text Compatible: Detects text length automatically; short text is sent in a single request, long text is chunked automatically
- Long Text to Speech: Generates podcasts up to about 30 minutes long (~7200 characters)
- Concurrent Processing: Long texts are split automatically and the chunks are processed in parallel for faster generation
- 26 Voices: Supports basic, featured, customer service, and Tencent featured voices
- Smart Chunking: Splits text at semantic boundaries (paragraph/sentence) for natural audio flow
- Duration Estimation: Estimates the duration of the generated audio automatically
- Auto Retry: Retries failed requests automatically to improve the success rate
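
The "30 minutes, ~7200 characters" figure implies a speaking rate of roughly 240 characters per minute. As a minimal sketch of how duration estimation could work under that assumption (the function name `estimate_duration_seconds` and the constant rate are illustrative, not taken from the skill's source):

```python
# Assumption: a constant rate derived from "30 minutes ~ 7200 characters".
CHARS_PER_MINUTE = 240


def estimate_duration_seconds(text: str, chars_per_minute: int = CHARS_PER_MINUTE) -> float:
    """Return a rough audio duration estimate based on character count alone."""
    char_count = len(text.strip())
    return char_count / chars_per_minute * 60.0
```

The real duration depends on the chosen voice, punctuation, and speech speed, so a value like this is only useful as a rough preview.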

Short & Long Text Processing Strategy

Note: Tencent Cloud TTS single request limit is ~150 characters. Texts exceeding this will be auto-chunked.

| Text Type | Length Range | Processing Method | Concurrency | Timeout |
| --- | --- | --- | --- | --- |
| Ultra Short | ≤50 chars | Direct request | 1 | 30s |
| Short | 50-150 chars | Direct request | 1 | 30s |
| Medium | 150-500 chars | Auto-chunk (2-4 chunks) | 2-3 | 60s |
| Long | 500-2000 chars | Auto-chunk (4-14 chunks) | 3-5 | 60s |
| Extra Long | 2000-7200 chars | Auto-chunk (14-50 chunks) | 3-5 | 60s |
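
The table can be read as a lookup from text length to processing settings. The sketch below is one possible encoding of it, not the skill's actual code: `Strategy` and `pick_strategy` are hypothetical names, and where the table gives a concurrency range the upper bound is chosen arbitrarily.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    label: str
    chunked: bool      # True when the text must be split before synthesis
    max_workers: int   # concurrent requests
    timeout: int       # per-request timeout in seconds


def pick_strategy(text: str) -> Strategy:
    """Map text length to the settings listed in the strategy table."""
    n = len(text)
    if n <= 50:
        return Strategy("ultra_short", False, 1, 30)
    if n <= 150:
        return Strategy("short", False, 1, 30)
    if n <= 500:
        return Strategy("medium", True, 3, 60)
    if n <= 2000:
        return Strategy("long", True, 5, 60)
    # The documented upper bound (~7200 characters) is not enforced here.
    return Strategy("extra_long", True, 5, 60)
```

For example, `pick_strategy("你好")` yields the single-request settings, while any text over 150 characters is marked as chunked.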

What This Skill Does NOT Do

- Does not generate mp3 format (wav only)
- Does not support background music or sound effects
- Does not auto-generate podcast scripts (the user must provide the script)
- Does not support dual-speaker dialogue mode (single voice only)

File Structure

This Skill consists of the following files:

- tts_podcast.py: Main entry script
  - Tencent Cloud TTS signature generation
  - Audio file generation
  - COS upload functionality
- tts_tool.py: AgentScope tool interface wrapper
- SKILL.md: This file, describing Skill capabilities, boundaries, and usage conventions
- requirements.txt: Python dependency configuration

Input & Output Specifications

Input Parameters

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| Text | Text content to convert | Yes | - |
| VoiceType | Voice ID (see the voice table below; provide either this or VoiceName) | No | 502006 |
| VoiceName | Voice name (see the voice table below; provide either this or VoiceType) | No | - |
| secret_id | Tencent Cloud SecretId | Yes | - |
| secret_key | Tencent Cloud SecretKey | Yes | - |
| max_workers | Concurrent threads (3-5 for long text, 1 for short text) | No | 3 |
| chunk_size | Chunk size in characters (long text optimization) | No | 140 |
| timeout | Request timeout in seconds | No | 30 (short text) / 60 (long text) |
| enable_retry | Enable automatic retry | No | true |
| max_retries | Maximum retry attempts | No | 2 |
| preserve_paragraphs | Preserve paragraph boundaries when chunking | No | true |
| cos_secret_id | Tencent Cloud COS SecretId (falls back to the TTS credentials) | No | - |
| cos_secret_key | Tencent Cloud COS SecretKey (falls back to the TTS credentials) | No | - |
| upload_cos | Whether to upload to COS (true/false); when false, the audio is kept locally only | No | false |
| bucket_name | COS bucket name | No | ti-aoi |
| app_id | COS App ID | No | 1257195185 |
| region | COS region | No | ap-guangzhou |
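
The `enable_retry` and `max_retries` parameters suggest a retry loop around each TTS request. A minimal sketch of such a loop follows; `call_with_retry` and `backoff_seconds` are hypothetical names, and the linear backoff is an assumption rather than documented behavior.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_retries: int = 2,
    enable_retry: bool = True,
    backoff_seconds: float = 0.5,
) -> T:
    """Call func once, then retry up to max_retries times if it raises."""
    attempts = 1 + (max_retries if enable_retry else 0)
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            if attempt < attempts - 1:
                # Assumed linear backoff between attempts.
                time.sleep(backoff_seconds * (attempt + 1))
    raise last_error
```

With the defaults from the table (`max_retries` of 2), a request is attempted at most three times before the error is surfaced.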

Output

    {
      "Code": 0,
      "Msg": "success",
      "AudioUrl": "https://xxx.cos.ap-guangzhou.myqcloud.com/xxx.wav"
    }

Usage

Environment Requirements

- Python 3.8+
- tencentcloud-sdk-python
- cos-python-sdk-v5
- requests

Install Dependencies

    pip install -r requirements.txt

Basic Usage

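    from tts_podcast import main

    result = main({
        "Text": "你好，欢迎收听今天的播客。",
        "VoiceType": 502006,
        "secret_id": "YOUR_SECRET_ID",
        "secret_key": "YOUR_SECRET_KEY"
    })

    print(result)
    # {"Code": 0, "Msg": "success", "AudioUrl": "https://..."}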

Short Text Optimized Usage

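    from tts_podcast import main

    # Short text (<150 chars): a single thread gives a fast response
    result = main({
        "Text": "你好，这是一条短消息。",
        "VoiceType": 502006,
        "secret_id": "YOUR_SECRET_ID",
        "secret_key": "YOUR_SECRET_KEY",
        "max_workers": 1,      # a single thread is enough
        "timeout": 30,         # 30-second timeout
        "enable_retry": True   # enable automatic retry
    })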

Long Text Optimized Usage

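    from tts_podcast import main

    # Long text (>150 chars): use concurrency to speed up generation
    long_text = """第一章：人工智能的起源

    人工智能的概念可以追溯到古希腊神话...
    """

    result = main({
        "Text": long_text,
        "VoiceType": 502007,
        "secret_id": "YOUR_SECRET_ID",
        "secret_key": "YOUR_SECRET_KEY",
        "max_workers": 5,             # concurrent processing
        "chunk_size": 140,            # 140 characters per chunk
        "timeout": 60,                # 60-second timeout
        "preserve_paragraphs": True   # keep paragraph boundaries
    })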

Voice Reference

| VoiceType | Voice Name | Characteristics |
| --- | --- | --- |
| 0 | 普通女声 | Standard female |
| 1 | 普通男声 | Standard male |
| 5 | 情感女声 | Emotional female |
| 6 | 情感男声 | Emotional male |
| 1000 | 智障少女 | Lively cute |
| 1001 | 阳光少年 | Bright youthful |
| 1002 | 温柔淑女 | Gentle female |
| 1003 | 成熟青年 | Mature male |
| 1004 | 严厉管事 | Stern female |
| 1005 | 亲和女声 | Friendly female |
| 1006 | 甜美女声 | Sweet female |
| 1007 | 磁性男声 | Magnetic male |
| 1008 | 播音主播 | Broadcast anchor |
| 101001 | 客服女声 | Customer service |
| 101005 | 售前客服 | Pre-sales service |
| 101007 | 售后客服 | After-sales service |
| 101008 | 亲和客服 | Friendly service |
| 502006 | 小旭 | Tencent voice |
| 502007 | 小巴 | Tencent voice |
| 502008 | 思驰 | Tencent voice |
| 502009 | 思佳 | Tencent voice |
| 502010 | 思悦 | Tencent voice |
| 502011 | 小宁 | Tencent voice |
| 502012 | 小杨 | Tencent voice |
| 502013 | 云扬 | Tencent voice |
| 502014 | 云飞 | Tencent voice |
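
Since callers may pass either `VoiceType` or `VoiceName`, the skill presumably resolves both to a numeric voice ID. The sketch below shows one way that could look. `resolve_voice_type` is a hypothetical helper, the mapping covers only a few rows of the table above, and giving `VoiceType` precedence over `VoiceName` is an assumption.

```python
# Partial mapping copied from the voice table; extend with the remaining rows as needed.
VOICE_NAME_TO_TYPE = {
    "普通女声": 0,
    "普通男声": 1,
    "客服女声": 101001,
    "小旭": 502006,
    "小巴": 502007,
}
DEFAULT_VOICE_TYPE = 502006  # documented default for VoiceType


def resolve_voice_type(params: dict) -> int:
    """Return the numeric voice ID for a request parameter dict."""
    voice_type = params.get("VoiceType")
    if voice_type is not None:
        # Assumption: an explicit VoiceType wins over VoiceName.
        return int(voice_type)
    voice_name = params.get("VoiceName")
    if voice_name is not None:
        if voice_name not in VOICE_NAME_TO_TYPE:
            raise ValueError(f"Unknown VoiceName: {voice_name}")
        return VOICE_NAME_TO_TYPE[voice_name]
    return DEFAULT_VOICE_TYPE
```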

Technical Architecture

tts_podcast.py

- TTS: Uses Tencent Cloud TTS API signature v3
- Upload: Uses the Tencent Cloud COS SDK to upload audio files
- Auth: Supports credentials from parameters or environment variables
- Short & Long Text Compatible:
  - Short text (≤150 chars): Direct single request, fast response
  - Long text (>150 chars): Smart chunking + concurrent processing + auto-merge
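
The long-text path (concurrent processing followed by auto-merge) can be sketched with the standard library alone. This is an illustration under assumptions, not the code in tts_podcast.py: `synthesize` stands in for a hypothetical callable that turns one chunk into WAV bytes, and all chunks are assumed to share the same sample rate, sample width, and channel count.

```python
import io
import wave
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def synthesize_chunks(
    chunks: List[str],
    synthesize: Callable[[str], bytes],
    max_workers: int = 3,
) -> List[bytes]:
    """Synthesize chunks in parallel; pool.map keeps results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(synthesize, chunks))


def merge_wav(parts: List[bytes]) -> bytes:
    """Concatenate WAV payloads that share one audio format into a single WAV."""
    if not parts:
        raise ValueError("no audio parts to merge")
    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        for index, part in enumerate(parts):
            with wave.open(io.BytesIO(part), "rb") as reader:
                if index == 0:
                    # Copy channels, sample width and rate from the first part.
                    writer.setparams(reader.getparams())
                writer.writeframes(reader.readframes(reader.getnframes()))
    return out.getvalue()
```

Keeping the results in input order matters here: the chunks finish at different times, but the merged audio must follow the original text order.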

Text Chunking Strategy

1. Paragraph Priority: Preserve paragraph integrity where possible by splitting at paragraph boundaries
2. Sentence Boundaries: When a paragraph is too long, split at sentence-ending punctuation (。！？；)
3. Semantic Protection: Avoid truncating in the middle of a word to keep the text semantically coherent
4. Length Control: Each chunk does not exceed 150 characters (Tencent Cloud API limit)
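
The four rules above can be sketched as a small splitter. This is a minimal illustration, assuming paragraphs are separated by blank lines. `split_text` is a hypothetical name, and the hard cut for a single sentence longer than `chunk_size` is a fallback added here, not something the skill documents.

```python
import re
from typing import List

# Zero-width split point right after sentence-ending punctuation.
SENTENCE_END = re.compile(r"(?<=[。！？；])")


def split_text(text: str, chunk_size: int = 140) -> List[str]:
    """Split text into chunks of at most chunk_size characters."""
    chunks: List[str] = []
    for paragraph in re.split(r"\n\s*\n", text):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if len(paragraph) <= chunk_size:
            chunks.append(paragraph)  # paragraph fits: keep it whole
            continue
        current = ""
        for sentence in SENTENCE_END.split(paragraph):
            if not sentence:
                continue
            if len(current) + len(sentence) <= chunk_size:
                current += sentence
                continue
            if current:
                chunks.append(current)
            # Fallback: hard-cut a sentence that is itself too long.
            while len(sentence) > chunk_size:
                chunks.append(sentence[:chunk_size])
                sentence = sentence[chunk_size:]
            current = sentence
        if current:
            chunks.append(current)
    return chunks
```

The default of 140 mirrors the `chunk_size` parameter, which leaves some headroom below the 150-character request limit.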

License

MIT
