Feishu Speech Recognition (ASR)

Trigger conditions

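- The user sends a Feishu voice message
- The user asks to convert a voice message to text
- The user mentions speech recognition or speech-to-text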
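
The trigger conditions can be sketched as a small predicate. This is only an illustration: it assumes the incoming Feishu event exposes a `message_type` string (`"audio"` for voice messages) and a JSON-encoded `content` string with a `text` field for text messages, and the keyword list `ASR_KEYWORDS` is hypothetical.

```python
import json

# Hypothetical keyword list; adjust it to the phrases your users actually use.
ASR_KEYWORDS = ("语音识别", "转文字", "speech recognition", "speech to text", "transcribe")


def should_trigger(message_type: str, content: str) -> bool:
    """Return True when a message matches one of the trigger conditions."""
    # A voice message always triggers the skill.
    if message_type == "audio":
        return True
    # A text message triggers it only when it mentions a keyword.
    if message_type == "text":
        try:
            text = str(json.loads(content).get("text", ""))
        except (json.JSONDecodeError, AttributeError):
            # Content was not a JSON object; fall back to the raw string.
            text = content
        text = text.lower()
        return any(keyword in text for keyword in ASR_KEYWORDS)
    return False
```

Keyword matching is deliberately crude here; an agent would normally decide from the conversation context rather than from a fixed list.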

Workflow

1. Get the voice file

Read the voice file's file_key from the Feishu message and download the file in .ogg or .m4a format.
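
A minimal sketch of this step, using only the standard library. It assumes the audio message's `content` is a JSON string containing `file_key`, and that the file is fetched from Feishu's message-resource endpoint with a tenant access token. The endpoint shape, the `type=file` query parameter, and the helper names `extract_file_key`, `resource_url`, and `download_voice` are assumptions to check against the Feishu Open Platform documentation.

```python
import json
import urllib.parse
import urllib.request

FEISHU_BASE = "https://open.feishu.cn/open-apis"


def extract_file_key(content: str) -> str:
    """Pull file_key out of the JSON content string of an audio message."""
    return json.loads(content)["file_key"]


def resource_url(message_id: str, file_key: str) -> str:
    """Build the message-resource download URL (assumed endpoint shape)."""
    mid = urllib.parse.quote(message_id, safe="")
    key = urllib.parse.quote(file_key, safe="")
    return f"{FEISHU_BASE}/im/v1/messages/{mid}/resources/{key}?type=file"


def download_voice(message_id: str, file_key: str, token: str, dest: str) -> str:
    """Download the voice file to dest and return the path."""
    request = urllib.request.Request(
        resource_url(message_id, file_key),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response, open(dest, "wb") as out:
        out.write(response.read())
    return dest
```

Typical use would be `download_voice(message_id, extract_file_key(content), token, "voice.ogg")`, where `token` is obtained separately.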

2. Convert the audio format

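Use the Python soundfile library to read the audio, downmix it to mono, resample it to 16 kHz, and save it as WAV:

```python
import soundfile as sf
from scipy.signal import resample_poly  # extra dependency: pip install scipy

voice_file = "voice.ogg"  # path to the downloaded voice file
audio, sr = sf.read(voice_file)

# If the audio is stereo, downmix it to mono
if audio.ndim > 1:
    audio = audio.mean(axis=1)

# Resample to 16 kHz; writing with a new rate alone would only relabel the samples
if sr != 16000:
    audio = resample_poly(audio, 16000, sr)

sf.write("output.wav", audio, 16000)
```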
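
Before passing the file to Whisper, it can help to confirm that the conversion produced what the model expects. The sketch below uses the standard-library `wave` module, which reads PCM WAV files only; `check_wav` is a hypothetical helper name.

```python
import wave


def check_wav(path: str, expected_rate: int = 16000) -> bool:
    """Return True if the WAV file is mono and has the expected sample rate."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnchannels() == 1 and wav_file.getframerate() == expected_rate
```

For example, `check_wav("output.wav")` should return `True` after the conversion step.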

3. Transcribe with Whisper

```python
import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"  # mirror for mainland China; set before importing transformers

import soundfile as sf
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor, WhisperFeatureExtractor

# Read the audio (expected to be 16 kHz)
audio, sr = sf.read("output.wav", dtype="float32")
if audio.ndim > 1:
    audio = audio.mean(axis=1)

# Load the model
processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")
feature_extractor = WhisperFeatureExtractor.from_pretrained("openai/whisper-tiny")

# Transcribe
input_features = feature_extractor(audio, sampling_rate=16000, return_tensors="pt").input_features
with torch.no_grad():
    predicted_ids = model.generate(input_features)

result = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
```
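
Whisper's feature extractor pads or truncates its input to 30 seconds, so a longer voice message would need to be split and transcribed piece by piece. The helper below is a sketch of one simple way to compute the split points; `chunk_bounds` is a hypothetical name, and cutting at fixed boundaries can split a word in two.

```python
def chunk_bounds(num_samples: int, sample_rate: int = 16000, chunk_seconds: int = 30):
    """Return (start, end) sample indices that cover the audio in fixed-size chunks."""
    if num_samples < 0 or sample_rate <= 0 or chunk_seconds <= 0:
        raise ValueError("num_samples must be >= 0; sample_rate and chunk_seconds must be > 0")
    size = sample_rate * chunk_seconds
    # The last chunk is shorter when the audio length is not a multiple of size.
    return [(start, min(start + size, num_samples)) for start in range(0, num_samples, size)]
```

Each slice `audio[start:end]` can then go through the feature extractor and `model.generate` in turn, with the decoded strings joined at the end.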

Install dependencies

```bash
pip install torch transformers soundfile
```

Model selection

- whisper-tiny: about 75 MB, suited to CPU, fastest
- whisper-base: about 142 MB, better accuracy
- whisper-small: about 466 MB, high accuracy
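
One way to act on this list is to choose the largest model that fits a download or memory budget. The sketch below hard-codes the sizes quoted in the list, which should be treated as approximate; `pick_model` and the `openai/` model identifiers for base and small are assumptions that follow the naming used for whisper-tiny.

```python
# Sizes in MB as quoted in the list; treat them as approximate.
WHISPER_MODELS = [
    ("openai/whisper-tiny", 75),
    ("openai/whisper-base", 142),
    ("openai/whisper-small", 466),
]


def pick_model(max_size_mb: int) -> str:
    """Return the largest listed model whose size fits within max_size_mb."""
    fitting = [name for name, size in WHISPER_MODELS if size <= max_size_mb]
    if not fitting:
        raise ValueError(f"no listed model fits within {max_size_mb} MB")
    # The list is ordered from smallest to largest, so the last match is the biggest.
    return fitting[-1]
```

The returned name can be passed directly to `from_pretrained`.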

Notes

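- The first run downloads the model (about 75 MB to 3 GB, depending on the model)
- Using a mirror is recommended in mainland China: HF_ENDPOINT=https://hf-mirror.com
- The model detects the language automatically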
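
If the mirror is set from Python, it should happen before `transformers` is imported, and it is usually safer not to override an endpoint the user has already configured. A minimal sketch, where `use_hf_mirror` is a hypothetical helper name:

```python
import os


def use_hf_mirror(endpoint: str = "https://hf-mirror.com") -> str:
    """Set HF_ENDPOINT unless one is already configured; return the value in effect."""
    return os.environ.setdefault("HF_ENDPOINT", endpoint)
```

Call `use_hf_mirror()` at the top of the script, before any Hugging Face import.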
