
douyin-transcriber

Audio/video transcription module using Docker Whisper ASR. Extracts speech from audio or video files and converts it to text. Use when: (1) transcribing audio files (mp3, wav, m4a, etc.), (2) transcribing video files (mp4, mkv, etc.), (3) you need speech-to-text for any media file, (4) working with douyin/tiktok video transcription workflows. Supports automatic audio extraction, format conversion, and multiple Whisper models.

Author: admin | Source: ClawHub
Version: V 1.0.5 | Security check: Passed | Downloads: 68 | Favorites: 0


# Douyin Transcriber

Transcribe audio/video files to text using local Docker Whisper ASR.

## Quick Start

```bash
curl -X POST "http://localhost:PORT/asr" -F "audio_file=@/path/to/video.mp4"
```

The container has built-in ffmpeg for automatic audio extraction.

## Prerequisites

| Tool | Purpose | Install |
|------|---------|---------|
| Docker | Whisper ASR | Docker Desktop |
| ffmpeg | Audio extraction | `winget install Gyan.FFmpeg` |

**Deploy Whisper ASR:**

```bash
docker run -d -p PORT:PORT -e ASR_MODEL=small -e ASR_ENGINE=faster_whisper --name whisper-asr onerahmet/openai-whisper-asr-webservice:latest
```

## Workflow

### Step 1: Extract Audio from Video

```bash
ffmpeg -i video.mp4 -ar 16000 -ac 1 -c:a pcm_s16le audio.wav -y
```

Parameters:

- `-ar 16000`: 16kHz sample rate
- `-ac 1`: Mono channel
- `-c:a pcm_s16le`: 16-bit PCM

### Step 2: Transcribe

```bash
curl -X POST "http://localhost:PORT/asr" -F "audio_file=@audio.wav"
```

Optional: specify language

```bash
curl -X POST "http://localhost:PORT/asr" -F "audio_file=@audio.wav" -F "language=zh"
```

### Step 3: Parse Result

Response format:

```json
{
  "text": "Transcribed content...",
  "segments": [
    {"start": 0.0, "end": 2.5, "text": "First sentence"},
    {"start": 2.5, "end": 5.0, "text": "Second sentence"}
  ],
  "language": "zh"
}
```

## Model Selection

| Model | Size | 5-min video | Accuracy |
|-------|------|-------------|----------|
| tiny | 75MB | ~30s | Fair |
| base | 142MB | ~1min | Good |
| small | 466MB | ~3min | Better (recommended) |
| medium | 1.5GB | ~8min | Best |

Change model via environment variable: `-e ASR_MODEL=medium`

## Supported Formats

**Video:** mp4, mkv, avi, mov, flv, wmv, webm, m4v

**Audio:** wav, m4a, mp3, aac, ogg, flac, wma, opus

## Troubleshooting

| Issue | Solution |
|-------|----------|
| Docker not available | Install Docker Desktop |
| Container start fails | Check port availability |
| Transcription timeout | Use smaller model or split audio |
| ffmpeg not found | `winget install Gyan.FFmpeg` |

## Related Modules

- **douyin-fetcher** - Video download
- **douyin-analyzer** - Content analysis
- **douyin-orchestrator** - Workflow coordination
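
## Example: Turning the Response into Subtitles

Step 3 shows the response shape but not what to do with it. The following is a minimal sketch, not part of the module itself, that converts the `segments` list into SRT subtitle text. It assumes the service returned JSON with the `start`, `end`, and `text` fields shown under Step 3; whether a given deployment returns that shape by default is worth verifying. The helper names `format_timestamp` and `segments_to_srt` are hypothetical.

```python
import json


def format_timestamp(seconds: float) -> str:
    """Convert a time in seconds to an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(response: dict) -> str:
    """Build SRT text from a response dict that has a 'segments' list.

    Assumes each segment carries 'start', 'end' (seconds) and 'text',
    as in the Step 3 sample. A missing 'segments' key yields an empty string.
    """
    blocks = []
    for index, seg in enumerate(response.get("segments", []), start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Sample payload copied from the Step 3 response format.
    sample = json.loads(
        """
        {
          "text": "Transcribed content...",
          "segments": [
            {"start": 0.0, "end": 2.5, "text": "First sentence"},
            {"start": 2.5, "end": 5.0, "text": "Second sentence"}
          ],
          "language": "zh"
        }
        """
    )
    print(segments_to_srt(sample))
```

In practice the dict would come from parsing the body of the `curl` call in Step 2, for example by saving it to a file and loading it with `json.load`. Keeping the conversion in a pure function makes it easy to test without a running container.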

Tags

skill ai

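Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the douyin-transcriber-1775899935 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the douyin-transcriber-1775899935 skill

Install via command line

skillhub install douyin-transcriber-1775899935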

Download Zip package

Download douyin-transcriber v1.0.5

File size: 1.89 KB | Published: 2026-4-12 09:46

v1.0.5 (latest) 2026-4-12 09:46
- Added clear usage instructions and workflow for audio/video transcription using Docker Whisper ASR.
- Detailed prerequisite tools and installation steps.
- Included command examples for extracting audio, transcribing, specifying language, and parsing results.
- Provided table for model selection, supported formats, and troubleshooting common issues.
- Listed related modules for extended Douyin/TikTok workflows.
