Manim Animation: Animation + Voiceover + Subtitle Generator
Author: ericksun(孙自翔)
Overview
This skill uses Manim Community to generate mathematical and educational animations, and integrates the manim-voiceover plugin for TTS narration and synchronized subtitles. Rendering runs locally and no paid API is required; the default gTTS engine still needs network access to Google's TTS service, while pyttsx3 works fully offline.
Core Capabilities:
- 🎬 Animation Generation: Create animations of math formulas, geometric shapes, charts, and more with Manim
- 🎙️ Voice Narration: Integrate TTS via manim-voiceover plugin with automatic animation-voice sync
- 📝 Subtitle System: In-scene subtitles (Manim Text) + SRT external subtitles (ffmpeg burn-in)
- 🔄 One-Click Pipeline: Describe requirements → Generate code → Render video → Burn subtitles
TTS Engines (gTTS preferred):
- gTTS (Recommended): Google free TTS, supports Chinese, no API Key needed
- pyttsx3 (Fallback): Offline TTS, no network required
- Azure/OpenAI/ElevenLabs (High quality): Requires paid API Key
Prerequisites
🔍 One-Click Environment Check
Before first use, run the environment check script to verify all dependencies are ready:
CODEBLOCK0
This script checks:
- ✅ Manim Community installation (manim command)
- ✅ manim-voiceover + gTTS plugin
- ✅ FFmpeg + libx264 encoder (hardcoded Manim dependency, required)
- ✅ FFmpeg + libass (for SRT subtitle burn-in)
- ✅ Python dependencies
- ✅ Chinese font availability
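The checklist above could be approximated with a few lines of Python. This is only a sketch of the idea, not the contents of the skill's own check script: the helper names `ffmpeg_supports`, `run_listing`, and `check_environment` are hypothetical, and the check simply looks for `libx264` in `ffmpeg -encoders` and for the libass-backed `subtitles` filter in `ffmpeg -filters`.

```python
import shutil
import subprocess


def ffmpeg_supports(listing: str, name: str) -> bool:
    """Return True if `name` appears as a whole token in an ffmpeg listing."""
    return any(name in line.split() for line in listing.splitlines())


def run_listing(flag: str) -> str:
    """Run `ffmpeg -hide_banner <flag>` and return its output ('' if ffmpeg is missing)."""
    if shutil.which("ffmpeg") is None:
        return ""
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", flag],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout + result.stderr


def check_environment() -> dict:
    """Collect a pass/fail flag for the main items of the checklist."""
    return {
        "manim": shutil.which("manim") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
        "libx264 encoder": ffmpeg_supports(run_listing("-encoders"), "libx264"),
        "subtitles filter (libass)": ffmpeg_supports(run_listing("-filters"), "subtitles"),
    }


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print("OK  " if ok else "FAIL", item)
```

Python package and font checks are left out here to keep the sketch short.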
Required System Tools
- Manim Community: pip install manim
- FFmpeg (with libx264 + libass): Manim hardcodes the libx264 encoder for video rendering; subtitle burn-in requires libass
  - macOS (Homebrew): brew install ffmpeg (includes x264 and libass by default)
  - macOS (Conda): conda install x264 -c conda-forge (⚠️ conda's ffmpeg does not include libx264 by default)
  - Linux: sudo apt install ffmpeg libx264-dev libass-dev
Required Python Packages
CODEBLOCK1
Optional (Enhanced Features)
- pyttsx3: Offline TTS (pip install "manim-voiceover[pyttsx3]")
⚡ Quick Install
CODEBLOCK2
Workflow
Quick Start — One-Click Run
After the user describes their requirements, use the pipeline script for one-click execution:
CODEBLOCK3
Common Options:
| Option | Default | Description |
|---|---|---|
| --scene_file | Required | Manim scene Python file |
| --scene_name | Required | Scene class name |
| --quality | high | Render quality: low / medium / high / production |
| --burn_subtitles | False | Whether to burn SRT subtitles with ffmpeg |
| --speed | 1.35 | Playback speed multiplier (e.g., 1.35 means 1.35x speed; set to 1.0 to disable) |
| --preview | False | Auto-open preview after rendering |
| --output_dir | ./output | Output directory |
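For orientation, an option table like this maps naturally onto `argparse`. The sketch below is an illustration only and makes no claim about how the pipeline script is actually written; `build_parser` is a hypothetical name, and `--scene_file` and `--scene_name` are assumed spellings of the two required options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the option table (illustrative only)."""
    parser = argparse.ArgumentParser(description="Render a Manim scene")
    parser.add_argument("--scene_file", required=True, help="Manim scene Python file")
    parser.add_argument("--scene_name", required=True, help="Scene class name")
    parser.add_argument(
        "--quality",
        default="high",
        choices=["low", "medium", "high", "production"],
        help="Render quality",
    )
    parser.add_argument("--burn_subtitles", action="store_true", help="Burn SRT subtitles")
    parser.add_argument("--speed", type=float, default=1.35, help="Playback speed multiplier")
    parser.add_argument("--preview", action="store_true", help="Open preview after rendering")
    parser.add_argument("--output_dir", default="./output", help="Output directory")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Using `store_true` for the two boolean options gives them the `False` default listed in the table.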
Complete Workflow (5 Steps)
Step 1: Understand User Requirements and Generate Manim Scene Code
Based on the user's description, generate a Manim scene Python file. Scene code should follow these patterns:
No-voiceover mode (animation only):
CODEBLOCK4
Voiceover mode (animation + voice + subtitles):
CODEBLOCK5
Key Pattern — voiceover context manager:
CODEBLOCK6
with self.voiceover(text=...) as tracker does three things:
1. Calls the TTS engine to generate speech audio
2. Automatically calculates speech duration
3. Provides tracker.duration to sync animations with voice
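When one voiceover block drives several animations, the speech duration has to be divided among them. A minimal sketch of that arithmetic follows; `split_run_times` is a hypothetical helper, not part of manim-voiceover.

```python
def split_run_times(total_duration: float, weights: list) -> list:
    """Divide a voiceover duration among animations in proportion to their weights."""
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty list of positive numbers")
    scale = total_duration / sum(weights)
    return [w * scale for w in weights]


# Intended use inside a voiceover block (requires Manim, shown as comments only):
#     with self.voiceover(text="...") as tracker:
#         first, second = split_run_times(tracker.duration, [2, 1])
#         self.play(Write(title), run_time=first)
#         self.play(FadeIn(diagram), run_time=second)
```

The run times always add up to the speech duration, so the block ends together with the audio.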
Subtitle Best Practices:
- In-scene subtitles: Use the _make_subtitle() helper to display white bold text with a dark background at the bottom of the screen
- Overflow prevention: _make_subtitle() auto-detects subtitle width and scales proportionally (scale_to_fit_width) when exceeding frame bounds; uses font_size=22 for long text
- Subtitle sync: FadeIn(sub) in the first self.play() within the voiceover block ensures subtitles appear in sync with voice; do not delay
- FadeIn subtitle at the start of each voiceover block, FadeOut after it ends
- Subtitle text should match the voiceover text
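The overflow rule comes down to one ratio. The sketch below reproduces only that arithmetic under stated assumptions: `subtitle_scale` is a hypothetical name, the 0.5-unit margin per side is an assumption, and the default frame width is Manim's 16:9 default of 8 × 16/9 units.

```python
DEFAULT_FRAME_WIDTH = 8 * 16 / 9  # Manim's default frame width for a 16:9 frame


def subtitle_scale(text_width: float,
                   frame_width: float = DEFAULT_FRAME_WIDTH,
                   margin: float = 0.5) -> float:
    """Return the factor by which a subtitle must shrink to fit inside the frame."""
    max_width = frame_width - 2 * margin  # leave `margin` units free on each side
    if text_width <= max_width:
        return 1.0  # already fits, no scaling needed
    return max_width / text_width
```

A subtitle 20 units wide in a 14-unit frame would be scaled by 13 / 20 = 0.65; applying `scale_to_fit_width(max_width)` to the Text object has the same effect.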
⚠️ Avoid Double Subtitles: If the scene code already uses _make_subtitle() to render in-scene subtitles, do not also use --burn_subtitles to burn SRT subtitles, otherwise two overlapping subtitle layers will appear. Choose only one approach:
- Option A (Recommended): Render subtitles in code with _make_subtitle(), do not burn SRT
- Option B: Do not render subtitles in code, burn SRT via --burn_subtitles
Step 2: Configure Rendering Parameters
Create manim.cfg in the same directory as the scene file:
CODEBLOCK7
Quality Reference Table:
| Quality | Flag | Resolution | FPS | manim.cfg Value |
|---|---|---|---|---|
| Low | -ql | 480p | 15 | low_quality |
| Medium | -qm | 720p | 30 | medium_quality |
| High | -qh | 1080p | 60 | high_quality |
| Production | -qp | 1440p | 60 | production_quality |
Step 3: Render Video
CODEBLOCK8
Output path pattern: INLINECODE35
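As a rough illustration of how a quality name turns into a render command and an output location, the sketch below uses Manim Community's built-in presets (where `-qp` renders at 1440p) and its usual `media/videos/<scene file stem>/<height>p<fps>/` layout. The helper names are hypothetical, and the default `media` directory is assumed.

```python
from pathlib import PurePosixPath

# Quality presets as defined by Manim Community's CLI flags.
QUALITY_PRESETS = {
    "low": {"flag": "-ql", "height": 480, "fps": 15, "cfg": "low_quality"},
    "medium": {"flag": "-qm", "height": 720, "fps": 30, "cfg": "medium_quality"},
    "high": {"flag": "-qh", "height": 1080, "fps": 60, "cfg": "high_quality"},
    "production": {"flag": "-qp", "height": 1440, "fps": 60, "cfg": "production_quality"},
}


def build_render_command(scene_file: str, scene_name: str, quality: str = "high") -> list:
    """Return the manim CLI invocation as an argument list."""
    preset = QUALITY_PRESETS[quality]
    return ["manim", "render", preset["flag"], scene_file, scene_name]


def expected_output_path(scene_file: str, scene_name: str,
                         quality: str = "high", media_dir: str = "media") -> str:
    """Predict where Manim writes the rendered video for a given quality."""
    preset = QUALITY_PRESETS[quality]
    quality_dir = f'{preset["height"]}p{preset["fps"]}'
    stem = PurePosixPath(scene_file).stem
    return str(PurePosixPath(media_dir) / "videos" / stem / quality_dir / f"{scene_name}.mp4")
```

For example, `expected_output_path("scenes/demo.py", "MyScene")` yields `media/videos/demo/1080p60/MyScene.mp4`.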
Step 4: Burn SRT Subtitles (Optional)
manim-voiceover automatically generates .srt subtitle files in the same directory as the video. Burn with ffmpeg:
CODEBLOCK9
⚠️ Double Subtitle Pitfall: If the scene Python code already renders in-scene subtitles with _make_subtitle(), do not also burn SRT subtitles, otherwise two overlapping subtitle layers will appear.
Note: ffmpeg requires libass support. On macOS, brew install ffmpeg typically includes it. Conda environments may require conda install x264 -c conda-forge.
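A burn-in call can be assembled as an argument list and passed to `subprocess.run`. This is a sketch under assumptions, not the pipeline's actual command: `build_burn_command` is a hypothetical helper, and the SRT path is assumed to contain no characters that need escaping inside an ffmpeg filter string (such as `:` or `'`).

```python
def build_burn_command(video: str, srt: str, output: str) -> list:
    """Return an ffmpeg invocation that renders `srt` onto `video` via the subtitles filter."""
    return [
        "ffmpeg",
        "-y",                        # overwrite the output file if it exists
        "-i", video,
        "-vf", f"subtitles={srt}",   # requires an ffmpeg build with libass
        "-c:a", "copy",              # keep the voiceover audio untouched
        output,
    ]


# Example (not executed here):
#     import subprocess
#     subprocess.run(build_burn_command("MyScene.mp4", "MyScene.srt", "MyScene_sub.mp4"), check=True)
```

Only the video stream is re-encoded; the audio is copied as-is.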
Step 5: Speed Up Video (Optional)
Use ffmpeg to speed up the video, default 1.35x:
CODEBLOCK10
Note: Speed-up should be the final output step. If the scene code has in-scene subtitles (_make_subtitle), the speed-up input should use the original video (not the SRT-burned version) to avoid double subtitles. The run_pipeline.py --speed parameter handles this logic automatically.
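The speed-up step pairs a video filter (`setpts`) with an audio filter (`atempo`), and the input choice follows the rule in the note above. The sketch below is one possible arrangement rather than the pipeline's own logic; both function names are hypothetical, and the 0.5 to 2.0 bound is a conservative range that a single `atempo` filter accepts on older ffmpeg builds.

```python
from typing import Optional


def pick_speed_input(original: str, burned: Optional[str], has_in_scene_subtitles: bool) -> str:
    """Choose the video to speed up, avoiding a second subtitle layer."""
    if has_in_scene_subtitles or burned is None:
        return original
    return burned


def build_speed_command(video: str, output: str, speed: float = 1.35) -> list:
    """Return an ffmpeg invocation that speeds up video and audio by `speed`."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0 for a single atempo filter")
    filters = f"[0:v]setpts=PTS/{speed}[v];[0:a]atempo={speed}[a]"
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-filter_complex", filters,
        "-map", "[v]",
        "-map", "[a]",
        output,
    ]
```

Dividing the presentation timestamps by the speed shortens the video, while `atempo` changes the audio tempo without shifting its pitch.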
Manim Common Animation Reference
Create/Display Animations
- Write(text): Write text
- INLINECODE44: Draw shapes
- INLINECODE45 / FadeOut(mobject): Fade in/out
- INLINECODE47: Draw border then fill
Transform Animations
- Transform(source, target): Morph
- INLINECODE49: Replacement morph
- INLINECODE50: Shape-matching morph
Move/Scale
- mobject.animate.to_edge(UP): Move to edge
- INLINECODE52: Translate
- INLINECODE53: Scale
- INLINECODE54: Rotate
Common Objects
- Text("Text", font_size=48, color=BLUE): Text
- INLINECODE56: LaTeX formula
- INLINECODE57: Circle
- INLINECODE58: Square
- INLINECODE59: Arrow
- INLINECODE60: Coordinate plane
- INLINECODE61: Axes
Grouping and Layout
- VGroup(obj1, obj2): Group of vector mobjects (a container, not a vertical layout)
- INLINECODE63: Horizontal arrangement
- INLINECODE64: Background rectangle
Known Issues and Solutions
⚠️ Missing libx264 Codec (Most Common Issue)
Symptom: INLINECODE65
Root Cause: Manim hardcodes the libx264 encoder in scene_file_writer.py (cannot be overridden via config/cfg), but conda environment's ffmpeg is compiled with --disable-gpl and does not include the GPL-licensed libx264.
Solution:
CODEBLOCK11
Note: brew install ffmpeg installs ffmpeg with built-in x264, but conda environments prioritize their own ffmpeg and will not use the Homebrew version.
setuptools Compatibility
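manim-voiceover depends on pkg_resources, which is part of setuptools. It may fail to import on Python 3.12+, where new virtual environments no longer install setuptools by default: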
CODEBLOCK12
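Whether an environment is affected can be checked without importing manim-voiceover at all. The sketch below only tests for the `pkg_resources` module; both helper names are hypothetical.

```python
import importlib.util


def has_pkg_resources() -> bool:
    """Return True if the pkg_resources module can be found in this environment."""
    return importlib.util.find_spec("pkg_resources") is not None


def setuptools_hint() -> str:
    """Describe the pkg_resources status in one line."""
    if has_pkg_resources():
        return "pkg_resources is importable"
    return "pkg_resources is missing; installing setuptools in this environment usually restores it"


if __name__ == "__main__":
    print(setuptools_hint())
```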
ffmpeg Missing libass
SRT subtitle burn-in requires libass. macOS:
CODEBLOCK13
Linux:
CODEBLOCK14
gTTS Network Issues
gTTS requires access to the Google TTS service. If the network is unavailable, switch to the offline pyttsx3 engine:
CODEBLOCK15
Chinese Fonts
Manim uses system fonts to render Text objects. Ensure Chinese fonts are available:
- macOS: PingFang SC (built-in)
- Linux: INLINECODE73
- Specify font: INLINECODE74
Related Resources
- GitHub Repository: https://github.com/hzsunzixiang/manim-animation-skill
- Manim Technical Guide: references/manim_guide.md (detailed Manim + voiceover + subtitle technical documentation)
- Environment Check Script: scripts/check_environment.py (one-click dependency check)
- Render Pipeline Script: scripts/run_pipeline.py (one-click render + subtitle burn-in)
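Manim Animation: Animation + Voiceover + Subtitle Generator
Author: ericksun(孙自翔)
Overview
This skill uses Manim Community to generate mathematical/educational animations, with manim-voiceover plugin integration for TTS voice narration and synchronized subtitles. All processing runs locally, with no paid API required.
Core Capabilities:
- 🎬 Animation Generation: Create animations of math formulas, geometric shapes, charts, and more with Manim
- 🎙️ Voice Narration: Integrate TTS via the manim-voiceover plugin with automatic animation-voice sync
- 📝 Subtitle System: In-scene subtitles (Manim Text) + external SRT subtitles (ffmpeg burn-in)
- 🔄 One-Click Pipeline: Describe requirements → Generate code → Render video → Burn subtitles
TTS Engines (gTTS preferred):
- gTTS (Recommended): Google free TTS, supports Chinese, no API key needed
- pyttsx3 (Fallback): Offline TTS, no network required
- Azure/OpenAI/ElevenLabs (High quality): Requires a paid API key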
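Prerequisites
🔍 One-Click Environment Check
Before first use, run the environment check script to verify that all dependencies are ready:

    python3 {SKILL_DIR}/scripts/check_environment.py

This script checks:
- ✅ Manim Community installation (manim command)
- ✅ manim-voiceover + gTTS plugin
- ✅ FFmpeg + libx264 encoder (hardcoded Manim dependency, required)
- ✅ FFmpeg + libass (for SRT subtitle burn-in)
- ✅ Python dependencies
- ✅ Chinese font availability
Required System Tools
- Manim Community: pip install manim
- FFmpeg (with libx264 + libass): Manim hardcodes the libx264 encoder for video rendering; subtitle burn-in requires libass
  - macOS (Homebrew): brew install ffmpeg (includes x264 and libass by default)
  - macOS (Conda): conda install x264 -c conda-forge (⚠️ conda's ffmpeg does not include libx264 by default)
  - Linux: sudo apt install ffmpeg libx264-dev libass-dev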
Required Python Packages

    # Core
    pip install manim
    # Voiceover + TTS
    pip install "manim-voiceover[gtts]"

Optional (Enhanced Features)
- pyttsx3: Offline TTS (pip install "manim-voiceover[pyttsx3]")
⚡ Quick Install

    pip install manim "manim-voiceover[gtts]"
    # macOS (Homebrew): recommended, includes libx264 + libass
    brew install ffmpeg
    # macOS (Conda): x264 must be installed separately, otherwise Manim rendering fails with UnknownCodecError: libx264
    conda install x264 -c conda-forge
    # Verify that ffmpeg supports libx264 and libass
    ffmpeg -codecs 2>&1 | grep libx264     # should list libx264 among the encoders
    ffmpeg -filters 2>&1 | grep subtitles  # should list the subtitles filter

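Workflow
Quick Start: One-Click Run
After the user describes their requirements, use the pipeline script for one-click execution:

    python3 {SKILL_DIR}/scripts/run_pipeline.py \
        --scene_file <scene_file.py> \
        --scene_name <SceneName> \
        --quality high \
        --burn_subtitles

Common Options:
| Option | Default | Description |
|---|---|---|
| --scene_file | Required | Manim scene Python file |
| --scene_name | Required | Scene class name |
| --quality | high | Render quality: low / medium / high / production |
| --burn_subtitles | False | Whether to burn SRT subtitles with ffmpeg |
| --speed | 1.35 | Playback speed multiplier (e.g., 1.35 means 1.35x speed; set to 1.0 to disable) |
| --preview | False | Auto-open preview after rendering |
| --output_dir | ./output | Output directory |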
Complete Workflow (5 Steps)
Step 1: Understand User Requirements and Generate Manim Scene Code
Based on the user's description, generate a Manim scene Python file. Scene code should follow these patterns:
No-voiceover mode (animation only):

    from manim import *

    class MyScene(Scene):
        def construct(self):
            title = Text("Title", font_size=48, color=BLUE)
            self.play(Write(title))
            self.wait(1)

Voiceover mode (animation + voice + subtitles):

    from manim import *
    from manim_voiceover import VoiceoverScene
    from manim_voiceover.services.gtts import GTTSService

    class MyScene(VoiceoverScene):
        def _make_subtitle(self, text_str):
            """Create a subtitle with a dark background at the bottom of the screen."""
            sub = Text(text_str, font_size=22, color=WHITE, weight=BOLD)
            # Keep the subtitle from overflowing the left and right edges
            max_width = config.frame_width - 1.0  # 0.5 margin on each side
            if sub.width > max_width:
                sub.scale_to_fit_width(max_width)
            sub.to_edge(DOWN, buff=0.4)
            bg = BackgroundRectangle(sub, color=BLACK, fill_opacity=0.6, buff=0.15)
            return VGroup(bg, sub)

        def construct(self):
            self.set_speech_service(GTTSService(lang="en"))
            sub_text = "Welcome to the demo"
            with self.voiceover(text=sub_text) as tracker:
                sub = self._make_subtitle(sub_text)
                title = Text("Demo", font_size=48)
                # Fade the subtitle in together with the first animation
                self.play(Write(title), FadeIn(sub), run_time=tracker.duration)
            # Remove the subtitle once the voiceover block has ended
            self.play(FadeOut(sub))
            self.wait(0.3)

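Key Pattern: voiceover context manager

    with self.voiceover(text="Narration text") as tracker:
        # tracker.duration = TTS speech duration in seconds
        # Animations in this block are synchronized with the voice
        self.play(SomeAnimation(), run_time=tracker.duration)

with self.voiceover(text=...) as tracker does three things:
1. Calls the TTS engine to generate speech audio
2. Automatically calculates speech duration
3. Provides tracker.duration to sync animations with voice
Subtitle Best Practices:
- In-scene subtitles: Use the _make_subtitle() helper to display white bold text with a dark background at the bottom of the screen
- Overflow prevention: _make_subtitle() auto-detects subtitle width and scales proportionally (scale_to_fit_width) when exceeding frame bounds; uses font_size=22 for long text
- Subtitle sync: FadeIn(sub) in the first self.play() within the voiceover block ensures subtitles appear in sync with voice; do not delay
- FadeIn subtitle at the start of each voiceover block, FadeOut after it ends
- Subtitle text should match the voiceover text
⚠️ Avoid Double Subtitles: If the scene code already uses _make_subtitle() to render in-scene subtitles, do not also use --burn_subtitles to burn SRT subtitles, otherwise two overlapping subtitle layers will appear. Choose only one approach:
- Option A (Recommended): Render subtitles in code with _make_subtitle(), do not burn SRT
- Option B: Do not render subtitles in code, burn SRT via --burn_subtitles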
Step 2: Configure Rendering Parameters
Create manim.cfg in the same directory as the scene file:

    [CLI]
    quality = high_quality
    preview = False
    [ffmpeg]
    video_codec = h264

Quality Reference Table:
| Quality | Flag | Resolution | FPS | manim.cfg Value |
|---|---|---|---|---|
| Low | -ql | 480p | 15 | low_quality |
| Medium | -qm | 720p | 30 | medium_quality |
| High | -qh | 1080p | 60 | high_quality |
| Production | -qp | 1440p | 60 | production_quality |
Step 3: Render Video

    manim render

Output path pattern: media/videos///.mp4
Step 4