Hum2Song

# Hum2Song Turn a hummed melody into a complete song with local audio processing, without uploading sensitive recordings to third-party services. --- ## Overview This skill converts user humming or singing into complete songs using local AI models. The entire pipeline runs on your machine - no audio data is sent to external services. **Pipeline:** 1. 🎤 Audio Input → 2. 🎵 MIDI Extraction → 3. 🎼 Music Generation → 4. 🎧 Complete Song --- ## Triggers Use this skill when the user: - Hums or sings a melody and wants to turn it into a full song - Has an audio recording of humming/singing - Wants to create music from their own melodic ideas - Asks to "turn my humming into a song" --- ## Requirements ### System Dependencies ```bash # macOS brew install ffmpeg fluidsynth # Ubuntu/Debian sudo apt-get install ffmpeg fluidsynth # Python packages pip install basic-pitch pretty_midi librosa soundfile numpy ``` ### Optional: ACE-Step for Music Generation (User Choice) ACE-Step is an optional local AI. Users decide whether to install it. ```bash # User manually installs if they want AI generation # Otherwise, default SoundFont synthesis works without AI git clone https://github.com/ace-step/ace-step.git pip install -r ace-step/requirements.txt ``` **Note:** First use downloads ~4GB model weights to local cache. No automatic downloads. --- ## Core Workflow ### Step 1: Extract MIDI from Audio Use Basic Pitch (Spotify's open source tool) to convert humming to MIDI: ```python from basic_pitch.inference import predict from basic_pitch import ICASSP_2022_MODEL_PATH # Convert audio to MIDI model_output, midi_data, note_events = predict("humming.wav") midi_data.write("extracted.mid") ``` ### Step 2: Enhance MIDI Structure Clean and enhance the extracted MIDI: ```python import pretty_midi # Load extracted MIDI pm = pretty_midi.PrettyMIDI("extracted.mid") # Quantize notes to fix timing for instrument in pm.instruments: for note in instrument.notes: note.start = round(note.start * 4) / 4 # Quantize to 16th notes note.end = round(note.end * 4) / 4 # Save enhanced MIDI pm.write("enhanced.mid") ``` ### Step 3: Generate Full Song **Option A: ACE-Step (Local AI, Optional)** Only if user has manually installed ACE-Step: ```python from ace_step import MusicGenerator # Load model (runs locally, downloads weights on first use) generator = MusicGenerator.from_pretrained("ace-step/base") # Generate music from MIDI audio = generator.generate_from_midi( midi_path="enhanced.mid", style="pop", mood="upbeat", duration=120 ) # Save result audio.save("complete_song.mp3") ``` **Option B: MIDI + SoundFont (No AI)** ```python import pretty_midi # Load MIDI pm = pretty_midi.PrettyMIDI("enhanced.mid") # Synthesize with high-quality SoundFont audio_data = pm.fluidsynth(fs=44100, sf2_path="path/to/good_soundfont.sf2") # Save as WAV import soundfile as sf sf.write("complete_song.wav", audio_data, 44100) ``` --- ## Usage ### Quick Start ```bash # Run the complete pipeline python ~/.openclaw/skills/hum2song/scripts/hum2song.py \ --input my_humming.wav \ --style pop \ --mood upbeat \ --output my_song.mp3 ``` ### Parameters | Parameter | Description | Options | |-----------|-------------|---------| | `--input` | Input audio file | Any audio format | | `--style` | Music style | pop, rock, jazz, classical, electronic | | `--mood` | Song mood | upbeat, calm, energetic, melancholic | | `--duration` | Target duration (seconds) | 30-300 | | `--output` | Output file path | .mp3, .wav, .mid | --- ## Response Format ### When User Provides Audio ``` 🎵 I'll convert your humming into a complete song! **Processing Pipeline:** 1️⃣ Extracting melody from your audio... 2️⃣ Converting to MIDI notation... 3️⃣ Generating full arrangement... 4️⃣ Rendering final song... ⏱️ Estimated time: 2-5 minutes **Settings:** • Style: {style} • Mood: {mood} • Duration: {duration}s Processing now... 🎧 ``` ### When Complete ``` ✅ Your song is ready! 📁 Output: {output_path} 🎼 Format: {format} ⏱️ Duration: {actual_duration}s **What was done:** • Extracted {note_count} notes from your humming • Generated {instrument_count} instrument tracks • Applied {style} style with {mood} mood 💡 You can: • Adjust style/mood and regenerate • Export as MIDI for editing in DAW • Try different AI models for variation ``` --- ## Example Interactions ### Example 1: User Hums Audio ``` User: [uploads humming audio] AI: 🎵 I'll convert your humming into a complete song! **Processing Pipeline:** 1️⃣ Extracting melody from your audio... 2️⃣ Converting to MIDI notation... 3️⃣ Generating full arrangement... 4️⃣ Rendering final song... ⏱️ Estimated time: 2-5 minutes What style would you like? • Pop (default) • Rock • Jazz • Classical • Electronic ``` ### Example 2: User Specifies Style ``` User: Turn this humming into a jazz song AI: 🎷 Jazz transformation incoming! Processing your humming with jazz style: • Swing rhythm • Walking bass line • Piano chords • Brush drums [Processing...] ✅ Your jazz song is ready! 📁 Output: ~/Music/hum2song_jazz_20240321.mp3 ``` ### Example 3: User Wants MIDI Output ``` User: Can I get the MIDI file to edit myself? AI: 🎼 Absolutely! I'll generate: • Extracted MIDI (raw melody) • Enhanced MIDI (quantized, cleaned) • Full arrangement MIDI (all instruments) All files will be in: ~/Music/hum2song_export/ ``` --- ## Technical Details ### Audio Processing **Input Formats:** WAV, MP3, M4A, FLAC, OGG **Sample Rate:** Automatically converted to 44.1kHz **Channels:** Mono/Stereo → Mono for processing ### MIDI Extraction **Model:** Basic Pitch (Spotify, ICASSP 2022) **Pitch Range:** C1 to C8 **Note Detection:** Polyphonic capable **Timing Resolution:** 10ms ### Music Generation **ACE-Step Model:** - Size: 1B parameters (base), 3B (large) - Training: Licensed music dataset - Output: 44.1kHz stereo - Latency: ~1s per second of audio on M1 Mac **SoundFont Synthesis:** - No AI required - Real-time synthesis - High-quality instrument sounds - Deterministic output --- ## Limitations - Requires local Python environment setup - ACE-Step needs ~4GB RAM for base model - Processing time: 2-5 minutes for a 2-minute song - Quality depends on humming clarity - Complex harmonies may not be fully captured --- ## Privacy & Security ✅ **All processing is local** - Your audio never leaves your machine ✅ **No cloud services** - No API keys or external uploads ✅ **Open source tools** - Basic Pitch, ACE-Step, Pretty MIDI ✅ **No data collection** - Nothing is logged or transmitted --- ## References - `basic-pitch.md` - Audio to MIDI extraction - `ace-step.md` - AI music generation - `pretty_midi.md` - MIDI processing - `librosa.md` - Audio analysis utilities --- ## Technical Information | Attribute | Value | |-----------|-------| | **Name** | Hum2Song | | **Slug** | hum2song | | **Version** | 3.0.4 | | **Category** | Audio / Music Generation | | **Tags** | music, audio, midi, ai-generation, local-processing | | **License** | MIT-0 | --- **Note:** This skill requires local setup of Python dependencies. All audio processing happens on your device for maximum privacy.

Hum2Song

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

Hum2Song

Hum2Song

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement