Technical question: matching audio duration with rubberband when using minterpolate in FFmpeg, and controlling rubberband's parameters
Hey there, let's work through your FFmpeg rubberband issue step by step — I’ve run into similar sync problems before, so I know where you’re coming from.
First, why your current setup isn’t working
The asetpts=PTS*16 you added to the audio chain is being completely ignored by the rubberband filter. Here’s why: rubberband is a time-stretching/pitch-shifting filter that recalculates audio timestamps internally based on its own tempo parameter. Any asetpts you apply before rubberband gets overwritten once the filter processes the audio.
Also, it sounds like you thought pitch was controlling duration, but that’s a misunderstanding. By default, rubberband lets you adjust pitch and tempo independently — the reason your audio length was tied to pitch before is likely because you weren’t properly setting the tempo parameter to match your video’s slowdown factor.
Fixing audio-video sync & independent pitch control
Your video is slowed down 16x (via setpts=PTS*16), so your audio needs to be stretched to 16x its original length too — that means setting tempo=0.0625 (since tempo is the inverse of duration scaling: 1/16 = 0.0625). You can set your desired pitch value independently (like 0.15) without affecting the duration.
Here’s your corrected command:
ffmpeg -ss 123.978571 -i "original-120fps.MP4" -filter_complex "[0:v]trim=0:0.375,setpts=PTS-STARTPTS,minterpolate=fps=480,setpts=PTS*16[slowv]; [0:a]atrim=0:0.375,asetpts=PTS-STARTPTS,rubberband=tempo=0.0625:pitch=0.15[slowa]" -y -r 60 -map "[slowv]" -map "[slowa]" -preset veryfast "output.mp4"
- Removed the redundant asetpts=PTS*16 from the audio chain; rubberband handles the timestamp scaling now.
- tempo=0.0625 ensures your audio stretches to match the 16x slowdown of your video.
- pitch=0.15 sets your desired pitch independently, with no impact on duration.
- Dropped the trailing ";" from the filtergraph and quoted the -map labels so the shell doesn't try to expand the brackets.
- No mode=... option is used: FFmpeg's rubberband filter doesn't have one. Tuning for speech versus music is done with options such as transients and formant, covered in the parameter list further down.
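If the slowdown factor changes, the tempo has to change with it, so it can help to compute both numbers rather than type them by hand. The snippet below is a minimal sketch of that arithmetic; tempo_for_slowdown and stretched_duration are hypothetical helper names of mine, not FFmpeg features, and the 0.375 s clip length is taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper: convert a video slowdown factor into a rubberband tempo.
# tempo is the inverse of the duration scaling, so 16x slower means 1/16.
tempo_for_slowdown() {
    awk -v f="$1" 'BEGIN { printf "%.6g\n", 1 / f }'
}

# Hypothetical helper: expected output duration in seconds for a clip of
# length $1 slowed down by factor $2.
stretched_duration() {
    awk -v d="$1" -v f="$2" 'BEGIN { printf "%.6g\n", d * f }'
}

tempo_for_slowdown 16          # value to pass as rubberband's tempo
stretched_duration 0.375 16    # length both streams should end up with
```

After encoding, you can compare that expected length against what ffprobe reports for the file, for example with ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 output.mp4.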
Understanding rubberband’s key parameters (since FFmpeg docs are sparse)
Let’s break down the most useful rubberband settings you’ll need:
- pitch: Pitch scaling factor (1.0 = original pitch). Values <1 lower the pitch, values >1 raise it. As you noticed, going below ~0.08 results in subsonic frequencies. That's just physics: you're shifting the pitch so far down that human ears can't pick up most of it.
- tempo: Speed/duration scaling factor (1.0 = original speed). Values <1 slow the audio (lengthen duration), values >1 speed it up (shorten duration). This is the only parameter that controls audio length when using rubberband.
- transients: Controls how the filter handles sharp audio peaks (like drum hits):
  - crisp: Preserves sharp transients (great for percussive music)
  - mixed: A compromise between crisp and smooth
  - smooth: Softens transients (better for gentle vocals/ambient audio)
- phase: Controls how phases are adjusted between neighbouring frequency bins (this is not a stereo setting):
  - laminar: Tries to keep phase relationships between adjacent bins continuous
  - independent: Adjusts each bin's phase on its own, which usually sounds a little softer and "phasier"
- formant: Matters when you shift pitch on voices:
  - shifted: Formants move along with the pitch
  - preserved: Tries to keep the vocal formants in place, so a pitched voice sounds less like a different speaker
- channels: Controls how multi-channel audio is handled:
  - apart: Favours the fidelity of each individual channel
  - together: Favours synchronisation between channels, treating stereo as a pair
The filter has no single voice/music mode switch; the options above are how you tune it for the material.
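When you start combining several of these options, it is easy to get the key=value:key=value syntax wrong. Here is a small sketch that assembles the filter string for you; rubberband_filter is a hypothetical helper of mine, and the particular values shown (transients=smooth plus formant=preserved, an option from the FFmpeg rubberband documentation) are just an assumption about what might suit speech, so test them by ear.

```shell
#!/bin/sh
# Hypothetical helper: build a rubberband filter string from a tempo, a pitch
# and any number of extra key=value options, joined with ":" as FFmpeg expects.
rubberband_filter() {
    tempo=$1
    pitch=$2
    shift 2
    filter="rubberband=tempo=${tempo}:pitch=${pitch}"
    for opt in "$@"; do
        filter="${filter}:${opt}"
    done
    printf '%s\n' "$filter"
}

# Example: the 16x slowdown settings plus two tuning options.
rubberband_filter 0.0625 0.15 transients=smooth formant=preserved
```

The printed string can be dropped into -filter:a or into the audio chain of a -filter_complex graph.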
Tips for testing rubberband settings
Instead of testing on full video files, iterate quickly with audio-only test commands to dial in your pitch/tempo:
ffmpeg -i "original-120fps.MP4" -ss 123.978571 -t 0.375 -filter:a rubberband=tempo=0.0625:pitch=0.15 -c:a libmp3lame test_audio.mp3
This lets you listen to the audio result without waiting for video processing.
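To compare several pitch values side by side, you can generate one test command per value. This is only a dry-run sketch: print_pitch_tests is a hypothetical helper that prints the commands rather than running them, and the output file names are my own invention.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print one audio-only test command per pitch
# value so several candidates can be rendered and compared by ear.
# $1 is the input file, the remaining arguments are pitch values.
print_pitch_tests() {
    input=$1
    shift
    for pitch in "$@"; do
        printf 'ffmpeg -i "%s" -ss 123.978571 -t 0.375 -filter:a rubberband=tempo=0.0625:pitch=%s -c:a libmp3lame "test_pitch_%s.mp3"\n' \
            "$input" "$pitch" "$pitch"
    done
}

print_pitch_tests "original-120fps.MP4" 0.15 0.25 0.5
```

Once the printed commands look right, pipe the output to sh to actually run them.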
Note: content sourced from Stack Exchange; question asked by bossturbo.




