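How can I capture loopback/output-mix audio on iOS for audio visualization?
A guide to sampling and visualizing audio streams on iOS
Hey, as someone new to iOS development, you want to capture an audio stream for visualization the way Android's Visualizer does? That's doable, with one caveat: iOS has no system-mix capture API that corresponds directly to Android's Visualizer, so the approach depends on what you need to capture. Here's how I'd break it down: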
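Approaches by scenario
Scenario 1: Capturing your own app's audio output
If you only need to visualize audio that your own app plays (local music, network audio, and so on), AVAudioEngine is the most direct option and needs no special permissions:
- Build the playback chain with AVAudioEngine and connect the player node to a mixer node
- Install a tap on the mixer node to receive audio buffers in real time (each callback delivers a buffer of many frames, and the buffer size you request is only a hint)
- Once you have the data, you can run spectrum analysis and drive the visualization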
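Scenario 2: Capturing system-wide audio output
If you need the sound of other apps (the system music player, a video app, and so on), iOS is restrictive for privacy reasons. ScreenCaptureKit is not an option here: it is a macOS framework (macOS 12.3+, with audio capture added in macOS 13). On iOS, the closest supported route I know of is a ReplayKit Broadcast Upload Extension, which receives app audio while the user is running a screen broadcast:
- The user has to start the broadcast explicitly from the system broadcast picker; your app cannot begin capturing silently
- The extension receives sample buffers; keep the ones typed as app audio and extract the PCM data from them
- Process the audio data the same way as in scenario 1 to drive the visualization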
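To make the iOS route more concrete, here is a minimal sketch of a ReplayKit broadcast extension handler. Treat it as an outline under assumptions rather than a finished implementation: the class name `SampleHandler` and the helper `floatSamples(fromInt16Bytes:bigEndian:)` are illustrative names of my own, and the code assumes the app audio arrives as 16-bit linear PCM, which it checks at runtime instead of taking for granted. The handler is wrapped in `#if os(iOS)` so the helper can be compiled and tried on its own elsewhere.

```swift
import Foundation

/// Converts 16-bit PCM bytes to Float samples in roughly -1...1.
/// `bigEndian` should be read from the buffer's format flags, not assumed.
func floatSamples(fromInt16Bytes bytes: [UInt8], bigEndian: Bool) -> [Float] {
    stride(from: 0, to: bytes.count - 1, by: 2).map { i in
        let high = bigEndian ? bytes[i] : bytes[i + 1]
        let low = bigEndian ? bytes[i + 1] : bytes[i]
        let value = Int16(bitPattern: UInt16(high) << 8 | UInt16(low))
        return Float(value) / 32768
    }
}

#if os(iOS)
import ReplayKit
import CoreMedia

// Principal class of a Broadcast Upload Extension target (name is illustrative).
final class SampleHandler: RPBroadcastSampleHandler {
    override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer,
                                      with sampleBufferType: RPSampleBufferType) {
        // Keep only app audio; microphone audio and video frames are ignored here.
        guard sampleBufferType == .audioApp,
              let format = CMSampleBufferGetFormatDescription(sampleBuffer),
              let asbd = CMAudioFormatDescriptionGetStreamBasicDescription(format)?.pointee,
              asbd.mFormatID == kAudioFormatLinearPCM, asbd.mBitsPerChannel == 16,
              let block = CMSampleBufferGetDataBuffer(sampleBuffer) else { return }

        // Copy the raw PCM bytes out of the block buffer.
        let length = CMBlockBufferGetDataLength(block)
        var bytes = [UInt8](repeating: 0, count: length)
        guard CMBlockBufferCopyDataBytes(block, atOffset: 0, dataLength: length,
                                         destination: &bytes) == kCMBlockBufferNoErr else { return }

        let bigEndian = asbd.mFormatFlags & kAudioFormatFlagIsBigEndian != 0
        let samples = floatSamples(fromInt16Bytes: bytes, bigEndian: bigEndian)
        // `samples` may be interleaved (L, R, L, R, ...); pick one channel
        // before handing the data to the FFT step.
        _ = samples
    }
}
#endif
```

Keep in mind that the extension runs in its own process, so the spectrum data still has to be passed to whatever draws the UI (for example through an app group); that part is left out here.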
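The core of audio visualization: FFT spectrum conversion
Whichever scenario applies, the time-domain samples you receive need to be converted to the frequency domain (a spectrum) before they make for an intuitive visualization such as a bar spectrum. The Accelerate framework on iOS provides an efficient FFT (fast Fourier transform):
- Use vDSP_fft_zrip to convert the time-domain audio data to frequency-domain data
- Compute the magnitude spectrum or decibel values and use them to drive the visual elements (for example, the height of each bar)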
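The last step, going from per-bin values to bar heights, can be sketched in plain Swift. This is only one possible mapping: the function names, the equal-width grouping of bins, and the default display range of -80 dB to 0 dB are my own assumptions, and the right range depends on how your decibel values are referenced.

```swift
import Foundation

/// Frequency in Hz represented by FFT bin `index`, for an FFT over `fftSize` real samples.
func binFrequency(index: Int, fftSize: Int, sampleRate: Double) -> Double {
    Double(index) * sampleRate / Double(fftSize)
}

/// Groups per-bin decibel values into `barCount` equal-width bars (averaged)
/// and maps each bar from [floorDB, ceilingDB] to a height in 0...1.
func barHeights(decibels: [Float], barCount: Int,
                floorDB: Float = -80, ceilingDB: Float = 0) -> [Float] {
    guard barCount > 0, decibels.count >= barCount, ceilingDB > floorDB else { return [] }
    let binsPerBar = decibels.count / barCount
    return (0..<barCount).map { bar in
        let start = bar * binsPerBar
        let slice = decibels[start..<(start + binsPerBar)]
        let average = slice.reduce(0, +) / Float(binsPerBar)
        let normalized = (average - floorDB) / (ceilingDB - floorDB)
        return min(max(normalized, 0), 1)   // clamp values outside the display range
    }
}
```

With a 1024-point FFT at 44.1 kHz, neighbouring bins are about 43 Hz apart, so 512 bins collapsed into 32 bars gives each bar roughly 690 Hz. Many visualizers use logarithmic grouping instead, because equal-width bars crowd most musical content into the first few bars.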
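Permission notes
- Your own app's audio: no microphone permission is needed, but the audio session must be configured correctly (for example, with the AVAudioSession.Category.playback category)
- System-wide audio: on iOS with ReplayKit there is, as far as I know, no permission your app requests up front; the user consents by starting the broadcast from the system picker. On macOS, ScreenCaptureKit requires the user to grant the Screen Recording permission in System Settings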
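For the in-app case, the session setup mentioned above is only a few lines. A minimal sketch, assuming a plain playback app (the function name `configurePlaybackSession` is illustrative, and you may want a different mode or options):

```swift
import AVFoundation

/// Call once before starting the audio engine (hypothetical helper name).
func configurePlaybackSession() throws {
    let session = AVAudioSession.sharedInstance()
    // .playback keeps audio going with the silent switch on and needs no microphone access.
    try session.setCategory(.playback, mode: .default)
    try session.setActive(true)
}
```

AVAudioSession is not available to native macOS apps, so this part applies to iOS only.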
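Code examples
Core code for capturing in-app audio

    import AVFoundation
    import Accelerate

    final class AppAudioVisualizer {
        private let engine = AVAudioEngine()
        private let playerNode = AVAudioPlayerNode()
        private let mixerNode = AVAudioMixerNode()

        // The FFT size must be a power of two: 2^10 = 1024.
        private let fftSize = 1024
        private let log2n: vDSP_Length = 10
        // Create the FFT setup once; building it for every buffer is wasteful.
        private let fftSetup = vDSP_create_fftsetup(10, FFTRadix(kFFTRadix2))

        deinit {
            if let setup = fftSetup { vDSP_destroy_fftsetup(setup) }
        }

        func startVisualizing(with audioURL: URL) throws {
            // Load the audio file
            let audioFile = try AVAudioFile(forReading: audioURL)
            let audioFormat = audioFile.processingFormat

            // Build the chain: player -> mixer -> main mixer (which feeds the output
            // and handles any sample-rate conversion to the hardware format)
            engine.attach(playerNode)
            engine.attach(mixerNode)
            engine.connect(playerNode, to: mixerNode, format: audioFormat)
            engine.connect(mixerNode, to: engine.mainMixerNode, format: audioFormat)

            // Tap the mixer output; the buffer size is a request and may not be honored
            mixerNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(fftSize),
                                 format: audioFormat) { [weak self] buffer, _ in
                self?.processAudioBuffer(buffer)
            }

            // Start playback
            playerNode.scheduleFile(audioFile, at: nil)
            try engine.start()
            playerNode.play()
        }

        private func processAudioBuffer(_ buffer: AVAudioPCMBuffer) {
            guard let fftSetup = fftSetup,
                  let channelData = buffer.floatChannelData else { return }

            // Copy the first channel, truncated or zero-padded to fftSize samples
            let copyCount = min(Int(buffer.frameLength), fftSize)
            var samples = [Float](repeating: 0, count: fftSize)
            samples.replaceSubrange(0..<copyCount,
                                    with: UnsafeBufferPointer(start: channelData[0], count: copyCount))

            let halfSize = fftSize / 2
            var realParts = [Float](repeating: 0, count: halfSize)
            var imagParts = [Float](repeating: 0, count: halfSize)
            var magnitudes = [Float](repeating: 0, count: halfSize)

            realParts.withUnsafeMutableBufferPointer { realPtr in
                imagParts.withUnsafeMutableBufferPointer { imagPtr in
                    var split = DSPSplitComplex(realp: realPtr.baseAddress!, imagp: imagPtr.baseAddress!)
                    // Pack the real samples into split-complex form, as vDSP_fft_zrip expects
                    samples.withUnsafeBufferPointer { samplePtr in
                        samplePtr.baseAddress!.withMemoryRebound(to: DSPComplex.self, capacity: halfSize) {
                            vDSP_ctoz($0, 2, &split, 1, vDSP_Length(halfSize))
                        }
                    }
                    // Forward FFT in place, then squared magnitudes per bin
                    // (in the packed result, bin 0 holds DC in realp and Nyquist in imagp)
                    vDSP_fft_zrip(fftSetup, &split, 1, log2n, FFTDirection(FFT_FORWARD))
                    vDSP_zvmags(&split, 1, &magnitudes, 1, vDSP_Length(halfSize))
                }
            }

            // Convert power to decibels relative to 1.0 (flag 0 = power);
            // silent bins come out as -infinity, so clamp before drawing
            var reference: Float = 1
            var decibels = [Float](repeating: 0, count: halfSize)
            vDSP_vdbcon(magnitudes, 1, &reference, &decibels, 1, vDSP_Length(halfSize), 0)

            // Update the visualization on the main thread
            DispatchQueue.main.async {
                // Pass the decibels array to your UI component here, e.g. to update bar heights
                _ = decibels
            }
        }
    }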
Core code for capturing system-wide audio (ScreenCaptureKit, macOS 13+ only)

    import ScreenCaptureKit
    import CoreMedia

    // ScreenCaptureKit is a macOS framework, so this class does not build for iOS.
    @available(macOS 13.0, *)
    final class SystemAudioVisualizer: NSObject, SCStreamOutput {
        private var captureStream: SCStream?
        private let audioQueue = DispatchQueue(label: "system-audio-capture")

        func startSystemAudioCapture() async throws {
            // This call throws if the user has not granted the Screen Recording permission
            let shareableContent = try await SCShareableContent.excludingDesktopWindows(
                false, onScreenWindowsOnly: true)
            guard let mainDisplay = shareableContent.displays.first else { return }

            // Configure the capture stream
            let streamConfig = SCStreamConfiguration()
            streamConfig.capturesAudio = true               // enable audio capture
            streamConfig.excludesCurrentProcessAudio = true // skip this app's own sound

            // Create the stream, register for audio output, and start it
            let filter = SCContentFilter(display: mainDisplay, excludingWindows: [])
            let stream = SCStream(filter: filter, configuration: streamConfig, delegate: nil)
            try stream.addStreamOutput(self, type: .audio, sampleHandlerQueue: audioQueue)
            try await stream.startCapture()
            captureStream = stream
        }

        func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer,
                    of type: SCStreamOutputType) {
            guard type == .audio else { return }

            // Extract the audio data (assumes 32-bit float PCM, one buffer per channel)
            try? sampleBuffer.withAudioBufferList { audioBufferList, _ in
                for audioBuffer in audioBufferList {
                    guard let data = audioBuffer.mData else { continue }
                    let sampleCount = Int(audioBuffer.mDataByteSize) / MemoryLayout<Float>.stride
                    let samples = Array(UnsafeBufferPointer(
                        start: data.assumingMemoryBound(to: Float.self), count: sampleCount))
                    // Reuse the FFT logic above here to turn samples into spectrum data for the UI
                    _ = samples
                }
            }
        }
    }
The question comes from Stack Exchange and was asked by Jon Halliday.




