
How can I capture the loopback / output mix on iOS for audio visualization?

A guide to sampling and visualizing audio streams on iOS

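Hey, as someone new to iOS development, you want to capture the audio stream for visualization the way Android's Visualizer does? For audio that your own app plays, that is entirely doable. iOS has no direct counterpart to the system-mix capture API behind Android's Visualizer, though, so the approach depends on what you need. Here is how I would break it down: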

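Approaches by scenario

Scenario 1: capturing your own app's audio output

If you only need to visualize audio that your own app plays (local music, network audio, and so on), AVAudioEngine is the most direct option and needs no special permission:

  • Build the playback chain with AVAudioEngine and connect the player node to a mixer node
  • Install a tap on the mixer node to receive each rendered audio buffer in near real time
  • Run spectrum analysis on that data and use the result to drive the visualization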

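Scenario 2: capturing system-wide audio output

If you want to capture sound played by other apps (the system Music app, a video app, and so on), the options are much narrower, for privacy reasons. ScreenCaptureKit, used in the code further down, is a macOS framework (macOS 12.3+, with audio capture from macOS 13) and is not available to iOS apps. On iOS itself, the system-provided route I know of is a ReplayKit broadcast upload extension, which receives app audio only while the user is running a screen broadcast:

  • The user has to grant screen recording first: in System Settings on macOS, or by starting the broadcast from the system picker on iOS
  • Create a capture stream that includes audio and pull the audio sample buffers out of it
  • Process the audio data the same way as in scenario 1 to drive the visualization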
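
For iOS itself, the closest system-provided mechanism I am aware of is a ReplayKit broadcast upload extension. The sketch below is a minimal outline under that assumption, not a complete implementation: the class name SampleHandler follows Xcode's extension template, handleAudio is a hypothetical hook of my own, and the extension runs in a separate process, so getting data back to the host app (for example through an app group) is left out.

```swift
import ReplayKit
import CoreMedia

// Principal class of a Broadcast Upload Extension target.
// It only receives buffers while the user is running a broadcast.
class SampleHandler: RPBroadcastSampleHandler {

    override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer,
                                      with sampleBufferType: RPSampleBufferType) {
        // .audioApp carries app audio; .audioMic and .video are ignored here.
        guard sampleBufferType == .audioApp,
              let description = CMSampleBufferGetFormatDescription(sampleBuffer),
              let asbd = CMAudioFormatDescriptionGetStreamBasicDescription(description)?.pointee
        else { return }

        // Inspect the delivered format instead of assuming 32-bit float samples.
        let isFloat = (asbd.mFormatFlags & kAudioFormatFlagIsFloat) != 0
        let frameCount = CMSampleBufferGetNumSamples(sampleBuffer)
        handleAudio(sampleBuffer,
                    sampleRate: asbd.mSampleRate,
                    channelCount: Int(asbd.mChannelsPerFrame),
                    isFloat: isFloat,
                    frameCount: frameCount)
    }

    // Hypothetical hook: convert the samples to Float, run the FFT,
    // and forward the resulting spectrum to the host app.
    private func handleAudio(_ sampleBuffer: CMSampleBuffer,
                             sampleRate: Double,
                             channelCount: Int,
                             isFloat: Bool,
                             frameCount: Int) {
    }
}
```

Whether a given app's audio actually shows up in the broadcast depends on that app and on the system, so test on a device with the sources you care about.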

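The core of audio visualization: FFT spectrum conversion

In either scenario, what you receive is time-domain audio data. For spectrum-style visuals such as a bar graph, it has to be converted to frequency-domain data (a spectrum) first. The Accelerate framework provides an efficient FFT (fast Fourier transform) for this:

  • vDSP_fft_zrip converts the time-domain audio data to frequency-domain data
  • Compute the magnitude spectrum or decibel values and use them to drive the visual elements on screen (for example, the bar heights)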
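
To connect FFT output to bars on screen, it helps to know which frequency each bin represents and how to reduce hundreds of bins to a handful of bars. The helpers below are a minimal sketch; the names binFrequency and bandAverages are my own, and the equal-width banding is only one possible choice (many visualizers use logarithmic bands instead).

```swift
import Foundation

/// Frequency in Hz represented by FFT bin `bin` of an `fftSize`-point transform.
func binFrequency(bin: Int, fftSize: Int, sampleRate: Double) -> Double {
    return Double(bin) * sampleRate / Double(fftSize)
}

/// Averages consecutive bins into `bandCount` equal-width bands, one per bar.
/// Trailing bins that do not fill a whole band are dropped.
func bandAverages(_ spectrum: [Float], bandCount: Int) -> [Float] {
    guard bandCount > 0, spectrum.count >= bandCount else { return [] }
    let binsPerBand = spectrum.count / bandCount
    return (0..<bandCount).map { band -> Float in
        let start = band * binsPerBand
        let slice = spectrum[start..<(start + binsPerBand)]
        return slice.reduce(0, +) / Float(binsPerBand)
    }
}
```

With a 1024-point FFT at 44.1 kHz, each bin is about 43 Hz wide, so the 512 usable bins cover 0 Hz up to just under 22.05 kHz.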

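Permission notes

  • Audio from your own app: no microphone permission is needed, but the audio session has to be configured correctly (for example, set the category to AVAudioSession.Category.playback)
  • System-wide audio: the user has to allow screen recording. I am not aware of a documented Info.plist key named NSScreenRecordingUsageDescription; on macOS the system shows its own prompt and the user enables the app under Screen Recording in System Settings, and on iOS the user consents by starting the broadcast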
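
The audio session setup mentioned above can be sketched as follows. configurePlaybackSession is a name I made up for illustration; the assumption is that you call it once before starting the engine.

```swift
import AVFoundation

/// Puts the shared audio session into the playback category and activates it.
func configurePlaybackSession() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback, mode: .default)
    try session.setActive(true)
}
```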

Code examples

Core code for capturing in-app audio

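import AVFoundation
import Accelerate

final class AppAudioVisualizer {
    private let engine = AVAudioEngine()
    private let playerNode = AVAudioPlayerNode()
    private let mixerNode = AVAudioMixerNode()

    // The FFT size must be a power of two; the setup is created once and reused.
    private let fftSize = 1024
    private let log2n: vDSP_Length = 10 // log2(1024)
    private let fftSetup = vDSP_create_fftsetup(10, FFTRadix(kFFTRadix2))

    // Called on the main thread with one decibel value per frequency bin.
    var onSpectrum: (([Float]) -> Void)?

    deinit { vDSP_destroy_fftsetup(fftSetup) }

    func startVisualizing(with audioURL: URL) throws {
        // Load the audio file
        let audioFile = try AVAudioFile(forReading: audioURL)
        let audioFormat = audioFile.processingFormat

        // Build the chain: player -> mixer -> main mixer (which feeds the output)
        engine.attach(playerNode)
        engine.attach(mixerNode)
        engine.connect(playerNode, to: mixerNode, format: audioFormat)
        engine.connect(mixerNode, to: engine.mainMixerNode, format: audioFormat)

        // Observe the rendered buffers; bufferSize is a request, not a guarantee
        mixerNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(fftSize), format: audioFormat) { [weak self] buffer, _ in
            self?.processAudioBuffer(buffer)
        }

        // Start playback
        playerNode.scheduleFile(audioFile, at: nil)
        try engine.start()
        playerNode.play()
    }

    private func processAudioBuffer(_ buffer: AVAudioPCMBuffer) {
        guard let fftSetup = fftSetup,
              let channelData = buffer.floatChannelData,
              Int(buffer.frameLength) >= fftSize else { return }
        let halfSize = fftSize / 2
        let binCount = vDSP_Length(halfSize)

        var realParts = [Float](repeating: 0, count: halfSize)
        var imagParts = [Float](repeating: 0, count: halfSize)
        var magnitudes = [Float](repeating: 0, count: halfSize)

        realParts.withUnsafeMutableBufferPointer { realPtr in
            imagParts.withUnsafeMutableBufferPointer { imagPtr in
                var split = DSPSplitComplex(realp: realPtr.baseAddress!, imagp: imagPtr.baseAddress!)
                // Pack the first channel into split-complex form (even samples -> real, odd -> imaginary)
                channelData[0].withMemoryRebound(to: DSPComplex.self, capacity: halfSize) {
                    vDSP_ctoz($0, 2, &split, 1, binCount)
                }
                // In-place real FFT; bin 0 holds DC (real) and Nyquist (imaginary) packed together
                vDSP_fft_zrip(fftSetup, &split, 1, log2n, FFTDirection(FFT_FORWARD))
                // Squared magnitude (power) of each bin
                vDSP_zvmags(&split, 1, &magnitudes, 1, binCount)
            }
        }

        // Convert power to decibels relative to 1.0 (flag 0 = power, 10 * log10).
        // Silent bins come out as -inf, so clamp the values before drawing.
        var decibels = [Float](repeating: 0, count: halfSize)
        var reference: Float = 1
        vDSP_vdbcon(magnitudes, 1, &reference, &decibels, 1, binCount, 0)

        // Update the visualization on the main thread
        let spectrum = decibels
        DispatchQueue.main.async { [weak self] in
            // Hand the values to your UI component, for example to update bar heights
            self?.onSpectrum?(spectrum)
        }
    }
}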
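
Raw decibel values are not directly usable as bar heights: they are negative for quiet signals, unbounded below, and jump around from buffer to buffer. The following is a minimal sketch of one way to tame them. The function names, the -80 to 0 dB range and the decay factor are all assumptions to tune against what your FFT actually produces.

```swift
import Foundation

/// Maps a decibel value onto 0...1, where `floorDB` is an empty bar and `ceilingDB` a full bar.
func normalizedHeight(decibels: Float, floorDB: Float = -80, ceilingDB: Float = 0) -> Float {
    guard decibels.isFinite else { return 0 } // -inf or NaN from silent bins
    let clamped = min(max(decibels, floorDB), ceilingDB)
    return (clamped - floorDB) / (ceilingDB - floorDB)
}

/// Lets bars rise immediately but fall back gradually, which reduces flicker.
func smoothed(previous: [Float], target: [Float], decay: Float = 0.85) -> [Float] {
    guard previous.count == target.count else { return target }
    return zip(previous, target).map { old, new in max(new, old * decay) }
}
```

A typical use is to keep the last set of heights around, compute the new ones with normalizedHeight, and pass both through smoothed before redrawing.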

Core code for capturing system-wide audio (macOS, ScreenCaptureKit)

import ScreenCaptureKit
import CoreMedia

// ScreenCaptureKit audio capture requires macOS 13 or later; it is not available to iOS apps.
@available(macOS 13.0, *)
final class SystemAudioVisualizer: NSObject, SCStreamOutput {
    private var captureStream: SCStream?
    private let audioQueue = DispatchQueue(label: "SystemAudioVisualizer.audio")

    func startSystemAudioCapture() async throws {
        // Querying shareable content throws if screen recording has not been allowed
        let shareableContent = try await SCShareableContent.excludingDesktopWindows(false, onScreenWindowsOnly: true)
        guard let mainDisplay = shareableContent.displays.first else { return }

        // Configure the capture stream
        let streamConfig = SCStreamConfiguration()
        streamConfig.capturesAudio = true               // enable audio capture
        streamConfig.excludesCurrentProcessAudio = true // skip this app's own sound

        // Create the stream, register for audio sample buffers, then start it
        let filter = SCContentFilter(display: mainDisplay, excludingWindows: [])
        let stream = SCStream(filter: filter, configuration: streamConfig, delegate: nil)
        try stream.addStreamOutput(self, type: .audio, sampleHandlerQueue: audioQueue)
        try await stream.startCapture()
        captureStream = stream
    }

    func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) {
        guard type == .audio else { return }

        // Extract the audio data; this assumes 32-bit float PCM, so check the
        // buffer's format description if the numbers look wrong
        try? sampleBuffer.withAudioBufferList { bufferList, _ in
            for audioBuffer in bufferList {
                guard let data = audioBuffer.mData else { continue }
                let sampleCount = Int(audioBuffer.mDataByteSize) / MemoryLayout<Float>.stride
                let samples = data.assumingMemoryBound(to: Float.self)
                let samplesArray = Array(UnsafeBufferPointer(start: samples, count: sampleCount))

                // Reuse the FFT logic from the in-app example on samplesArray,
                // then update the UI with the resulting spectrum
                _ = samplesArray
            }
        }
    }
}
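
The callback above treats every buffer as 32-bit float samples. That is an assumption about the capture source: some capture paths hand you interleaved 16-bit integer PCM instead, and binding that memory to Float produces garbage. As a sketch of the conversion step, the hypothetical helper below (the name and the simple channel-averaging downmix are my own choices) turns interleaved Int16 frames into mono Float samples in the range -1...1, ready for the FFT.

```swift
import Foundation

/// Converts interleaved 16-bit PCM into mono Float samples by averaging the channels.
/// Samples that do not form a complete frame at the end are ignored.
func monoFloats(fromInterleavedInt16 samples: [Int16], channelCount: Int) -> [Float] {
    guard channelCount > 0 else { return [] }
    let frameCount = samples.count / channelCount
    let scale: Float = 1.0 / 32768.0
    return (0..<frameCount).map { frame -> Float in
        var sum: Float = 0
        for channel in 0..<channelCount {
            sum += Float(samples[frame * channelCount + channel]) * scale
        }
        return sum / Float(channelCount)
    }
}
```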

The question in this post comes from Stack Exchange; it was asked by Jon Halliday.
