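Android Development: How to Get the Amplitude and Frequency of a WAV File at a Given Time
Generating arrays of time, amplitude, and frequency from a WAV file on Android
Hey, let me walk you through this step by step! To build an array of time, amplitude, and frequency values, the core idea is to parse the raw sample data out of the WAV file, analyze it one second at a time, and collect the results in a loop. Here is the approach with code examples:
1. Preparation: parse the WAV file to get the key parameters and the PCM data
First, read the WAV file and parse its sample rate (samples per second), bit depth (for example 16 bits), and channel count, then extract the raw PCM sample data. Plain Java file streams are enough for this: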
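    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.util.Arrays;

    // Read the target WAV file into memory (try-with-resources closes both streams)
    File wavFile = new File("/storage/emulated/0/your_audio.wav");
    byte[] wavData;
    try (FileInputStream fis = new FileInputStream(wavFile);
         ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
        byte[] buffer = new byte[1024];
        int bytesRead;
        while ((bytesRead = fis.read(buffer)) != -1) {
            baos.write(buffer, 0, bytesRead);
        }
        wavData = baos.toByteArray();
    }
    if (wavData.length < 44) {
        throw new IOException("File is too short to contain a WAV header");
    }

    // Parse the core header fields (simplified: assumes the canonical 44-byte
    // header of a typical 16-bit mono WAV). Real code should parse the header
    // properly to support multiple channels and other bit depths.
    int sampleRate = ByteBuffer.wrap(wavData, 24, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    int bitsPerSample = ByteBuffer.wrap(wavData, 34, 2).order(ByteOrder.LITTLE_ENDIAN).getShort();
    int channels = ByteBuffer.wrap(wavData, 22, 2).order(ByteOrder.LITTLE_ENDIAN).getShort();

    // Extract the raw PCM data (skip the header, which is usually the first 44 bytes)
    byte[] pcmData = Arrays.copyOfRange(wavData, 44, wavData.length);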
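The fixed 44-byte offset only holds for the plainest WAV files; files that carry extra chunks (a LIST metadata chunk, for instance) move the audio data further back. As a rough sketch of a more tolerant parser, the class below walks the RIFF chunks and picks up `fmt ` and `data` wherever they sit. `WavInfo` and its fields are hypothetical names I made up for illustration, and it still assumes uncompressed PCM that fits in memory:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper: locates the "fmt " and "data" chunks instead of assuming offset 44. */
final class WavInfo {
    final int channels;
    final int sampleRate;
    final int bitsPerSample;
    final byte[] pcmData;

    private WavInfo(int channels, int sampleRate, int bitsPerSample, byte[] pcmData) {
        this.channels = channels;
        this.sampleRate = sampleRate;
        this.bitsPerSample = bitsPerSample;
        this.pcmData = pcmData;
    }

    static WavInfo parse(byte[] wavData) {
        if (wavData.length < 12
                || !"RIFF".equals(tag(wavData, 0))
                || !"WAVE".equals(tag(wavData, 8))) {
            throw new IllegalArgumentException("Not a RIFF/WAVE file");
        }
        ByteBuffer buf = ByteBuffer.wrap(wavData).order(ByteOrder.LITTLE_ENDIAN);
        int channels = 0;
        int sampleRate = 0;
        int bitsPerSample = 0;
        byte[] pcm = null;
        int pos = 12; // first chunk starts right after "RIFF" <size> "WAVE"
        while (pos + 8 <= wavData.length) {
            String id = tag(wavData, pos);
            int size = buf.getInt(pos + 4);
            int body = pos + 8;
            if (size < 0) {
                break; // sizes above 2 GiB are not handled in this sketch
            }
            if ("fmt ".equals(id) && body + 16 <= wavData.length) {
                channels = buf.getShort(body + 2);
                sampleRate = buf.getInt(body + 4);
                bitsPerSample = buf.getShort(body + 14);
            } else if ("data".equals(id)) {
                int end = (int) Math.min((long) body + size, wavData.length);
                pcm = Arrays.copyOfRange(wavData, body, end);
            }
            // Chunks are word-aligned: an odd size is followed by one pad byte
            long next = (long) body + size + (size & 1);
            if (next > wavData.length) {
                break;
            }
            pos = (int) next;
        }
        if (sampleRate <= 0 || pcm == null) {
            throw new IllegalArgumentException("Missing fmt or data chunk");
        }
        return new WavInfo(channels, sampleRate, bitsPerSample, pcm);
    }

    private static String tag(byte[] data, int offset) {
        return new String(data, offset, 4, StandardCharsets.US_ASCII);
    }
}
```

With something like this, `WavInfo.parse(wavData)` would stand in for the three fixed-offset reads and the `copyOfRange(wavData, 44, ...)` call.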
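2. Define a data model class
Create a small class that holds the three fields for each point in time, so the results are easy to store in an ArrayList: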
    public class AudioDataPoint {
        private final double timestamp; // time, in seconds
        private final double amplitude; // amplitude
        private final double frequency; // dominant frequency, in Hz

        public AudioDataPoint(double timestamp, double amplitude, double frequency) {
            this.timestamp = timestamp;
            this.amplitude = amplitude;
            this.frequency = frequency;
        }

        // Add further getters as needed
        public double getTimestamp() { return timestamp; }
        public double getAmplitude() { return amplitude; }
        public double getFrequency() { return frequency; }
    }
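3. Compute the amplitude for each second
The amplitude can be taken as the mean absolute value of all samples in the chunk (which reflects overall loudness) or as the peak value (which reflects the loudest moment). This example uses the mean absolute value: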
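    // Mean absolute sample value of a 16-bit little-endian PCM chunk (range 0 to 32768)
    private double calculateAmplitude(byte[] pcmChunk, int bitsPerSample) {
        if (bitsPerSample != 16) {
            // Only 16-bit PCM is handled here; other bit depths can be added along the same lines
            return 0;
        }
        int sampleCount = pcmChunk.length / 2; // each 16-bit sample takes 2 bytes
        if (sampleCount == 0) {
            return 0; // avoid dividing by zero on an empty chunk
        }
        ByteBuffer samples = ByteBuffer.wrap(pcmChunk).order(ByteOrder.LITTLE_ENDIAN);
        double sum = 0;
        for (int i = 0; i < sampleCount; i++) {
            // Indexing by sample also ignores a stray trailing byte in odd-length chunks
            sum += Math.abs(samples.getShort(i * 2));
        }
        return sum / sampleCount; // mean amplitude
    }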
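If the peak is the better fit for your use case, a minimal sketch could look like the following. `AmplitudeUtils`, `peakAmplitude`, and `toDbfs` are names I invented for illustration; like the function above, it assumes 16-bit little-endian PCM. The dBFS conversion is optional and simply expresses a raw value relative to 16-bit full scale (32768):

```java
/** Hypothetical helper for peak amplitude of 16-bit little-endian PCM. */
final class AmplitudeUtils {

    /** Largest absolute sample value in the chunk (0 to 32768). */
    static int peakAmplitude(byte[] pcmChunk) {
        int peak = 0;
        for (int i = 0; i + 1 < pcmChunk.length; i += 2) {
            // Low byte first, then the (sign-carrying) high byte
            int sample = (short) ((pcmChunk[i] & 0xFF) | (pcmChunk[i + 1] << 8));
            peak = Math.max(peak, Math.abs(sample));
        }
        return peak;
    }

    /** Converts a raw 16-bit amplitude to decibels relative to full scale. */
    static double toDbfs(double amplitude) {
        if (amplitude <= 0) {
            return Double.NEGATIVE_INFINITY; // digital silence
        }
        return 20.0 * Math.log10(amplitude / 32768.0);
    }
}
```

The mean tends to track perceived loudness over the chunk, while the peak is more useful for spotting clipping or short transients.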
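4. Compute the dominant frequency for each second (FFT-based)
To get a frequency, the time-domain samples have to be transformed into the frequency domain. This example uses a simplified FFT (fast Fourier transform):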
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    public class FFTUtils {

        public static double calculateDominantFrequency(byte[] pcmChunk, int sampleRate, int bitsPerSample) {
            if (bitsPerSample != 16) return 0; // only 16-bit mono PCM is handled here
            int available = pcmChunk.length / 2;
            if (available < 2) return 0;
            // The radix-2 FFT below is only correct for power-of-two lengths, so use the
            // largest power of two that fits in the chunk (44100 samples -> 32768).
            int fftSize = Integer.highestOneBit(available);
            double[] real = new double[fftSize];
            double[] imag = new double[fftSize]; // Java zero-initializes the array

            // Convert the PCM bytes to normalized doubles (range -1 to 1)
            ByteBuffer samples = ByteBuffer.wrap(pcmChunk).order(ByteOrder.LITTLE_ENDIAN);
            for (int i = 0; i < fftSize; i++) {
                real[i] = samples.getShort(i * 2) / 32768.0; // 16-bit samples span -32768 to 32767
            }

            // Run the FFT
            fft(real, imag);

            // Find the bin with the largest magnitude. Only the first half is examined
            // (the second half mirrors it), and bin 0 is skipped because it is the DC
            // offset rather than an audible frequency.
            double maxMagnitude = 0;
            int dominantIndex = 0;
            for (int i = 1; i < fftSize / 2; i++) {
                double magnitude = Math.sqrt(real[i] * real[i] + imag[i] * imag[i]);
                if (magnitude > maxMagnitude) {
                    maxMagnitude = magnitude;
                    dominantIndex = i;
                }
            }
            // frequency = (bin index * sample rate) / FFT size
            return (double) dominantIndex * sampleRate / fftSize;
        }

        // Simplified recursive Cooley-Tukey FFT; real.length must be a power of two
        private static void fft(double[] real, double[] imag) {
            int n = real.length;
            if (n == 1) return;
            // Split into even-indexed and odd-indexed samples
            double[] evenReal = new double[n / 2];
            double[] evenImag = new double[n / 2];
            double[] oddReal = new double[n / 2];
            double[] oddImag = new double[n / 2];
            for (int i = 0; i < n / 2; i++) {
                evenReal[i] = real[2 * i];
                evenImag[i] = imag[2 * i];
                oddReal[i] = real[2 * i + 1];
                oddImag[i] = imag[2 * i + 1];
            }
            // Recurse on both halves
            fft(evenReal, evenImag);
            fft(oddReal, oddImag);
            // Combine the two half-size results
            for (int k = 0; k < n / 2; k++) {
                double angle = -2 * Math.PI * k / n;
                double cos = Math.cos(angle);
                double sin = Math.sin(angle);
                double tReal = cos * oddReal[k] - sin * oddImag[k];
                double tImag = sin * oddReal[k] + cos * oddImag[k];
                real[k] = evenReal[k] + tReal;
                imag[k] = evenImag[k] + tImag;
                real[k + n / 2] = evenReal[k] - tReal;
                imag[k + n / 2] = evenImag[k] - tImag;
            }
        }
    }
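A handy way to sanity-check the frequency step is to feed it a synthetic tone whose frequency you already know. The sketch below (the names `TestTone`, `sine16`, and `binWidthHz` are my own, purely for illustration) generates a sine wave as 16-bit little-endian mono PCM and computes the spacing between FFT bins, which tells you how close the reported frequency can be expected to land:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical test helper: synthesizes a known tone for checking the FFT step. */
final class TestTone {

    /** Returns sampleCount samples of a sine wave as 16-bit little-endian mono PCM. */
    static byte[] sine16(double frequencyHz, int sampleRate, int sampleCount) {
        ByteBuffer out = ByteBuffer.allocate(sampleCount * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (int n = 0; n < sampleCount; n++) {
            double value = Math.sin(2 * Math.PI * frequencyHz * n / sampleRate);
            // Scale to 80% of full scale to stay clear of clipping
            out.putShort((short) Math.round(value * 32767 * 0.8));
        }
        return out.array();
    }

    /** Spacing between adjacent FFT bins in Hz: sampleRate / fftSize. */
    static double binWidthHz(int sampleRate, int fftSize) {
        return (double) sampleRate / fftSize;
    }
}
```

For example, `TestTone.sine16(440, 44100, 32768)` yields a power-of-two number of samples, which is what a radix-2 FFT needs. Passing that array to `FFTUtils.calculateDominantFrequency(tone, 44100, 16)` should report a value within about one bin width of 440 Hz, where `binWidthHz(44100, 32768)` is roughly 1.35 Hz.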
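5. Collect the data into an ArrayList in a loop
Now loop over the audio, processing one second of samples at a time, computing the amplitude and frequency, and adding the result to the list: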
    import android.util.Log;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Locale;

    ArrayList<AudioDataPoint> audioDataList = new ArrayList<>();
    int samplesPerSecond = sampleRate * channels; // total samples per second (channels = 1 for mono)
    int bytesPerSecond = samplesPerSecond * (bitsPerSample / 8); // bytes per second of audio
    if (bytesPerSecond <= 0) {
        throw new IllegalStateException("Invalid WAV header values");
    }

    // Total duration of the audio, in seconds
    double totalDuration = (double) pcmData.length / bytesPerSecond;
    int totalSeconds = (int) Math.ceil(totalDuration);

    // Walk through the data one second at a time
    for (int second = 0; second < totalSeconds; second++) {
        int startByte = second * bytesPerSecond;
        int endByte = Math.min(startByte + bytesPerSecond, pcmData.length);
        byte[] secondChunk = Arrays.copyOfRange(pcmData, startByte, endByte);

        // Compute the amplitude and frequency for this second
        double amplitude = calculateAmplitude(secondChunk, bitsPerSample);
        double frequency = FFTUtils.calculateDominantFrequency(secondChunk, sampleRate, bitsPerSample);

        // Add the result to the list
        audioDataList.add(new AudioDataPoint(second, amplitude, frequency));
    }

    // Print the results as a quick test
    for (AudioDataPoint point : audioDataList) {
        Log.d("AudioAnalysis", String.format(Locale.US,
                "time: %.1fs, amplitude: %.2f, frequency: %.2fHz",
                point.getTimestamp(), point.getAmplitude(), point.getFrequency()));
    }
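If what you actually need is the value at one specific moment rather than a per-second table, the same idea works on a short window that starts at that moment. Here is a minimal sketch; `ChunkLocator`, `byteOffsetAt`, and `windowAt` are hypothetical names, and the window length is something you would choose yourself (a power of two such as 4096 frames suits a radix-2 FFT):

```java
import java.util.Arrays;

/** Hypothetical helper: cuts a window of PCM data starting at a given time. */
final class ChunkLocator {

    /** Byte offset of the frame at timeSeconds, aligned to a frame boundary. */
    static int byteOffsetAt(double timeSeconds, int sampleRate, int channels, int bitsPerSample) {
        int frameSize = channels * (bitsPerSample / 8); // one sample for every channel
        long frameIndex = (long) Math.floor(timeSeconds * sampleRate);
        return (int) (frameIndex * frameSize);
    }

    /** Copies up to windowFrames frames starting at timeSeconds; empty if out of range. */
    static byte[] windowAt(byte[] pcmData, double timeSeconds, int windowFrames,
                           int sampleRate, int channels, int bitsPerSample) {
        int frameSize = channels * (bitsPerSample / 8);
        int start = byteOffsetAt(timeSeconds, sampleRate, channels, bitsPerSample);
        if (timeSeconds < 0 || start >= pcmData.length) {
            return new byte[0];
        }
        // Clamp the end of the window to the end of the audio data
        long wanted = (long) start + (long) windowFrames * frameSize;
        int end = (int) Math.min(wanted, pcmData.length);
        return Arrays.copyOfRange(pcmData, start, end);
    }
}
```

The returned window can then be handed to `calculateAmplitude` and `FFTUtils.calculateDominantFrequency` exactly like a one-second chunk. A shorter window follows the moment more closely but resolves frequency more coarsely, so it is a trade-off.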
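Additional tips
- The WAV header parsing above is simplified. For production use, flesh it out so it handles multi-channel files and other bit depths;
- If performance matters, you can implement a faster FFT with the Android NDK or pull in a mature audio processing library;
- If the audio length is not a whole number of seconds, the last chunk is simply shorter than one second. The `Math.min` bound in the loop already handles this edge case, although a very short final chunk gives a coarser frequency estimate.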
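On the multi-channel point: the analysis code above treats the PCM data as a single stream of samples, so for stereo files one simple option is to average the channels into mono first. A rough sketch, assuming interleaved 16-bit little-endian frames (`Downmix` and `toMono16` are names I made up for illustration):

```java
/** Hypothetical helper: averages interleaved 16-bit little-endian frames into mono. */
final class Downmix {

    static byte[] toMono16(byte[] interleaved, int channels) {
        if (channels <= 1) {
            return interleaved.clone(); // already mono
        }
        int frameSize = channels * 2; // 2 bytes per sample, one sample per channel
        int frames = interleaved.length / frameSize;
        byte[] mono = new byte[frames * 2];
        for (int f = 0; f < frames; f++) {
            int sum = 0;
            for (int c = 0; c < channels; c++) {
                int i = f * frameSize + c * 2;
                sum += (short) ((interleaved[i] & 0xFF) | (interleaved[i + 1] << 8));
            }
            int avg = sum / channels;
            mono[2 * f] = (byte) avg;            // low byte
            mono[2 * f + 1] = (byte) (avg >> 8); // high byte
        }
        return mono;
    }
}
```

After downmixing, the rest of the pipeline should be run with a channel count of 1, so that the bytes-per-second calculation matches the new data.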
The question comes from Stack Exchange; it was asked by ankushalg.




