
Noise Suppression / Dereverberation / Howling Suppression (V3)

Last updated: 2023.08.16 11:23:13

First published: 2023.08.16 11:23:13

Introduction
  • Noise suppression (Audio Noise Suppression, ANS) removes noise in different scenarios with deep learning. Compared with traditional methods it filters noise more intelligently and more cleanly, while preserving the voice or the music background as much as possible.

  • Howling suppression (Howling Suppression): when the sound source and the amplification device are too close together, the energy becomes self-excited and howling occurs, for example when a microphone and a loudspeaker are used together and the sound played back by the speaker travels through the room back into the microphone. SAMI suppresses howling with a deep-learning-based feedback cancellation algorithm.

  • Dereverberation (Speech Dereverberation): reverberation is caused by reflections from rooms and obstacles; for example, in an empty meeting room the voice that reaches the other participants carries a reverberant tail. A deep-learning-based dereverberation algorithm is used to suppress late reverberation.

The noise-suppression, dereverberation and howling-suppression algorithms described in this document all follow the deep-learning approach above, with targeted optimizations for each scenario.

Audio samples (before → after processing)
  • Noise suppression, speech model: ans-speech-V3_input.wav (1.67 MB) → ans-speech-V3_output.wav (3.34 MB)
  • Noise suppression, music model: ans-music-v3-pre-in.mp3 (319.31 KB) → ans-music-v3-pre-out.wav (2.23 MB)
  • Dereverberation: reverb_input.wav (861.37 KB) → reverb_out.wav (861.37 KB)
  • De-howling: de_howling_in.wav (227.75 KB) → de_howling_out.wav (1.82 MB)

Technical specifications

Attribute | Supported
Sample rate | 16000 / 24000 / 44100 / 48000, etc. (resampling is built in)
Channels | 1ch / 2ch
Data format | Planar-Float
Streaming | Supported
Real-time parameter updates | Not supported
Offline / online | Offline
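
The algorithms consume planar float data, one buffer per channel. The following minimal sketch (not part of the SDK; names and scaling are illustrative) shows how interleaved 16-bit PCM could be converted into that layout before it is handed to the processing calls:

// Sketch only: de-interleave 16-bit PCM into the Planar-Float layout expected by the SDK.
#include <cstdint>
#include <vector>

std::vector<std::vector<float>> int16InterleavedToPlanarFloat(const int16_t* pcm,
                                                              int numSamples, int numChannels) {
    // one float buffer per channel, samples scaled to [-1.0, 1.0]
    std::vector<std::vector<float>> planar(numChannels, std::vector<float>(numSamples));
    for (int i = 0; i < numSamples; ++i) {
        for (int ch = 0; ch < numChannels; ++ch) {
            planar[ch][i] = pcm[i * numChannels + ch] / 32768.0f;
        }
    }
    return planar;
}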
Dependent resources

Algorithm | Resource | Description
V3 de-howling | tcnunet_denoise_espresso_44k_howling_middle_v1.4.model | 44.1k medium model, howling suppression
V3 dereverberation | ftgru_dereverb_espresso_44k_v1.8.model | 44.1k model, dereverberation
V3 noise suppression | tcnunet_denoise_espresso_44k_music_middle_v1.6.model | 44.1k medium model, music scenario (preserves music better)
V3 noise suppression | tcnunet_denoise_espresso_44k_speechpro_middle_v1.3.model | 44.1k medium model, speech scenario (removes more non-voice sound)
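
Each feature must be created with the matching identify and model file. A minimal sketch of that mapping, mirroring the table above (the directory layout under res_path is an assumption taken from the demo later in this document; it assumes "sami_core.h" and <string> are included):

// Sketch only: pick the identify and model file for a feature; paths are assumptions.
struct FeatureConfig { SAMICoreIdentify identify; std::string modelFile; };

FeatureConfig configFor(const std::string& feature, const std::string& res_path) {
    if (feature == "howling")
        return {SAMICoreIdentify_EngineExecutor_CE_HOWLING,
                res_path + "/model/denoise_v3/tcnunet_denoise_espresso_44k_howling_middle_v1.4.model"};
    if (feature == "dereverb")
        return {SAMICoreIdentify_EngineExecutor_CE_DEREVERB,
                res_path + "/model/denoise_v3/ftgru_dereverb_espresso_44k_v1.8.model"};
    if (feature == "ans_music")
        return {SAMICoreIdentify_EngineExecutor_CE_DENOISE,
                res_path + "/model/denoise_v3/tcnunet_denoise_espresso_44k_music_middle_v1.6.model"};
    // default: speech-scene noise suppression
    return {SAMICoreIdentify_EngineExecutor_CE_DENOISE,
            res_path + "/model/denoise_v3/tcnunet_denoise_espresso_44k_speechpro_middle_v1.3.model"};
}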
License guide

Uses hybrid offline/online licensing; see the license introduction for details.

C API

Header files:

#include "sami_core.h"
#include "sami_core_audio_io.h" // helper utilities: audio encoding/decoding

Integration steps:

  1. Create the algorithm handle

Function:

int SAMICoreCreateHandleByIdentify( SAMICoreHandle* handle, 
                                    SAMICoreIdentify identify, 
                                    void* param);

Purpose:
Creates the algorithm handle used by the audio-processing calls.
Parameters:

Name | Type | Description
handle | SAMICoreHandle* | Output. Receives the handle used by the subsequent calls.
identify | SAMICoreIdentify | Input. Identifies which algorithm to create. De-howling: SAMICoreIdentify_EngineExecutor_CE_HOWLING; dereverberation: SAMICoreIdentify_EngineExecutor_CE_DEREVERB; noise suppression: SAMICoreIdentify_EngineExecutor_CE_DENOISE. Note: each feature's model must be paired with the matching identify.
param | void* | Input. Generic pointer carrying the creation parameters; the concrete type differs per algorithm. Here it is SAMICoreExecutorContextCreateParameter, described below.

SAMICoreExecutorContextCreateParameter

Field | Type | Description
sampleRate | int | Input. Audio sample rate.
maxBlockSize | int | Input. Maximum number of samples per channel fed in per call; the algorithm pre-allocates memory based on this, so choose a value close to the real processing size.
numChannel | int | Input. Number of audio channels.
modelBuffer | const char* | Input. Model contents.
modelLen | int | Input. Length of the model contents.
bussinessInfo | const char* | Input. Identifies the calling business.
numAudioBuffer | int | Input. Number of input streams (not channels); fixed to 1 for noise suppression / dereverberation / de-howling.
configInfo | const char* | Input. JSON string carrying extended parameters, for example:
configInfo = R"( { "utility":"CommonUtility", "enable_stereo":true, "enable_pre_delay":true } )"

configInfo

Field | Type | Description
utility | string | Input. Always set to CommonUtility.
enable_stereo | bool | Input. Default: false. Controls whether only a single channel is processed. With enable_stereo=true, two-channel input is processed channel by channel; with enable_stereo=false, only the first channel is processed and the result is copied over the second channel, saving half the computation.
enable_pre_delay | bool | Input. Default: false. The algorithm only produces output once it has received enough data, while real-time scenarios need one block out for every block in. With enable_pre_delay=true the algorithm returns silent padding at the beginning, which makes integration easier; enabling it is recommended for RTC scenarios.

Return value
0 on success, non-zero on failure; see sami_core_error_code.h for the error codes.

Note:

SAMICoreExecutorContextCreateParameter must be zero-initialized with memset.

Example:

SAMICoreHandle handle = nullptr;
SAMICoreExecutorContextCreateParameter createParameter;
memset(&createParameter, 0, sizeof(SAMICoreExecutorContextCreateParameter));
createParameter.sampleRate = sample_rate;
createParameter.maxBlockSize = pre_define_block_size;
createParameter.numChannel = num_channels;
createParameter.modelBuffer = reinterpret_cast<char*>(modelBin.data());
createParameter.modelLen = modelBin.size();
createParameter.bussinessInfo = "denoise_v3_demo";
createParameter.numAudioBuffer = 1;
createParameter.configInfo = R"( {"utility":"CommonUtility","enable_stereo":true,
        "enable_pre_delay":true} )";
int ret = SAMICoreCreateHandleByIdentify(&handle, SAMICoreIdentify_EngineExecutor_CE_DENOISE,
                                         &createParameter);
if(ret != SAMI_OK) {
    std::cerr << "create handler failed: " << ret;
    exit(-1);
}
  2. Set the denoise strength / reset the algorithm handle

Function:

int SAMICoreSetProperty(SAMICoreHandle handle, SAMICorePropertyId id,
                                         SAMICoreProperty* inAudioProperty);

Purpose:
Sets a parameter.
Parameters:

Name | Type | Description
handle | SAMICoreHandle | Input. The handle created above.
id | SAMICorePropertyId | Input. Property id. Set the denoise strength: SAMICorePropertyID_Common_SetParam; reset the algorithm handle: SAMICorePropertyID_Common_Reset.
inAudioProperty | SAMICoreProperty* | Input. The property payload; see SAMICoreProperty below.

SAMICoreProperty

Field | Type | Description
id | SAMICorePropertyId | Input. Property id. Set the denoise strength: SAMICorePropertyID_Common_SetParam; reset the algorithm handle: SAMICorePropertyID_Common_Reset.
type | SAMICoreDataType | Input. Data type of the property.
data | void* | Input. Property contents.
dataLen | unsigned int | Input. Length of the property contents.
writable | int | Input. Reserved; can be ignored.
extraInfo | const char* | Input. Reserved; can be ignored.

Return value
0 on success, non-zero on failure; see sami_core_error_code.h for the error codes.
Examples:

  • Set the denoise strength

    In some cases a more natural result is preferred and the noise should not be removed completely, so a denoise ratio can be set. It can only be used before the processing stage (process). A small reusable helper is sketched after this list.

    SAMICoreProperty coreProperty;
    memset(&coreProperty, 0, sizeof(SAMICoreProperty));
    std::string algParamStr = R"( {"type":"alg_param","param": {"speech_ratio": 1.0} } )";
    coreProperty.id = SAMICorePropertyID_Common_SetParam;
    coreProperty.type = SAMICoreDataType_String;
    coreProperty.data = (void*)algParamStr.c_str();
    coreProperty.dataLen = algParamStr.length();
    coreProperty.writable = 0;
    ret = SAMICoreSetProperty(handle, SAMICorePropertyID_Common_SetParam, &coreProperty);
    if(ret != SAMI_OK) {
        std::cerr << "set property error: " << ret << std::endl;
        exit(0);
    }
    

    SAMICoreProperty::data format

    {
        "type":"alg_param",
        "param":{
            "speech_ratio":1
        }
    }
    

    Parameters

    Name | Type | Description
    speech_ratio | float | Range: 0.0 <= x <= 1.0; default 1.0 (maximum denoising).

    Note:

    The denoise strength only affects the denoise models; it has no effect on dereverberation or de-howling.

  • Reset the algorithm handle

    After one piece of audio has been processed, reset the handle before processing a new one to clear the algorithm's internal state (including the denoise strength set above, which then needs to be set again).

    SAMICoreProperty resetProperty;
    memset(&resetProperty, 0, sizeof(SAMICoreProperty));
    resetProperty.id = SAMICorePropertyID_Common_Reset;
    resetProperty.type = SAMICoreDataType_Null;
    SAMICoreSetProperty(handle, SAMICorePropertyID_Common_Reset, &resetProperty);
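
    A minimal helper sketch (an assumption, not an SDK API) that wraps SAMICoreSetProperty to set the denoise ratio before processing starts; it assumes <string> and <cstring> are included and only has an effect on the denoise models:

    int setSpeechRatio(SAMICoreHandle handle, float ratio) {   // ratio in [0.0, 1.0]
        // build the alg_param JSON shown above for the requested ratio
        std::string json = R"({"type":"alg_param","param":{"speech_ratio":)" + std::to_string(ratio) + "}}";
        SAMICoreProperty prop;
        memset(&prop, 0, sizeof(SAMICoreProperty));
        prop.id = SAMICorePropertyID_Common_SetParam;
        prop.type = SAMICoreDataType_String;
        prop.data = (void*)json.c_str();
        prop.dataLen = json.length();
        return SAMICoreSetProperty(handle, SAMICorePropertyID_Common_SetParam, &prop);
    }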
    

  3. Process audio data

Function:

int SAMICoreProcess(SAMICoreHandle handle, 
                    SAMICoreBlock* inBlock, 
                    SAMICoreBlock* outBlock);

Purpose:
Processes audio data and returns the processed audio.
Parameters:

Name | Type | Description
handle | SAMICoreHandle | Input. The handle created above.
in_block | SAMICoreBlock* | Input. Carries the audio data to be processed.
out_block | SAMICoreBlock* | Output. Carries the audio data returned by the algorithm.

SAMICoreBlock

Field | Type | Description
dataType | SAMICoreDataType | Input. Type of the data below; for the in_block here it is SAMICoreDataType_AudioBuffer.
numberAudioData | unsigned int | Input. Number of items in the next field, data (not a byte count); set to 1 for this algorithm.
data | void* | Input. The actual payload, interpreted according to the type above; here it is SAMICoreAudioBuffer.

SAMICoreAudioBuffer

Field | Type | Description
numberChannels | unsigned int | Input. Number of channels.
numberSamples | unsigned int | Input. Number of samples per channel.
isInterleave | int | Input. Whether multi-channel data is stored interleaved; 0: planar, 1: interleaved.
data | float** | Input/output. The input and output audio samples.

Return value
See sami_core_error_code.h for the error codes.

Return value | Meaning
SAMI_OK | Success
SAMI_ENGINE_INPUT_NEED_MORE_DATA | Not enough input for the algorithm yet; keep feeding data and output will follow.
Other | Failure

Example:

//6.init process buffer
SAMICoreAudioBuffer in_audio_buffer;
in_audio_buffer.numberChannels = num_channels;
in_audio_buffer.numberSamples = max_block_size;
in_audio_buffer.data = new float*[num_channels];
in_audio_buffer.isInterleave = 0;

SAMICoreAudioBuffer out_audio_buffer;
out_audio_buffer.numberChannels = num_channels;
out_audio_buffer.numberSamples = max_block_size;
out_audio_buffer.data = new float*[num_channels];
out_audio_buffer.isInterleave = 0;

for(int c = 0; c < int(num_channels); ++c) {
    in_audio_buffer.data[c] = new float[max_block_size];
    out_audio_buffer.data[c] = new float[max_block_size];
}

SAMICoreBlock in_block;
memset(&in_block, 0, sizeof(SAMICoreBlock));
in_block.numberAudioData = 1;
in_block.dataType = SAMICoreDataType::SAMICoreDataType_AudioBuffer;
in_block.audioData = &in_audio_buffer;

SAMICoreBlock out_block;
memset(&out_block, 0, sizeof(SAMICoreBlock));
out_block.numberAudioData = 1;
out_block.dataType = SAMICoreDataType::SAMICoreDataType_AudioBuffer;
out_block.audioData = &out_audio_buffer;

//7. process
cout << "process start" << endl;
int process_num_frame = max_block_size;
do {
    //read from file
    int real_num_frame = SAMICoreFileSourceRead(fileSource, in_f32_buffer.data(), process_num_frame);
    if (real_num_frame <= 0 ){
      break;
    }

    in_audio_buffer.numberSamples = real_num_frame;
    out_audio_buffer.numberSamples = real_num_frame;

    //Interleave to Planar
    interleaveToPlanarFloat(in_f32_buffer.data(),in_audio_buffer.data, real_num_frame, num_channels);

    //process
    ret = SAMICoreProcess(handle, &in_block, &out_block);
    if(ret != SAMI_OK) {
        if(ret == SAMI_ENGINE_INPUT_NEED_MORE_DATA) {
            cout << "SAMI_ENGINE_INPUT_NEED_MORE_DATA,input len:" << in_audio_buffer.numberSamples  << endl;
            continue;
        } else {
            res_str = "SAMICoreProcess error:" + to_string(ret) ;
            cout << res_str << endl;
            break;
        }
    } else {
        if(out_audio_buffer.numberSamples > 0) {
            int write_size = SAMICoreAudioEncoderWritePlanarData(audioEncoder, out_audio_buffer.data, num_channels, out_audio_buffer.numberSamples);
            if (write_size<out_audio_buffer.numberSamples){
              res_str = "SAMICoreAudioEncoderWritePlanarData write error";
              cout << res_str << endl;
              break;
            }
        }
    }
}while (true);
  4. Get the delay / flush the tail data

Function:

int SAMICoreGetPropertyById(SAMICoreHandle handle, SAMICorePropertyId id,
                                                 SAMICoreProperty* outAudioProperty);

Purpose:
Gets the contents of the specified property.
Parameters:

Name | Type | Description
handle | SAMICoreHandle | Input. The handle created above.
id | SAMICorePropertyId | Input. Property id.
outAudioProperty | SAMICoreProperty* | Output. The property contents; see SAMICoreProperty below.

SAMICoreProperty

Field | Type | Description
id | SAMICorePropertyId | Input. Property id.
type | SAMICoreDataType | Input. Data type of the property to fetch.
data | void* | Output. Property contents.
dataLen | unsigned int | Output. Length of the property contents.
writable | int | Input. Reserved; can be ignored.
extraInfo | const char* | Input. Extra query information; see the examples.

Return value
0 on success, non-zero on failure; see sami_core_error_code.h for the error codes.
Function:

int SAMICoreDestroyProperty(SAMICoreProperty* parameter);

Purpose:
Releases the resources associated with a fetched property.
Parameters:

Name | Type | Description
parameter | SAMICoreProperty* | Input. The SAMICoreProperty previously filled by a get call.

Return value
0 on success, non-zero on failure; see sami_core_error_code.h for the error codes.
Examples:

  • Get the delay:

    After the handle has been created with enable_pre_delay=true, the algorithm prepends silent padding when processing starts; the call below returns exactly how many silent samples were added. It can be used any time after "SAMICoreCreateHandleByIdentify". A sketch of trimming this padding from the output is shown after this list.

    SAMICoreProperty delayProperty;
    memset(&delayProperty, 0, sizeof(SAMICoreProperty));
    delayProperty.id = SAMICorePropertyID_Common_GetParam;
    delayProperty.type = SAMICoreDataType_String;
    delayProperty.writable = 1;
    delayProperty.extraInfo = R"({"type":"business_param", "param":{"name":"delay_wait_samples"} })";
    ret = SAMICoreGetPropertyById(handle, SAMICorePropertyID_Common_GetParam, &delayProperty);
    if (ret != SAMI_OK){
        cout << "get property fail,ret:" << ret << endl;
        res_str = "get property fail:"+ to_string(ret);
        return res_str;
    }else{
        cout << "get property success" << endl;
    }
    //e.g. "{\"delay_wait_samples\":2717.0}"; parse it with a JSON library, see the demo
    cout << (const char*)delayProperty.data << endl;
    SAMICoreDestroyProperty(&delayProperty);
    

    SAMICoreProperty::extraInfo format

    {
        "type":"business_param",
        "param":{
            "name":"delay_wait_samples"
        }
    }
    

    Parameters

    Name | Type | Description
    name | String | delay_wait_samples: returns how many silent samples the algorithm pads at the beginning of the output.

  • Flush the tail data:

    The algorithm buffers part of the data internally, so once all audio has been fed in, call this interface to retrieve everything that is still buffered. Streaming clients can ignore this interface. It is used after the processing stage.

    SAMICoreProperty flushProperty;
    memset(&flushProperty, 0, sizeof(SAMICoreProperty));
    flushProperty.type = SAMICoreDataType_AudioBuffer;
    SAMICoreGetPropertyById(handle, SAMICorePropertyID_Common_Flush, &flushProperty);
    if(flushProperty.dataLen > 0 && flushProperty.data) {
        SAMICoreAudioBuffer* bufferArray = (SAMICoreAudioBuffer*)flushProperty.data;
        if(bufferArray[0].data && bufferArray[0].numberSamples > 0) {
            //save the audio
            int write_size = SAMICoreAudioEncoderWritePlanarData(audioEncoder, bufferArray[0].data, num_channels, bufferArray[0].numberSamples);
            if (write_size<bufferArray[0].numberSamples){
              res_str = "SAMICoreAudioEncoderWritePlanarData write error";
              cout << res_str << endl;
            }
        }
    }
    SAMICoreDestroyProperty(&flushProperty);
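
    The SDK reports delay_wait_samples but does not remove the padded silence for you. A minimal sketch (an assumption, not an SDK API) of skipping those samples before writing the processed output, reusing the encoder helper from the examples above:

    #include <algorithm>
    #include <vector>

    // Call on every processed block; samplesLeftToSkip starts at delay_wait_samples and
    // counts the remaining silent samples (per channel) still to be discarded.
    int writeWithoutLeadingSilence(SAMICoreAudioEncoder encoder, const SAMICoreAudioBuffer& out,
                                   int numChannels, int& samplesLeftToSkip) {
        unsigned int total = out.numberSamples;
        unsigned int skip = samplesLeftToSkip > 0 ? std::min<unsigned int>(samplesLeftToSkip, total) : 0;
        samplesLeftToSkip -= (int)skip;
        if (skip >= total) return 0;   // the whole block is still padded silence

        // advance each channel pointer past the silent samples, then write the remainder
        std::vector<float*> shifted(numChannels);
        for (int c = 0; c < numChannels; ++c) shifted[c] = out.data[c] + skip;
        return SAMICoreAudioEncoderWritePlanarData(encoder, shifted.data(), numChannels, total - skip);
    }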
    

  5. Destroy the algorithm handle

Function:

int SAMICoreDestroyHandle(SAMICoreHandle handle);

Purpose:
Destroys the handle.
Parameters:

Name | Type | Description
handle | SAMICoreHandle | Input. The handle created above.

Return value
0 on success, non-zero on failure; see sami_core_error_code.h for the error codes.
Example:

ret = SAMICoreDestroyHandle(handle);
if(ret!=SAMI_OK){
    res_str = "SAMICoreDestroyHandle error:" + to_string(ret) ;
    cout << res_str << endl;
}

Full example

//
// Created by cmj on 2021/9/28.
//
#include <string>
#include <vector>
#include <iostream>
#include <fstream>
#include <cstring>
#include <algorithm>
#include "sami_core.h"
#include "sami_core_audio_io.h"

using namespace std;

//helper functions
void interleaveToPlanarFloat(const float* source, float** destination, int num_samples, int channels){
    for (int i = 0; i < num_samples; ++i) {
        for (int j = 0; j < channels; ++j) {
            destination[j][i] = source[i * channels + j];
        }
    }
}

std::vector<uint8_t> loadModelAsBinary(const std::string& path) {
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file.is_open()){
        return {};
    }
    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);
    std::vector<uint8_t> buffer(size);
    if(file.read((char*)buffer.data(), size)) { return buffer; }
    return {};
}

/**
 * @brief denoise_v3
 * @param input_file : input file
 * @param output_file : output file
 * @param res_path : model path
 * @param enable_pre_delay : The algorithm only produces output once it has received enough data; real-time
 *                           scenarios need one block out for every block in. With enable_pre_delay=true it
 *                           returns silent padding at the beginning, which makes integration easier.
 * @return "OK" on success
 */
string denoise_v3_fun(string func_id,string input_file,string output_file,string res_path,bool enable_pre_delay){
    string res_str = "OK";
    cout << "func_id:" << func_id << endl;
    cout << "input_file:" << input_file << endl;
    cout << "output_file:" << output_file << endl;
    cout << "res_path:" << res_path << endl;

    //1. init input
    SAMICoreFileSource fileSource = nullptr;
    int ret = SAMICoreFileSourceCreate(&fileSource, input_file.data());
    if(ret != SAMI_OK) {
      res_str = "open input_file error";
      return res_str;
    }

    size_t num_channels = SAMICoreFileSourceGetNumChannel(fileSource);
    size_t sample_rate = SAMICoreFileSourceGetSampleRate(fileSource);
    cout << "ch:" << num_channels <<endl;
    cout << "sample rate:" << sample_rate << endl;

    //2.init output
    SAMICoreAudioEncoderSettings settings;
    memset(&settings, 0, sizeof(SAMICoreAudioEncoderSettings));
    settings.format = SAMICoreAudioEncoderFormat::kWav_F32;
    settings.acc = SAMICoreAudioEncoderAcceleration::kSoftware;
    settings.threading = SAMICoreAudioEncoderThreading::kSingleThreaded;
    settings.num_threads = 0;
    SAMICoreAudioEncoder audioEncoder;
    ret = SAMICoreAudioEncoderCreate(&audioEncoder, &settings);
    if(ret != SAMI_OK) {
      res_str = "SAMICoreAudioEncoderCreate error";
      SAMICoreFileSourceDestory(fileSource);
      return res_str;
    }
    ret = SAMICoreAudioEncoderOpen(audioEncoder, output_file.data(), sample_rate, num_channels, 128);
    if(ret != SAMI_OK) {
      res_str = "SAMICoreAudioEncoderOpen error";
      SAMICoreFileSourceDestory(fileSource);
      SAMICoreAudioEncoderDestory(audioEncoder);
      return res_str;
    }

    const int max_block_size = sample_rate/100; //10ms
    size_t process_num_frame = max_block_size;

    int f32_buf_num = process_num_frame * num_channels;
    vector<float> in_f32_buffer;
    in_f32_buffer.resize(f32_buf_num);

    //3. load model and init handle
    std::string model_path;
    SAMICoreIdentify identify;
    if(func_id == "DENOISE_V3_HOWLING_MIDDLE") {
        identify = SAMICoreIdentify_EngineExecutor_CE_HOWLING;
        model_path = res_path + "/model/denoise_v3/tcnunet_denoise_espresso_44k_howling_middle_v1.4.model";
    } else if(func_id == "DENOISE_V3_MUSIC_MIDDLE") {
        identify = SAMICoreIdentify_EngineExecutor_CE_DENOISE;
        model_path = res_path + "/model/denoise_v3/tcnunet_denoise_espresso_44k_music_middle_v1.6.model";
    }  else if(func_id == "DENOISE_V3_DEREVERB") {
        identify = SAMICoreIdentify_EngineExecutor_CE_DEREVERB;
        model_path = res_path + "/model/denoise_v3/ftgru_dereverb_espresso_44k_v1.8.model";
    } else if(func_id == "DENOISE_V3_SPEECHIM_PRO") {
        identify = SAMICoreIdentify_EngineExecutor_CE_DENOISE;
        model_path = res_path + "/model/denoise_v3/tcnunet_denoise_espresso_44k_speechpro_middle_v1.3.model";
    }

    std::vector<uint8_t> model_buf = loadModelAsBinary(model_path);
    if (model_buf.empty()){
        SAMICoreFileSourceDestory(fileSource);
        SAMICoreAudioEncoderClose(audioEncoder);
        SAMICoreAudioEncoderDestory(audioEncoder);
        cout << "open model file error:" << model_path << endl;
        res_str = "open model file error";
        return res_str;
    }else{
        cout << "open model success,module size:" << model_buf.size() << endl;
    }

    // create handle
    SAMICoreHandle handle = nullptr;
    SAMICoreExecutorContextCreateParameter createParameter;
    memset(&createParameter, 0, sizeof(SAMICoreExecutorContextCreateParameter));
    createParameter.sampleRate = sample_rate;
    createParameter.maxBlockSize = max_block_size;
    createParameter.numChannel = num_channels;
    createParameter.modelBuffer = reinterpret_cast<char*>(model_buf.data());
    createParameter.modelLen = model_buf.size();
    createParameter.bussinessInfo = "denoise_v3_demo";
    createParameter.numAudioBuffer = 1;
    if (enable_pre_delay){
        /*
         * enable_pre_delay: the algorithm only produces output once it has received enough data; real-time
         * scenarios need one block out for every block in. With enable_pre_delay=true it returns silent
         * padding at the beginning, which makes integration easier.
         */
        createParameter.configInfo = R"( {"utility":"CommonUtility","enable_stereo":true,
                                                                    "enable_pre_delay":true} )";
    }else{
        createParameter.configInfo = R"( {"utility":"CommonUtility","enable_stereo":true} )";
    }
    ret = SAMICoreCreateHandleByIdentify(&handle, identify, &createParameter);
    if (ret != SAMI_OK){
        SAMICoreFileSourceDestory(fileSource);
        SAMICoreAudioEncoderClose(audioEncoder);
        SAMICoreAudioEncoderDestory(audioEncoder);
        cout << "create handle fail,ret:" << ret << endl;
        res_str = "create handle fail:"+ to_string(ret);
        return res_str;
    }else{
        cout << "create handle success" << endl;
    }

    //4.set speech_ratio
    SAMICoreProperty coreProperty;
    memset(&coreProperty, 0, sizeof(SAMICoreProperty));
    std::string algParamStr = R"( {"type":"alg_param","param": {"speech_ratio": 1.0} } )";
    coreProperty.id = SAMICorePropertyID_Common_SetParam;
    coreProperty.type = SAMICoreDataType_String;
    coreProperty.data = (void*)algParamStr.c_str();
    coreProperty.dataLen = algParamStr.length();
    coreProperty.writable = 0;
    ret = SAMICoreSetProperty(handle, SAMICorePropertyID_Common_SetParam, &coreProperty);
    if (ret != SAMI_OK){
        cout << "set property fail,ret:" << ret << endl;
        res_str = "set property fail:"+ to_string(ret);
        SAMICoreFileSourceDestory(fileSource);
        SAMICoreAudioEncoderClose(audioEncoder);
        SAMICoreAudioEncoderDestory(audioEncoder);
        return res_str;
    }else{
        cout << "set property success" << endl;
    }

    //5.get delay info
    if (enable_pre_delay){
        SAMICoreProperty delayProperty;
        memset(&delayProperty, 0, sizeof(SAMICoreProperty));
        delayProperty.id = SAMICorePropertyID_Common_GetParam;
        delayProperty.type = SAMICoreDataType_String;
        delayProperty.writable = 1;
        delayProperty.extraInfo = R"({"type":"business_param", "param":{"name":"delay_wait_samples"} })";
        ret = SAMICoreGetPropertyById(handle, SAMICorePropertyID_Common_GetParam, &delayProperty);
        if (ret != SAMI_OK){
          cout << "get property fail,ret:" << ret << endl;
          res_str = "get property fail:"+ to_string(ret);
          SAMICoreFileSourceDestory(fileSource);
          SAMICoreAudioEncoderClose(audioEncoder);
          SAMICoreAudioEncoderDestory(audioEncoder);
          return res_str;
        }else{
          cout << "get property success" << endl;
        }
        //e.g. "{\"delay_wait_samples\":2717.0}"; parse it with a JSON library, see the demo
        cout << (const char*)delayProperty.data << endl;
        SAMICoreDestroyProperty(&delayProperty);
    }

    //6.init process buffer
    SAMICoreAudioBuffer in_audio_buffer;
    in_audio_buffer.numberChannels = num_channels;
    in_audio_buffer.numberSamples = max_block_size;
    in_audio_buffer.data = new float*[num_channels];
    in_audio_buffer.isInterleave = 0;

    SAMICoreAudioBuffer out_audio_buffer;
    out_audio_buffer.numberChannels = num_channels;
    out_audio_buffer.numberSamples = max_block_size;
    out_audio_buffer.data = new float*[num_channels];
    out_audio_buffer.isInterleave = 0;

    for(int c = 0; c < int(num_channels); ++c) {
        in_audio_buffer.data[c] = new float[max_block_size];
        out_audio_buffer.data[c] = new float[max_block_size];
    }

    SAMICoreBlock in_block;
    memset(&in_block, 0, sizeof(SAMICoreBlock));
    in_block.numberAudioData = 1;
    in_block.dataType = SAMICoreDataType::SAMICoreDataType_AudioBuffer;
    in_block.audioData = &in_audio_buffer;

    SAMICoreBlock out_block;
    memset(&out_block, 0, sizeof(SAMICoreBlock));
    out_block.numberAudioData = 1;
    out_block.dataType = SAMICoreDataType::SAMICoreDataType_AudioBuffer;
    out_block.audioData = &out_audio_buffer;

    //7. process
    cout << "process start" << endl;
    do {
        //read from file
        int real_num_frame = SAMICoreFileSourceRead(fileSource, in_f32_buffer.data(), process_num_frame);
        if (real_num_frame <= 0 ){
          break;
        }
        
        in_audio_buffer.numberSamples = real_num_frame;
        out_audio_buffer.numberSamples = real_num_frame;

        //Interleave to Planar
        interleaveToPlanarFloat(in_f32_buffer.data(),in_audio_buffer.data, real_num_frame, num_channels);

        //process
        ret = SAMICoreProcess(handle, &in_block, &out_block);
        if(ret != SAMI_OK) {
            if(ret == SAMI_ENGINE_INPUT_NEED_MORE_DATA) {
                cout << "SAMI_ENGINE_INPUT_NEED_MORE_DATA,input len:" << in_audio_buffer.numberSamples  << endl;
                continue;
            } else {
                res_str = "SAMICoreProcess error:" + to_string(ret) ;
                cout << res_str << endl;
                break;
            }
        } else {
            if(out_audio_buffer.numberSamples > 0) {
                int write_size = SAMICoreAudioEncoderWritePlanarData(audioEncoder, out_audio_buffer.data, num_channels, out_audio_buffer.numberSamples);
                if (write_size<out_audio_buffer.numberSamples){
                  res_str = "SAMICoreAudioEncoderWritePlanarData write error";
                  cout << res_str << endl;
                  break;
                }
            }
        }

   }while (true);

    //8. flush
    if (!enable_pre_delay){
        SAMICoreProperty flushProperty;
        memset(&flushProperty, 0, sizeof(SAMICoreProperty));
        flushProperty.type = SAMICoreDataType_AudioBuffer;
        SAMICoreGetPropertyById(handle, SAMICorePropertyID_Common_Flush, &flushProperty);
        if(flushProperty.dataLen > 0 && flushProperty.data) {
            SAMICoreAudioBuffer* bufferArray = (SAMICoreAudioBuffer*)flushProperty.data;
            if(bufferArray[0].data && bufferArray[0].numberSamples > 0) {
                int write_size = SAMICoreAudioEncoderWritePlanarData(audioEncoder, bufferArray[0].data, num_channels, bufferArray[0].numberSamples);
                if (write_size < bufferArray[0].numberSamples){
                  res_str = "SAMICoreAudioEncoderWritePlanarData write error";
                  cout << res_str << endl;
                }
            }
        }
        SAMICoreDestroyProperty(&flushProperty);
    }

    //9. release
    cout << "release" << endl;
    SAMICoreFileSourceDestory(fileSource);
    SAMICoreAudioEncoderClose(audioEncoder);
    SAMICoreAudioEncoderDestory(audioEncoder);

    for(int c = 0; c < int(num_channels); ++c) {
        delete[] in_audio_buffer.data[c];
        delete[] out_audio_buffer.data[c];
    }

    delete[] in_audio_buffer.data;
    delete[] out_audio_buffer.data;
    cout << "release buffer success" << endl;
    ret = SAMICoreDestroyHandle(handle);
    handle = nullptr;
    if(ret!=SAMI_OK){
        res_str = "SAMICoreDestroyHandle error:" + to_string(ret) ;
        cout << res_str << endl;
    }

    return res_str;
}
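
A hypothetical invocation of the demo function above (file names and the resource path are placeholders, not part of the SDK):

int main() {
    // func_id selects the model/identify pair; see the mapping inside denoise_v3_fun
    std::string result = denoise_v3_fun("DENOISE_V3_SPEECHIM_PRO",
                                        "./input.wav", "./output.wav",
                                        "./resources", /*enable_pre_delay=*/false);
    std::cout << "result: " << result << std::endl;
    return result == "OK" ? 0 : 1;
}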

Java API

Imported classes:

import com.mammon.audiosdk.SAMICore;
import com.mammon.audiosdk.SAMICoreCode;
import com.mammon.audiosdk.enums.SAMICoreDataType;
import com.mammon.audiosdk.enums.SAMICoreIdentify;
import com.mammon.audiosdk.enums.SAMICorePropertyId;
import com.mammon.audiosdk.structures.SAMICoreAudioBuffer;
import com.mammon.audiosdk.structures.SAMICoreAudioEncoderSettings;
import com.mammon.audiosdk.structures.SAMICoreBlock;
import com.mammon.audiosdk.structures.SAMICoreDebugConfig;
import com.mammon.audiosdk.structures.SAMICoreExecutorContextCreateParameter;
import com.mammon.audiosdk.structures.SAMICoreProperty;
//helper utilities: audio encoding/decoding
import com.mammon.audiosdk.SAMICoreIo;
import com.mammon.audiosdk.enums.SAMICoreAudioEncoderAcceleration;
import com.mammon.audiosdk.enums.SAMICoreAudioEncoderFormat;
import com.mammon.audiosdk.enums.SAMICoreAudioEncoderThreading;

Integration steps:

  1. Create the algorithm handle

Function:

class SAMICore{
    public int SAMICoreCreateHandleByIdentify(SAMICoreIdentify identify, Object param);
}

Purpose:
Creates the algorithm handle.
Parameters:

Name | Type | Description
identify | SAMICoreIdentify | Input. Identifies which algorithm to create. De-howling: SAMICoreIdentify_EngineExecutor_CE_HOWLING; dereverberation: SAMICoreIdentify_EngineExecutor_CE_DEREVERB; noise suppression: SAMICoreIdentify_EngineExecutor_CE_DENOISE. Note: each feature's model must be paired with the matching identify.
param | Object | Input. Generic parameter carrying the creation parameters; the concrete type differs per algorithm. Here it is SAMICoreExecutorContextCreateParameter, described below.

SAMICoreExecutorContextCreateParameter

Field | Type | Description
sampleRate | int | Input. Audio sample rate.
maxBlockSize | int | Input. Maximum number of samples per channel fed in per call; the algorithm pre-allocates memory based on this, so choose a value close to the real processing size.
numChannel | int | Input. Number of audio channels.
modelBuffer | byte[] | Input. Model contents.
modelLen | int | Input. Length of the model contents.
bussinessInfo | String | Input. Identifies the calling business.
numberAudioData | int | Input. Number of input streams (not channels); fixed to 1 for noise suppression / dereverberation / de-howling (default 1).
configInfo | String | Input. JSON string carrying extended parameters, for example: { "utility":"CommonUtility", "enable_stereo":true, "enable_pre_delay":true }

configInfo

Field | Type | Description
utility | string | Input. Always set to CommonUtility.
enable_stereo | bool | Input. Default: false. With enable_stereo=true, two-channel input is processed channel by channel; with enable_stereo=false, only the first channel is processed and the result is copied over the second channel, saving half the computation.
enable_pre_delay | bool | Input. Default: false. The algorithm only produces output once it has received enough data, while real-time scenarios need one block out for every block in. With enable_pre_delay=true the algorithm returns silent padding at the beginning, which makes integration easier; enabling it is recommended for RTC scenarios.

Return value
See com.mammon.audiosdk.SAMICoreCode for the error codes.

Return value | Meaning
SAMI_OK | Success
Other | Failure

Example:

//1. create handle
//The identify and the model must match. Visit the https://www.volcengine.com/docs/6489/192740 for the correspondence
String modelFileName = "";
SAMICoreIdentify identify = null;
if (function == Function.ANS_SPEECH){
    modelFileName = "model/denoise_v3/tcnunet_denoise_espresso_44k_speechpro_middle_v1.3.model";
    identify = SAMICoreIdentify.SAMICoreIdentify_EngineExecutor_CE_DENOISE;
}else if(function == Function.ANS_MUSIC){
    modelFileName = "model/denoise_v3/tcnunet_denoise_espresso_44k_music_middle_v1.6.model";
    identify = SAMICoreIdentify.SAMICoreIdentify_EngineExecutor_CE_DENOISE;
}else if (function == Function.DEREVERB){
    modelFileName = "model/denoise_v3/ftgru_dereverb_espresso_44k_v1.8.model";
    identify = SAMICoreIdentify.SAMICoreIdentify_EngineExecutor_CE_DEREVERB;
}else if (function == Function.DEHOWLING){
    modelFileName = "model/denoise_v3/tcnunet_denoise_espresso_44k_howling_middle_v1.4.model";
    identify = SAMICoreIdentify.SAMICoreIdentify_EngineExecutor_CE_HOWLING;
}

samiCore = new SAMICore();
SAMICoreExecutorContextCreateParameter parameter = new SAMICoreExecutorContextCreateParameter();
parameter.sampleRate = sample_rate;
parameter.numChannel = num_channel;
parameter.maxBlockSize = maxBlockSize;
parameter.modelBuffer = FunctionHelper.readBinaryFile(modelFileName,context);
parameter.modelLen = parameter.modelBuffer.length;
parameter.bussinessInfo= "sami audio demo";
parameter.numberAudioData = 1;
if (enablePreDelay){
    parameter.configInfo="{\"utility\":\"CommonUtility\"," +
                                    "\"enable_stereo\":true" +
                                    "\"enable_pre_delay\":true\"}";
}else{
    parameter.configInfo="{\"utility\":\"CommonUtility\",\"enable_stereo\":true}";
}
int ret = samiCore.SAMICoreCreateHandleByIdentify(identify, parameter);
if (ret != SAMICoreCode.SAMI_OK) {
    return "SAMICoreCreateHandleByIdentify error:"+ret;
}
  2. Set the denoise strength / reset the algorithm handle

Function:

class SAMICore{
    public int SAMICoreSetProperty(SAMICorePropertyId id, SAMICoreProperty inAudioProperty);
}

Purpose:
Sets a parameter.
Parameters:

Name | Type | Description
id | SAMICorePropertyId | Input. Property id.
inAudioProperty | SAMICoreProperty | Input. The property payload; see SAMICoreProperty below.

SAMICoreProperty

Field | Type | Description
id | SAMICorePropertyId | Input. Property id.
type | SAMICoreDataType | Input. Data type of the property.
dataObjectArray | Object[] | Input. Property contents.
dataArrayLen | int | Input. Number of objects in the property contents.
dataByteArray | byte[] | Input. Reserved; can be ignored.
writable | int | Input. Reserved; can be ignored.
extraInfo | String | Input. Reserved; can be ignored.

Return value
See com.mammon.audiosdk.SAMICoreCode for the error codes.

Return value | Meaning
SAMI_OK | Success
Other | Failure

Example:

  • Set the denoise strength

    In some cases a more natural result is preferred and the noise should not be removed completely; the denoise ratio can be set through this interface. Use it only before the processing stage.

public int setSpeechRatio(float data){
  SAMICoreProperty samiCoreProperty = new SAMICoreProperty();
  samiCoreProperty.id = SAMICorePropertyId.SAMICorePropertyID_Common_SetParam;
  samiCoreProperty.type = SAMICoreDataType.SAMICoreDataType_String;
  samiCoreProperty.dataObjectArray = new Object[1];
  String rateStr="{\"type\":\"alg_param\",\"param\":{\"speech_ratio\":"+Float.toString(data)+"}}";
  samiCoreProperty.dataObjectArray[0] = rateStr;
  samiCoreProperty.dataArrayLen = 1;
  int ret = samiCore.SAMICoreSetProperty(SAMICorePropertyId.SAMICorePropertyID_Common_SetParam,samiCoreProperty);
  if (ret != SAMICoreCode.SAMI_OK) {
    System.out.println("SAMICoreSetProperty failed, ret " + ret);
    return ret;
  }
  return SAMICoreCode.SAMI_OK;
}

SAMICoreProperty::dataObjectArray[0] format

{
    "type":"alg_param",
    "param":{
        "speech_ratio":1
    }
}

Parameters

Name | Type | Description
speech_ratio | float | Range: 0.0 <= x <= 1.0; default 1.0 (maximum denoising).

Note:

The denoise strength only affects the denoise models; it has no effect on dereverberation or de-howling.

  • Reset the algorithm handle

    After one piece of audio has been processed, reset the handle before processing a new one to clear the algorithm's internal state.

SAMICoreProperty resetProperty = new SAMICoreProperty();
resetProperty.id = SAMICorePropertyId.SAMICorePropertyID_Common_Reset;
resetProperty.type = SAMICoreDataType.SAMICoreDataType_Null;
int ret = samiCore.SAMICoreSetProperty(SAMICorePropertyId.SAMICorePropertyID_Common_Reset, resetProperty);
if (ret != SAMICoreCode.SAMI_OK) {
  System.out.println("SAMICoreSetProperty failed, ret " + ret);
  return ret;
}

  3. Process audio data

class SAMICore{
    public int SAMICoreProcess(SAMICoreBlock inBlock, SAMICoreBlock outBlock)
}

Purpose:
Processes audio data and returns the processed audio.
Parameters:

Name | Type | Description
in_block | SAMICoreBlock | Input. Carries the audio data to be processed.
out_block | SAMICoreBlock | Output. Carries the audio data returned by the algorithm.

SAMICoreBlock

Field | Type | Description
dataType | SAMICoreDataType | Input. Type of the data below; for the in_block here it is SAMICoreDataType.SAMICoreDataType_AudioBuffer.
audioData | Object[] | Input. The actual payload; cast it according to the type above when reading it.
numberAudioData | int | Input. Number of items in audioData; set to 1 for this algorithm.

SAMICoreAudioBuffer

Field | Type | Description
numberChannels | int | Input. Number of channels.
numberSamples | int | Input. Number of samples per channel.
isInterleave | int | Input. Whether multi-channel data is stored interleaved; only the planar format is supported, so set it to 0.
data | float[][] | Input/output. The algorithm's input and output audio samples.

Return value
See com.mammon.audiosdk.SAMICoreCode for the error codes.

Return value | Meaning
SAMI_OK | Success
SAMI_ENGINE_INPUT_NEED_MORE_DATA | Not enough input for the algorithm yet; keep feeding data and output will follow.
Other | Failure

Example:

private SAMICore samiCore;
private SAMICoreBlock inBlock;
private SAMICoreBlock outBlock;
private SAMICoreAudioBuffer inAudioBuffer;
private SAMICoreAudioBuffer outAudioBuffer;

//7. init the buffers needed for processing
inAudioBuffer = new SAMICoreAudioBuffer();
inAudioBuffer.numberChannels = num_channel;
inAudioBuffer.numberSamples = maxBlockSize;
inAudioBuffer.data = new float[num_channel][maxBlockSize];

outAudioBuffer = new SAMICoreAudioBuffer();
outAudioBuffer.numberChannels = num_channel;
outAudioBuffer.numberSamples = maxBlockSize;
outAudioBuffer.data = new float[num_channel][maxBlockSize];

inBlock = new SAMICoreBlock();
inBlock.dataType = SAMICoreDataType.SAMICoreDataType_AudioBuffer;
inBlock.audioData = new SAMICoreAudioBuffer[1];
inBlock.audioData[0] = inAudioBuffer;

outBlock = new SAMICoreBlock();
outBlock.dataType = SAMICoreDataType.SAMICoreDataType_AudioBuffer;
outBlock.audioData = new SAMICoreAudioBuffer[1];
outBlock.audioData[0] = outAudioBuffer;

float[] f32_buffer_interleave_in = new float[ maxBlockSize * num_channel];

boolean run_flag = true;
boolean done = false;
int pre_ch_read_frame = maxBlockSize; //actual number of samples per channel read each iteration
do{
    int readed_frame = io.FileSourceRead(f32_buffer_interleave_in, pre_ch_read_frame);
    if(readed_frame<=0){
        Log.e(TAG,"read file end");
        done = true;
        break;
    } else{
        //float[left,right,left,right,***] -> float[left,left,***][right,right,***]
        boolean result = FunctionHelper.interleaveToPlanarFloat(f32_buffer_interleave_in,inAudioBuffer.data);
        if (!result){
            str_ret = "interleaveToPlanarFloat error:"+result;
            Log.e(TAG, str_ret);
            break;
        }

        inAudioBuffer.numberSamples = readed_frame;
        outAudioBuffer.numberSamples = readed_frame;

        ret = samiCore.SAMICoreProcess(inBlock, outBlock);
        if (ret != SAMICoreCode.SAMI_OK){
            if (ret == SAMICoreCode.SAMI_ENGINE_INPUT_NEED_MORE_DATA) continue;
            str_ret = "process error:"+ret;
            Log.e(TAG, str_ret);
            break;
        }

        if(outAudioBuffer.numberSamples > 0) {
            long write_num = io.AudioEncoderWritePlanarData(outAudioBuffer.data, num_channel, outAudioBuffer.numberSamples);
        }
    }
}while (run_flag);
  4. Get the delay / flush the tail data

Function:

class SAMICore{
    public int SAMICoreGetPropertyById(SAMICorePropertyId id, SAMICoreProperty property);
}

Purpose:
Gets the contents of the specified property.
Parameters:

Name | Type | Description
id | SAMICorePropertyId | Input. Property id.
property | SAMICoreProperty | Output. The property contents; see SAMICoreProperty below.

SAMICoreProperty

Field | Type | Description
id | SAMICorePropertyId | Input. Property id.
type | SAMICoreDataType | Input. Data type of the property to fetch.
dataObjectArray | Object[] | Output. Property contents.
dataArrayLen | int | Output. Number of objects in the property contents.
dataByteArray | byte[] | Input. Reserved; can be ignored.
writable | int | Input. Reserved; can be ignored.
extraInfo | String | Input. Extra query information; see the examples.

Return value
See com.mammon.audiosdk.SAMICoreCode for the error codes.

Return value | Meaning
SAMI_OK | Success
Other | Failure

Example:

  • Get the delay:

    After the handle has been created with enable_pre_delay=true, the algorithm prepends silent padding when processing starts; the call below returns exactly how many silent samples were added. It can be used any time after "SAMICoreCreateHandleByIdentify".

    SAMICoreProperty delayProperty = new SAMICoreProperty();
    delayProperty.id = SAMICorePropertyId.SAMICorePropertyID_Common_GetParam;
    delayProperty.type = SAMICoreDataType.SAMICoreDataType_String;
    delayProperty.extraInfo = "{\"type\":\"business_param\", \"param\":{\"name\":\"delay_wait_samples\"} }";
    int ret = samiCore.SAMICoreGetPropertyById(SAMICorePropertyId.SAMICorePropertyID_Common_GetParam,delayProperty);
    if(ret != SAMICoreCode.SAMI_OK) {
      System.out.println("SAMICoreGetProperty failed, ret " + ret);
      return ret;
    }
    //e.g. "{\"delay_wait_samples\":2717.0}"; parse it with a JSON library, see the demo
    System.out.println(delayProperty.dataObjectArray[0].toString());
    return SAMICoreCode.SAMI_OK;
    

    SAMICoreProperty::extraInfo format

    {
        "type":"business_param",
        "param":{
            "name":"delay_wait_samples"
        }
    }
    

    Parameters

    Name | Type | Description
    name | String | delay_wait_samples: returns how many silent samples the algorithm pads at the beginning of the output.

  • Flush the tail data:

    The algorithm buffers part of the data internally, so once all audio has been fed in, call this interface to retrieve everything that is still buffered. Streaming clients can ignore this interface. It is used after the processing stage.

    //helper function
    public float[][] flush() {
      SAMICoreProperty flushProperty = new SAMICoreProperty();
      flushProperty.id = SAMICorePropertyId.SAMICorePropertyID_Common_Flush;
      flushProperty.type = SAMICoreDataType.SAMICoreDataType_AudioBuffer;
      int ret = samiCore.SAMICoreGetPropertyById(SAMICorePropertyId.SAMICorePropertyID_Common_Flush,flushProperty);
      if(ret != SAMICoreCode.SAMI_OK) {
        return null;
      }
      SAMICoreAudioBuffer[] outAudioBufferArray = (SAMICoreAudioBuffer[])flushProperty.dataObjectArray;
      return outAudioBufferArray[0].data;
    }
    //usage
    //flushData[0] holds the first channel's samples
    //flushData[0].length is the number of samples in the first channel
    //flushData[1] holds the second channel's samples
    //flushData[1].length is the number of samples in the second channel
    float[][] flushData = samiCoreDenoiseV3.flush();
    
  5. Destroy the algorithm handle

Function:

class SAMICore{
    public int SAMICoreDestroyHandle()
}

Purpose: destroys the handle.
Return value
See com.mammon.audiosdk.SAMICoreCode for the error codes.

Return value | Meaning
SAMI_OK | Success
Other | Failure

Example:

samiCore.SAMICoreDestroyHandle();

Full example

Helper functions:

public class FunctionHelper {
    
    public static byte[] readBinaryFile(String fileName,Context context) {
        Context appContext = context;
        byte[] fileContentBuf = null;
        try {
            BufferedInputStream inputStream = new BufferedInputStream(appContext.getResources().getAssets().open(fileName));
            fileContentBuf = new byte[inputStream.available()];
            int currentSize = 0;
            byte[] buffer = new byte[1024];
            int readLen = 0;
            while ((readLen = inputStream.read(buffer)) != -1) {
                System.arraycopy(buffer, 0, fileContentBuf, currentSize, readLen);
                currentSize += readLen;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return fileContentBuf;
    }
    
    public static boolean interleaveToPlanarFloat(float[] interleave,float[][] planar) {
        int channels = planar.length;
        int numSamples = planar[0].length;
    
        if (interleave.length > (channels*numSamples)){
            return false;
        }
    
        for (int i = 0; i < numSamples; i++) {
            for (int j = 0; j < channels; j++) {
                planar[j][i] = interleave[i * channels + j];
            }
        }
        return true;
    }
}

Implementation:
Reads a file, runs it through the algorithm, and saves the result; see the DemoFileToFileRun function.

public class DeNoiseDeReverbDeHowlingV3Demo {
    private static String TAG = "DenoiseV3Demo";
    private Context context;

    //run flag
    private boolean run_flag = false;
    public void stop_run(){
        run_flag = false;
    }

    //dump
    private boolean enableDumpFile = false;
    private String dumpPath = "/sdcard/Music/";
    private String fileNamePrefix = "DenoiseV3Demo";
    public int setEnableDumpFileFlag(boolean enable,String dumpPath,String fileNamePrefix){
        this.enableDumpFile = enable;
        this.dumpPath = dumpPath;
        this.fileNamePrefix = fileNamePrefix;
        return 0;
    }

    //sdk
    public enum Function{
        ANS_SPEECH,
        ANS_MUSIC,
        DEREVERB,
        DEHOWLING
    }

    private SAMICore samiCore;
    private SAMICoreBlock inBlock;
    private SAMICoreBlock outBlock;
    private SAMICoreAudioBuffer inAudioBuffer;
    private SAMICoreAudioBuffer outAudioBuffer;

    private String initHandle(int sample_rate,int num_channel,int maxBlockSize,boolean enablePreDelay,Function function){
        String str_ret = "OK";

        //1. create handle
        //The identify and the model must match. Visit the https://www.volcengine.com/docs/6489/192740 for the correspondence
        String modelFileName = "";
        SAMICoreIdentify identify = null;
        if (function == Function.ANS_SPEECH){
            modelFileName = "model/denoise_v3/tcnunet_denoise_espresso_44k_speechpro_middle_v1.3.model";
            identify = SAMICoreIdentify.SAMICoreIdentify_EngineExecutor_CE_DENOISE;
        }else if(function == Function.ANS_MUSIC){
            modelFileName = "model/denoise_v3/tcnunet_denoise_espresso_44k_music_middle_v1.6.model";
            identify = SAMICoreIdentify.SAMICoreIdentify_EngineExecutor_CE_DENOISE;
        }else if (function == Function.DEREVERB){
            modelFileName = "model/denoise_v3/ftgru_dereverb_espresso_44k_v1.8.model";
            identify = SAMICoreIdentify.SAMICoreIdentify_EngineExecutor_CE_DEREVERB;
        }else if (function == Function.DEHOWLING){
            modelFileName = "model/denoise_v3/tcnunet_denoise_espresso_44k_howling_middle_v1.4.model";
            identify = SAMICoreIdentify.SAMICoreIdentify_EngineExecutor_CE_HOWLING;
        }

        samiCore = new SAMICore();
        SAMICoreExecutorContextCreateParameter parameter = new SAMICoreExecutorContextCreateParameter();
        parameter.sampleRate = sample_rate;
        parameter.numChannel = num_channel;
        parameter.maxBlockSize = maxBlockSize;
        parameter.modelBuffer = FunctionHelper.readBinaryFile(modelFileName,context);
        parameter.modelLen = parameter.modelBuffer.length;
        parameter.bussinessInfo= "sami audio demo";
        parameter.numberAudioData = 1;
        if (enablePreDelay){
            parameter.configInfo="{\"utility\":\"CommonUtility\"," +
                                            "\"enable_stereo\":true" +
                                            "\"enable_pre_delay\":true\"}";
        }else{
            parameter.configInfo="{\"utility\":\"CommonUtility\",\"enable_stereo\":true}";
        }
        int ret = samiCore.SAMICoreCreateHandleByIdentify(identify, parameter);
        if (ret != SAMICoreCode.SAMI_OK) {
            return "SAMICoreCreateHandleByIdentify error:"+ret;
        }

        //2. Set the noise reduction ratio; 1 gives the strongest noise reduction
        float speech_ratio = 1.0F;
        SAMICoreProperty samiCoreProperty = new SAMICoreProperty();
        samiCoreProperty.id = SAMICorePropertyId.SAMICorePropertyID_Common_SetParam;
        samiCoreProperty.type = SAMICoreDataType.SAMICoreDataType_String;
        samiCoreProperty.dataObjectArray = new Object[1];
        String rateStr="{\"type\":\"alg_param\",\"param\":{\"speech_ratio\":"+Float.toString(speech_ratio)+"}}";
        samiCoreProperty.dataObjectArray[0] = rateStr;
        samiCoreProperty.dataArrayLen = 1;
        ret = samiCore.SAMICoreSetProperty(SAMICorePropertyId.SAMICorePropertyID_Common_SetParam,samiCoreProperty);
        if (ret != SAMICoreCode.SAMI_OK) {
            samiCore.SAMICoreDestroyHandle();
            return "SAMICoreSetProperty failed, ret " + ret;
        }

        //3. Obtain the number of samples of delay introduced by the algorithm
        SAMICoreProperty delayProperty = new SAMICoreProperty();
        delayProperty.id = SAMICorePropertyId.SAMICorePropertyID_Common_GetParam;
        delayProperty.type = SAMICoreDataType.SAMICoreDataType_String;
        delayProperty.extraInfo = "{\"type\":\"business_param\", \"param\":{\"name\":\"delay_wait_samples\"} }";
        ret = samiCore.SAMICoreGetPropertyById(SAMICorePropertyId.SAMICorePropertyID_Common_GetParam,delayProperty);
        if(ret != SAMICoreCode.SAMI_OK) {
            samiCore.SAMICoreDestroyHandle();
            return "SAMICoreGetProperty failed, ret" + ret;
        }
        //e.g. "{\"delay_wait_samples\":2717.0}"; parse it with a JSON library, see the demo
        Log.i(TAG,"delay_wait_samples:"+delayProperty.dataObjectArray[0].toString());

        //4. init the buffers needed for processing
        inAudioBuffer = new SAMICoreAudioBuffer();
        inAudioBuffer.numberChannels = num_channel;
        inAudioBuffer.numberSamples = maxBlockSize;
        inAudioBuffer.data = new float[num_channel][maxBlockSize];

        outAudioBuffer = new SAMICoreAudioBuffer();
        outAudioBuffer.numberChannels = num_channel;
        outAudioBuffer.numberSamples = maxBlockSize;
        outAudioBuffer.data = new float[num_channel][maxBlockSize];

        inBlock = new SAMICoreBlock();
        inBlock.dataType = SAMICoreDataType.SAMICoreDataType_AudioBuffer;
        inBlock.audioData = new SAMICoreAudioBuffer[1];
        inBlock.audioData[0] = inAudioBuffer;

        outBlock = new SAMICoreBlock();
        outBlock.dataType = SAMICoreDataType.SAMICoreDataType_AudioBuffer;
        outBlock.audioData = new SAMICoreAudioBuffer[1];
        outBlock.audioData[0] = outAudioBuffer;
        return "OK";
    }

    private void releaseHandle(){
        if (enableDumpFile){
            samiCore.SAMICoreReleaseDebugConfig();
        }
        samiCore.SAMICoreDestroyHandle();
    }

    public String DemoFileToFileRun(Context context,String inputFile, String outputFile,Function function){
        String str_ret = "OK";
        this.context = context;

        int maxBlockSize = 512;
        boolean enablePreDelay = false;

        //1. init the decoder (input file)
        SAMICoreIo io = new SAMICoreIo();
        int ret = io.FileSourceCreate(inputFile);
        if(ret != 0) {
            return "FileSourceCreate error:"+inputFile;
        }
        int num_channel = (int)io.FileSourceGetNumChannel();
        int sample_rate = (int)io.FileSourceGetSampleRate();

        //2. init the handle and buffers
        str_ret = initHandle(sample_rate,num_channel,maxBlockSize,enablePreDelay,function);
        if (!str_ret.equals("OK")){
            return str_ret;
        }

        //3. init the encoder (output file)
        SAMICoreAudioEncoderSettings settings = new SAMICoreAudioEncoderSettings();
        settings.format = SAMICoreAudioEncoderFormat.KWav_F32;
        settings.acc = SAMICoreAudioEncoderAcceleration.KSoftware;
        settings.threading = SAMICoreAudioEncoderThreading.KSingleThreaded;
        settings.num_threads = 0;
        ret = io.AudioEncoderCreate(settings);
        if(ret != 0) {
            return "AudioEncoderCreate error";
        }
        ret = io.AudioEncoderOpen(outputFile, sample_rate, num_channel, 64);
        if(ret != 0) {
            return "AudioEncoderOpen error";
        }

        float[] f32_buffer_interleave_in = new float[ maxBlockSize * num_channel];

        //4. process the audio
        run_flag = true;
        boolean done = false;
        int pre_ch_read_frame = maxBlockSize; //actual number of samples per channel read each iteration
        do{
            int readed_frame = io.FileSourceRead(f32_buffer_interleave_in, pre_ch_read_frame);
            if(readed_frame<=0){
                Log.e(TAG,"read file end");
                done = true;
                break;
            } else{
                //float[left,right,left,right,***] -> float[left,left,***][right,right,***]
                boolean result = FunctionHelper.interleaveToPlanarFloat(f32_buffer_interleave_in,inAudioBuffer.data);
                if (!result){
                    str_ret = "interleaveToPlanarFloat error:"+result;
                    Log.e(TAG, str_ret);
                    break;
                }

                inAudioBuffer.numberSamples = readed_frame;
                outAudioBuffer.numberSamples = readed_frame;

                ret = samiCore.SAMICoreProcess(inBlock, outBlock);
                if (ret != SAMICoreCode.SAMI_OK){
                    if (ret == SAMICoreCode.SAMI_ENGINE_INPUT_NEED_MORE_DATA) continue;
                    str_ret = "process error:"+ret;
                    Log.e(TAG, str_ret);
                    break;
                }

                if(outAudioBuffer.numberSamples > 0) {
                    long write_num = io.AudioEncoderWritePlanarData(outAudioBuffer.data, num_channel, outAudioBuffer.numberSamples);
                }
            }
        }while (run_flag);

        //5. Export the remaining data buffered inside the processor
        if (done && enablePreDelay==false){
            SAMICoreProperty flushProperty = new SAMICoreProperty();
            flushProperty.id = SAMICorePropertyId.SAMICorePropertyID_Common_Flush;
            flushProperty.type = SAMICoreDataType.SAMICoreDataType_AudioBuffer;
            ret = samiCore.SAMICoreGetPropertyById(SAMICorePropertyId.SAMICorePropertyID_Common_Flush,flushProperty);
            if(ret != SAMICoreCode.SAMI_OK) {
                str_ret = "flush error,ret:"+ret;
            }else{
                SAMICoreAudioBuffer[] outAudioBufferArray = (SAMICoreAudioBuffer[])flushProperty.dataObjectArray;
                long write_num = io.AudioEncoderWritePlanarData(outAudioBufferArray[0].data, num_channel, outAudioBufferArray[0].numberSamples);
                str_ret = "OK";
            }
        }else{
            if (!str_ret.equals("OK")){
                str_ret = "break";
            }
        }

        //6. release resources
        io.AudioEncoderClose();
        releaseHandle();
        return str_ret;
    }

}
Objective-C API

Header files:

#import "SAMICore.h"     //core functionality
#import "SAMICoreFileSource.h"  //helper: file decoding
#import "SAMICoreAudioEncoder.h"    //helper: file encoding

Integration steps:

  1. Create the algorithm handle

Function:

@interface SAMICore: NSObject
- (_Nullable instancetype)initWithIdentify:(SAMICore_Identify)identify
                                     param:(id _Nullable)param
                                    result:(int* _Nullable)result;
@end

Purpose:
Creates the algorithm handle.
Parameters:

Name | Type | Description
identify | SAMICore_Identify | Input. Identifies which algorithm to create. De-howling: SAMICore_Identify_EngineExecutor_CE_HOWLING; dereverberation: SAMICore_Identify_EngineExecutor_CE_DEREVERB; noise suppression: SAMICore_Identify_EngineExecutor_CE_DENOISE. Note: each feature's model must be paired with the matching identify.
param | id | Input. Generic parameter carrying the creation parameters; the concrete type differs per algorithm. Here it is SAMICore_CreateParameter, described below.
result | int* | Output. The result code; 0 on success, other error codes are listed in SAMICoreCode.h.

SAMICore_CreateParameter

Field | Type | Description
sampleRate | int | Input. Audio sample rate.
maxBlockSize | int | Input. Maximum number of samples per channel fed in per call; the algorithm pre-allocates memory based on this, so choose a value close to the real processing size.
modelBuffer | char* | Input. Model contents.
modelLen | int | Input. Length of the model contents.
numChannel | int | Input. Number of audio channels.
numAudioBuffer | int | Input. Number of input streams (not channels); fixed to 1 for noise suppression / dereverberation / de-howling (default 1).
jsonConfig | char* | Input. Extension field; not used yet.
bussinessInfo | String | Input. Identifies the calling business.
configInfo | String | Input. JSON string carrying extended parameters, for example: {"utility":"CommonUtility","enable_stereo":true,"enable_pre_delay":true}

configInfo

Field | Type | Description
utility | string | Input. Always set to CommonUtility.
enable_stereo | bool | Input. Default: false. With enable_stereo=true, two-channel input is processed channel by channel; with enable_stereo=false, only the first channel is processed and the result is copied over the second channel, saving half the computation.
enable_pre_delay | bool | Input. Default: false. The algorithm only produces output once it has received enough data, while real-time scenarios need one block out for every block in. With enable_pre_delay=true the algorithm returns silent padding at the beginning, which makes integration easier; enabling it is recommended for RTC scenarios.

Return value:
SAMICore*, the algorithm object.
Example:

bool enable_pre_delay = false;
int max_block_size = sample_rate/100;  //10ms; adjust to the actual scenario
int process_block_size = max_block_size;

NSData *fileData = [NSData dataWithContentsOfFile:model_path];
if(fileData==nil){
    NSLog(@"read model file:%@,error",model_path);
    return -1;
}

//3. init the algorithm
int result = 0;
SAMICore_CreateParameter *create_param = [[SAMICore_CreateParameter alloc] init];
create_param.sampleRate = sample_rate;
create_param.maxBlockSize = max_block_size;
create_param.modelBuffer = (char *)[fileData bytes];
create_param.modelLen = [fileData length];
create_param.numChannel = num_channel;
create_param.numAudioBuffer = 1;
create_param.bussinessInfo="oc_demo";
if(enable_pre_delay){
    create_param.configInfo="{\"utility\":\"CommonUtility\",\"enable_stereo\":true,\"enable_pre_delay\":true}";
}else{
    create_param.configInfo="{\"utility\":\"CommonUtility\",\"enable_stereo\":true}";
}
SAMICore *sami_core_handle = [[SAMICore alloc] initWithIdentify:SAMICore_Identify_EngineExecutor_CE_DENOISE param:create_param result:&result];
if(result != SAMI_OK) {
    NSLog(@"create handler failed: %d",result);
    return -1;
}
  2. Set the denoise strength / reset the algorithm handle

Function:

@interface SAMICore: NSObject
    - (int)setProperty:(SAMICore_Property* _Nonnull)inAudioProperty withId:(SAMICore_PropertyId)id;
@end

Purpose:
Sets a parameter.
Parameters:

Name | Type | Description
inAudioProperty | SAMICore_Property | Input. The property payload; see SAMICore_Property below.
id | SAMICore_PropertyId | Input. Property id.

SAMICore_Property

Field | Type | Description
type | SAMICore_DataType | Input. Data type of the property.
id | SAMICore_PropertyId | Input. Property id.
data | id | Input. Property contents.
writable | int | Input. Reserved; can be ignored.
extraInfo | const char* | Input. Reserved; can be ignored.

Return value
See SAMICoreCode.h for the error codes.

Return value | Meaning
SAMI_OK | Success
Other | Failure

Example:

  • Set the denoise strength

    In some cases a more natural result is preferred and the noise should not be removed completely; the denoise ratio can be set through this interface. Only supported before the processing call "processWithInBlock".

SAMICore_Property *core_property = [[SAMICore_Property alloc] init];
core_property.id = SAMICore_PropertyID_Common_SetParam;
core_property.data = @"{ \"type\":\"alg_param\",\"param\":{ \"speech_ratio\":1.0 } }";
core_property.type = SAMICore_DataType_String;
SAMICore_PropertyId property_id = SAMICore_PropertyID_Common_SetParam;
result = [sami_core_handle setProperty:core_property withId:property_id];
if(result != SAMI_OK) {
    NSLog(@"setProperty failed: %d",result);
    return -1;
}

SAMICore_Property::data format

{
    "type":"alg_param",
    "param":{
        "speech_ratio":1
    }
}

Parameters

Name | Type | Description
speech_ratio | float | Range: 0.0 <= x <= 1.0; default 1.0 (maximum denoising).

Note:

The denoise strength only affects the denoise models; it has no effect on dereverberation or de-howling.

  • Reset the algorithm handle

    After one piece of audio has been processed, reset the handle before processing a new one to clear the algorithm's internal state.

SAMICore_Property *resetProperty = [[SAMICore_Property alloc] init];
resetProperty.id = SAMICore_PropertyID_Common_Reset;
resetProperty.type = SAMICore_DataType_Null;
result = [sami_core_handle setProperty:resetProperty withId:SAMICore_PropertyID_Common_Reset];
if(result != SAMI_OK) {
    NSLog(@"resetProperty failed:%d",result);
}
  3. Process audio data

Function:

@interface SAMICore: NSObject
- (int)processWithInBlock:(SAMICore_AudioBlock* _Nullable)inBlock outBlock:(SAMICore_AudioBlock* _Nullable)outBlock;
@end

Purpose:
Processes audio data and returns the processed audio.
Parameters:

Name | Type | Description
inBlock | SAMICore_AudioBlock* | Input. Carries the audio data to be processed.
outBlock | SAMICore_AudioBlock* | Output. Carries the audio data returned by the algorithm.

SAMICore_AudioBlock

Field | Type | Description
dataType | SAMICore_DataType | Input. Type of the data below; for the in_block here it is SAMICore_DataType_AudioBuffer.
numberAudioData | unsigned int | Input. Number of items in the next field, data (not a byte count); set to 1 for this algorithm.
data | void* | Input. The actual payload, interpreted according to the type above; set it to SAMICore_AudioBuffer.

SAMICore_AudioBuffer

Field | Type | Description
numberChannels | unsigned int | Input. Number of channels.
numberSamples | unsigned int | Input. Number of samples per channel.
isInterleave | int | Input. Whether multi-channel data is stored interleaved; only the planar format is supported, so set it to 0.
data | float** | Input/output. The algorithm's input and output audio samples.

Return value
See SAMICoreCode.h for the error codes.

Return value | Meaning
SAMI_OK | Success
SAMI_ENGINE_INPUT_NEED_MORE_DATA | Not enough input for the algorithm yet; keep feeding data and output will follow.
Other | Failure

Example:

//6. init buffers
SAMICore_AudioBuffer *in_audio_buffer = [[SAMICore_AudioBuffer alloc] init];
in_audio_buffer.numberChannels = num_channel;
in_audio_buffer.numberSamples = max_block_size;
in_audio_buffer.data = malloc(num_channel * sizeof(float *));
in_audio_buffer.isInterleave = 0;


SAMICore_AudioBuffer *out_audio_buffer = [[SAMICore_AudioBuffer alloc] init];
out_audio_buffer.numberChannels = num_channel;
out_audio_buffer.numberSamples = max_block_size;
out_audio_buffer.data = malloc(num_channel * sizeof(float *));
out_audio_buffer.isInterleave = 0;

for(int c = 0; c < num_channel; ++c) {
    ((float**)in_audio_buffer.data)[c] = malloc(max_block_size * sizeof(float));
    ((float**)out_audio_buffer.data)[c] = malloc(max_block_size * sizeof(float));
}

SAMICore_AudioBlock *in_block = [[SAMICore_AudioBlock alloc] init];
in_block.dataType = SAMICore_DataType_AudioBuffer;
in_block.numberAudioData = 1;
in_block.audioData = in_audio_buffer;

SAMICore_AudioBlock *out_block = [[SAMICore_AudioBlock alloc] init];
out_block.dataType = SAMICore_DataType_AudioBuffer;
out_block.numberAudioData = 1;
out_block.audioData = out_audio_buffer;                                                                                

while(true){
    size_t real_size = 0;
    float *samples_data = (float*)[file_src readWithNumFrame:process_block_size RealSize:&real_size];
    if(real_size<=0){
        NSLog(@"read file end");
        break;
    }

    //interleave to planar
    for(int ch = 0; ch < num_channel; ++ch) {
        for(int sample = 0; sample < real_size; ++sample) {
            ((float **)(in_audio_buffer.data))[ch][sample] = samples_data[num_channel*sample + ch];
        }
    }

    in_audio_buffer.numberSamples = real_size;
    out_audio_buffer.numberSamples = real_size;

    result = [sami_core_handle processWithInBlock:in_block outBlock:out_block];
    if(result != SAMI_OK) {
        if(result == SAMI_ENGINE_INPUT_NEED_MORE_DATA) continue;
        else {
            NSLog(@"process error:%d",result);
            break;
        }
    }

    // write output block to file
    if(out_audio_buffer.numberSamples>0){
        [audio_encoder writePlanarData:(float**)out_audio_buffer.data
                           NumChannels:num_channel
                   NumSamplePerChannel:out_audio_buffer.numberSamples];
    }

}

for(int c = 0; c < num_channel; ++c) {
    free(((float**)in_audio_buffer.data)[c]);
    free(((float**)out_audio_buffer.data)[c]);
}
free ((float**)in_audio_buffer.data);
free ((float**)out_audio_buffer.data);
  4. Get the delay / flush the tail data

Function:

@interface SAMICore: NSObject
    - (int)getProperty:(SAMICore_Property* _Nonnull)outAudioProperty withId:(SAMICore_PropertyId)id;
@end

作用:获取指定参数内容
参数说明:

参数名参数类型参数说明
outAudioPropertySAMICore_Property出参,具体参数内容,见下文SAMICore_Property
idSAMICore_PropertyId入参, 参数的id

SAMICore_Property

| Parameter | Type | Description |
| --- | --- | --- |
| type | SAMICore_DataType | Input: the data type of the property |
| id | SAMICore_PropertyId | Input: the property id |
| data | id | Input: the property content |
| writable | int | Input: reserved field, can be ignored |
| extraInfo | const char* | Input: extra information describing what to query; see the example |

Return value
See SAMICoreCode.h for the full list of error codes.

| Return value | Meaning |
| --- | --- |
| SAMI_OK | Success |
| Other | Failure |

Examples:

  • Get the delay:

    After the algorithm handle has been created with enable_pre_delay=true, the algorithm pads silence at the very start of the processed audio. The call below retrieves exactly how many silent samples were added; it can be used any time after "initWithIdentify" has returned. A sketch of trimming this leading silence from the output follows after this list.

    int delay_wait_samples_json_parse(NSString* jsonString){
        NSData * jsonData = [jsonString dataUsingEncoding:NSUTF8StringEncoding];
    
        NSError *error;
        NSDictionary *jsonObject = [NSJSONSerialization JSONObjectWithData:jsonData options:kNilOptions error:&error];
    
        if (jsonObject==nil) {
            NSLog(@"Error parsing JSON: %@", error);
            return -1;
        } else {
            if (jsonObject[@"delay_wait_samples"]) {
                int delayWaitSamples = [jsonObject[@"delay_wait_samples"] intValue];
                return delayWaitSamples;
            } else {
                NSLog(@"The key 'delay_wait_samples' is not found in the JSON object.");
                return -1;
            }
        }
    }
    
    SAMICore_Property *delay_property = [[SAMICore_Property alloc] init];
    delay_property.id = SAMICore_PropertyID_Common_GetParam;
    delay_property.type = SAMICore_DataType_String;
    delay_property.extraInfo="{\"type\":\"business_param\",\"param\":{\"name\":\"delay_wait_samples\"}}";
    result = [sami_core_handle getProperty:delay_property withId:SAMICore_PropertyID_Common_GetParam ];
    if(result != SAMI_OK) {
        NSLog(@"getProperty failed: %d",result);
        return -1;
    }
    //delay_property.data is returned as an NSString*,
    //e.g. "{\"delay_wait_samples\":2717.0}"; parse it with any JSON parser
    NSLog(@"delay_wait_samples:%d",delay_wait_samples_json_parse(delay_property.data));
    

    Format of SAMICore_Property.extraInfo

    {
        "type":"business_param",
        "param":{
            "name":"delay_wait_samples"
        }
    }
    

    Parameter description

    | Parameter | Type | Description |
    | --- | --- | --- |
    | name | String | delay_wait_samples: queries how many silent samples the algorithm pads at the beginning of the output |
  • Retrieve the trailing (flush) data:

    Because the algorithm buffers part of the data internally, once all of the audio has been fed in you need to call this interface to fetch the remaining buffered data. Clients in purely streaming scenarios can ignore this interface. Call it after the data-processing step.

    SAMICore_Property *flushProperty = [[SAMICore_Property alloc] init];
    flushProperty.id = SAMICore_PropertyID_Common_Flush;
    flushProperty.type = SAMICore_DataType_AudioBuffer;
    SAMICore_PropertyId flush_property_id = SAMICore_PropertyID_Common_Flush;
    result = [sami_core_handle getProperty:flushProperty withId:flush_property_id];
    if(result != SAMI_OK) {
        NSLog(@"flushProperty failed:%d",result);
    }else{
        NSArray* array = (NSArray*)flushProperty.data;
        SAMICore_AudioBuffer* buffer = (SAMICore_AudioBuffer*)array[0];
        [audio_encoder writePlanarData:(float**)buffer.data
                           NumChannels:num_channel
                   NumSamplePerChannel:buffer.numberSamples];
    }
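
When enable_pre_delay is enabled, the first delay_wait_samples samples of the processed output are padded silence; the SDK only reports the count, and removing the padding is left to the caller. The fragment below is a minimal sketch of one way to drop that padding inside the processing loop of the complete example further down. samples_to_skip and offset_data are hypothetical names introduced here; all other identifiers reuse the variables from that example.

// Hedged sketch (not part of the SDK): skip the leading silence reported by
// delay_wait_samples. Assumes delay_property was queried as shown above and
// that the processing loop of the complete example is in scope.
int samples_to_skip = delay_wait_samples_json_parse(delay_property.data);

// ... inside the processing loop, instead of writing out_audio_buffer directly:
if(out_audio_buffer.numberSamples > 0) {
    int available = (int)out_audio_buffer.numberSamples;
    int skip = MIN(available, MAX(samples_to_skip, 0));
    samples_to_skip -= skip;

    if(available > skip) {
        // per-channel pointers offset past the remaining padded silence
        float* offset_data[num_channel];
        for(int ch = 0; ch < num_channel; ++ch) {
            offset_data[ch] = ((float**)out_audio_buffer.data)[ch] + skip;
        }
        [audio_encoder writePlanarData:offset_data
                           NumChannels:num_channel
                   NumSamplePerChannel:(available - skip)];
    }
}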
    

Complete example

The following example assumes the code is compiled with ARC enabled.

#import <Foundation/Foundation.h>
#import "SAMICore.h"
#import "SAMICoreFileSource.h"
#import "SAMICoreAudioEncoder.h"

int delay_wait_samples_json_parse(NSString* jsonString){
    NSData * jsonData = [jsonString dataUsingEncoding:NSUTF8StringEncoding];

    NSError *error;
    NSDictionary *jsonObject = [NSJSONSerialization JSONObjectWithData:jsonData options:kNilOptions error:&error];

    if (jsonObject==nil) {
        NSLog(@"Error parsing JSON: %@", error);
        return -1;
    } else {
        if (jsonObject[@"delay_wait_samples"]) {
            int delayWaitSamples = [jsonObject[@"delay_wait_samples"] intValue];
            return delayWaitSamples;
        } else {
            NSLog(@"The key 'delay_wait_samples' is not found in the JSON object.");
            return -1;
        }
    }
}

int main(int argc, char* argv[]) {
    @autoreleasepool {
        if(argc < 4) {
            NSLog(@"Usage: demo input.wav model.model output.wav");
            return -1;
        }

        NSLog(@"demo begin");
        NSString* input_path = [[NSString alloc] initWithCString:argv[1] encoding:NSUTF8StringEncoding];
        NSString* model_path = [[NSString alloc] initWithCString:argv[2] encoding:NSUTF8StringEncoding];
        NSString* output_path = [[NSString alloc] initWithCString:argv[3] encoding:NSUTF8StringEncoding];

        // 1. Initialize the input file (decoder)
        NSLog(@"input_path:%@, decoder",input_path);
        SAMICore_FileSource *file_src = [[SAMICore_FileSource alloc] create:input_path];
        if(!file_src) {
            NSLog(@"create File Source fail!");
            return -1;
        }

        size_t sample_rate = [file_src getSampleRate];
        size_t num_channel = [file_src getNumChannel];
        NSLog(@"input file sample_rate:%zu,num_channel:%zu",sample_rate,num_channel);

        // 2. Initialize the output file (encoder)
        NSLog(@"output_path:%@ encoder",output_path);
        SAMICore_AudioEncoderSettings *settings = [[SAMICore_AudioEncoderSettings alloc] initWithAudioEncoderFormat:kWav_F32
                                                                                           AudioEncoderAcceleration:kSoftware
                                                                                              AudioEncoderThreading:kSingleThreaded
                                                                                                         NumThreads:0];
        SAMICore_AudioEncoder *audio_encoder = [[SAMICore_AudioEncoder alloc] createWithAudioEncoderSettings:settings];
        if(!audio_encoder) {
            NSLog(@"create Audio Encoder fail!");
            return -1;
        }
        int result = [audio_encoder openWithOutputPath:output_path
                                            SampleRate:sample_rate
                                           NumChannels:num_channel
                                               BitRate:64];
        if(result != 0) {
            NSLog(@"open output file fail,ret:%d",result);
            return -1;
        }

        bool enable_pre_delay = false;
        int max_block_size = sample_rate/100;
        int process_block_size = max_block_size;

        NSData *fileData = [NSData dataWithContentsOfFile:model_path];
        if(fileData==nil){
            NSLog(@"read model file:%@,error",model_path);
            return -1;
        }

        // 3. Initialize the algorithm
        SAMICore_CreateParameter *create_param = [[SAMICore_CreateParameter alloc] init];
        create_param.sampleRate = sample_rate;
        create_param.maxBlockSize = max_block_size;
        create_param.modelBuffer = (char *)[fileData bytes];
        create_param.modelLen = [fileData length];
        create_param.numChannel = num_channel;
        create_param.numAudioBuffer = 1;
        create_param.bussinessInfo="oc_demo";
        if(enable_pre_delay){
            create_param.configInfo="{\"utility\":\"CommonUtility\",\"enable_stereo\":true,\"enable_pre_delay\":true}";
        }else{
            create_param.configInfo="{\"utility\":\"CommonUtility\",\"enable_stereo\":true}";
        }
        SAMICore *sami_core_handle = [[SAMICore alloc] initWithIdentify:SAMICore_Identify_EngineExecutor_Common param:create_param result:&result];
        if(result != SAMI_OK) {
            NSLog(@"create handler failed: %d",result);
            return -1;
        }

        // 4. Set the denoise strength
        SAMICore_Property *core_property = [[SAMICore_Property alloc] init];
        core_property.id = SAMICore_PropertyID_Common_SetParam;
        core_property.data = @"{ \"type\":\"alg_param\",\"param\":{ \"speech_ratio\":1.0 } }";
        core_property.type = SAMICore_DataType_String;
        SAMICore_PropertyId property_id = SAMICore_PropertyID_Common_SetParam;
        result = [sami_core_handle setProperty:core_property withId:property_id];
        if(result != SAMI_OK) {
            NSLog(@"setProperty failed: %d",result);
            return -1;
        }

        // 5. Get the delay (number of silent samples padded at the start)
        if(enable_pre_delay){
            SAMICore_Property *delay_property = [[SAMICore_Property alloc] init];
            delay_property.id = SAMICore_PropertyID_Common_GetParam;
            delay_property.type = SAMICore_DataType_String;
            delay_property.extraInfo="{\"type\":\"business_param\",\"param\":{\"name\":\"delay_wait_samples\"}}";
            result = [sami_core_handle getProperty:delay_property withId:SAMICore_PropertyID_Common_GetParam ];
            if(result != SAMI_OK) {
                NSLog(@"getProperty failed: %d",result);
                return -1;
            }
            //delay_property.data is returned as an NSString*,
            //e.g. "{\"delay_wait_samples\":2717.0}"; parse it with any JSON parser
            NSLog(@"delay_wait_samples:%d",delay_wait_samples_json_parse(delay_property.data));
        }

        // 6. Initialize input/output buffers (planar layout)
        SAMICore_AudioBuffer *in_audio_buffer = [[SAMICore_AudioBuffer alloc] init];
        in_audio_buffer.numberChannels = num_channel;
        in_audio_buffer.numberSamples = max_block_size;
        in_audio_buffer.data = malloc(num_channel * sizeof(float *));
        in_audio_buffer.isInterleave = 0;


        SAMICore_AudioBuffer *out_audio_buffer = [[SAMICore_AudioBuffer alloc] init];
        out_audio_buffer.numberChannels = num_channel;
        out_audio_buffer.numberSamples = max_block_size;
        out_audio_buffer.data = malloc(num_channel * sizeof(float *));
        out_audio_buffer.isInterleave = 0;

        for(int c = 0; c < num_channel; ++c) {
            // allocate one sample buffer per channel (note: sizeof(float), not sizeof(float *))
            ((float**)in_audio_buffer.data)[c] = malloc(max_block_size * sizeof(float));
            ((float**)out_audio_buffer.data)[c] = malloc(max_block_size * sizeof(float));
        }

        SAMICore_AudioBlock *in_block = [[SAMICore_AudioBlock alloc] init];
        in_block.dataType = SAMICore_DataType_AudioBuffer;
        in_block.numberAudioData = 1;
        in_block.audioData = in_audio_buffer;

        SAMICore_AudioBlock *out_block = [[SAMICore_AudioBlock alloc] init];
        out_block.dataType = SAMICore_DataType_AudioBuffer;
        out_block.numberAudioData = 1;
        out_block.audioData = out_audio_buffer;

        while(true){
            size_t real_size = 0;
            float *samples_data = (float*)[file_src readWithNumFrame:process_block_size RealSize:&real_size];
            if(real_size<=0){
                NSLog(@"read file end");
                break;
            }

            //interleave to planar
            //float[left,right,left,right,***] -> float[left,left,***][right,right,***]
            for(int ch = 0; ch < num_channel; ++ch) {
                for(int sample = 0; sample < real_size; ++sample) {
                    ((float **)(in_audio_buffer.data))[ch][sample] = samples_data[num_channel*sample + ch];
                }
            }

            in_audio_buffer.numberSamples = real_size;
            out_audio_buffer.numberSamples = real_size;

            result = [sami_core_handle processWithInBlock:in_block outBlock:out_block];
            if(result != SAMI_OK) {
                if(result == SAMI_ENGINE_INPUT_NEED_MORE_DATA) continue;
                else {
                    NSLog(@"process error:%d",result);
                    break;
                }
            }

            // write output block to file
            if(out_audio_buffer.numberSamples>0){
                [audio_encoder writePlanarData:(float**)out_audio_buffer.data
                                   NumChannels:num_channel
                           NumSamplePerChannel:out_audio_buffer.numberSamples];
            }

        }

        // 7. Retrieve the trailing (flush) data
        if(enable_pre_delay==false){
            SAMICore_Property *flushProperty = [[SAMICore_Property alloc] init];
            flushProperty.id = SAMICore_PropertyID_Common_Flush;
            flushProperty.type = SAMICore_DataType_AudioBuffer;
            SAMICore_PropertyId flush_property_id = SAMICore_PropertyID_Common_Flush;
            result = [sami_core_handle getProperty:flushProperty withId:flush_property_id];
            if(result != SAMI_OK) {
                NSLog(@"flushProperty failed:%d",result);
            }else{
                NSArray* array = (NSArray*)flushProperty.data;
                SAMICore_AudioBuffer* buffer = (SAMICore_AudioBuffer*)array[0];
                [audio_encoder writePlanarData:(float**)buffer.data
                                   NumChannels:num_channel
                           NumSamplePerChannel:buffer.numberSamples];
            }
        }

        // 8. Reset the algorithm processor
        SAMICore_Property *resetProperty = [[SAMICore_Property alloc] init];
        resetProperty.id = SAMICore_PropertyID_Common_Reset;
        resetProperty.type = SAMICore_DataType_Null;
        result = [sami_core_handle setProperty:resetProperty withId:SAMICore_PropertyID_Common_Reset];
        if(result != SAMI_OK) {
            NSLog(@"resetProperty failed:%d",result);
        }

        // 9. Release resources
        for(int c = 0; c < num_channel; ++c) {
            free(((float**)in_audio_buffer.data)[c]);
            free(((float**)out_audio_buffer.data)[c]);
        }
        free ((float**)in_audio_buffer.data);
        free ((float**)out_audio_buffer.data);
        [audio_encoder close];
        return 0;
    } // autoreleasepool
}
Additional information

Must-read before integration:

Materials

For the SDK, demo, and related resources, see: SDK版本发布--音频技术-火山引擎 (SDK releases - Audio Technology - Volcano Engine)