[C/OC/Java] Intelligent Audio Karaoke Solution - Audio Technology - Volcano Engine

Last updated: 2023.11.13 14:43:34. First published: 2023.04.12 11:23:01

Karaoke Experience SDK Integration Guide

Recording Page

API Description

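C++: the header is sami_core_karaoke_record_graph.h; use the class SAMI::KaraokeRecordGraph
OC: the header is SAMICoreKaraokeRecord.h; API names and functions match the C++ ones
Java: the source file is SAMICoreKaraokeRecord.java; API names and functions match the C++ ones. Some parameters and return values differ and are marked in this document. Differences in basic types, such as bool (boolean in Java) and std::string (String in Java), are not marked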

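Function	API name	Parameters	Return value and remarks
Initialize	C++/Java: `init` OC: `initRecordingGraphWithSettingParam`	C++: struct KaraokeRecordSettingParam { std::string accompany_path; /* accompaniment file path */ std::string original_path; /* original vocal track file path */ int sample_rate; /* sample rate for recording and playback: 44100/48000/16000 */ int max_block_samples; /* maximum number of frames the player requests per call; must not exceed 65536 */ std::string extra_config; /* extra settings, e.g. enabling the SDK's built-in recording/playback */ KaraokeMessageCallback message_callback; /* tracking callback that exposes internal tracking events */ }; typedef std::function<void(KaraokeMessageId id, void* info)> KaraokeMessageCallback; OC: `SAMICore_KaraokeRecordSettingParam` Java: `SAMICoreKaraokeRecord.KaraokeRecordParamSetting`	0: created successfully; otherwise the failure is logged and an error code is returned. Notes: 1. The SDK outputs two-channel data to the ear monitor by default. 2. `max_block_samples` must not exceed 65536, otherwise an error is reported. 3. Accompaniment and original files must be wav or mp3. 4. Only the sample rates 44100/48000/16000 are supported. 5. `extra_config` must be valid JSON, e.g. `"{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"xxx/xxx/\" }}";` 6. The SDK's built-in recording/playback is supported on Android/iOS only.
Push recorded audio	`pushMicAudioData`	`float** in_data`: recorded data; two-channel non-interleaved data must be split into one buffer per channel. `int num_channels`: number of channels in the recorded data. `int num_samples`: number of samples per channel in this call. `bool interleaved`: must be set correctly for two-channel data.	0: success; otherwise a specific error code is returned.
Enable echo cancellation (AEC)	`setAECParam`	`std::string`: path to the AEC model	0: created successfully; otherwise a specific error code is returned. Use aec44k_v2.2_modify_time_1s.
Enable latency detection	`setTimeAlignParam`	`std::string`: path to the timeAlign model	0: created successfully; otherwise a specific error code is returned. Use time_align_44k_v1.0.model.
Dry vocal output file	`setOutVocalFileParam`	`std::string`: path of the file to write	0: encoder created successfully; otherwise a specific error code is returned. Note: if the dry vocal cannot be saved, the editing page features are unavailable.
Enable karaoke scoring	`setSingScoreParam`	`int score_mode`: scoring type; only 1 (pitch scoring) is currently supported. `std::string lyric_path`: path to the krc lyric file. `std::string midi_path`: path to the midi file.	0: created successfully; otherwise a specific error code is returned. krc is a widely used lyric file format.
Enable vocal loudness detection	`openVocalLoudnessExtractor`		0: created successfully; otherwise a specific error code is returned.
Prepare internal environment	`prepare`		0: success; otherwise a specific error code is returned. Call it only once.
Start	`play`		0: success; otherwise a specific error code is returned. Can be called after prepare or after pause.
Pause	`pause`		0: success; otherwise a specific error code is returned. After pausing, pullAudioData returns only silence.
Seek	`seek`	`float seek_to_ms`: absolute position in the accompaniment file to seek to, in milliseconds. `float count_down_ms`: duration of the countdown, in milliseconds.	0: success; otherwise a specific error code is returned. Supported since 2023-04-17. seek_to_ms is the accompaniment time at which the countdown ends; count_down_ms is how long the countdown lasts. During the countdown the accompaniment plays normally, but recorded data is not written to the dry vocal file and no scoring is computed. Parameter validity check: seek_to_ms >= 0 && seek_to_ms <= total accompaniment duration && seek_to_ms >= count_down_ms. Note: to keep the latency of the dry vocal file unchanged, recording/playback does not need to be stopped when calling pause or seek.
Stop	`stop`		0: success; otherwise a specific error code is returned.
Pull playback data on the playback thread	`pullAudioData`	`float**`: non-interleaved storage for two channels. `int num_samples`: length of the data to pull (samples per channel).	Returns the number of samples per channel actually obtained, or -1 on error. Note: `num_samples` must not exceed `max_block_samples`, otherwise -1 is returned immediately.
Get total duration	`getTotalDurationMs`		Duration of the accompaniment file in milliseconds. (The original and accompaniment files usually have the same duration; the shorter of the two is used.)
Get current position	`getCurrentPositionMs`		Current recording position in milliseconds.
Update vocal volume in the ear monitor	`updateMonitorVocalVolume`	`float`: gain in dB, [-70, +35]; -70 is mute.	Gain applied to the vocal in the ear monitor. The default is 0, meaning the volume is not changed.
Update accompaniment volume in the ear monitor	`updateMonitorBGMVolume`	`float`: gain in dB, [-70, +35]	Gain applied to the accompaniment in the ear monitor.
Switch between original and accompaniment	`switchBGMMode`	`enum` `KaraokeBGMMode{``Accompany, Original``}`	0: success; otherwise a specific error code is returned.
Adjust BGM pitch	`updateBGMPitch`	`int`: number of semitones to raise or lower, [-12, +12]	0: success; otherwise a specific error code is returned.
Get real-time scoring data	`getRealTimeScoreInfo`	C++/Java: `SAMICoreMulDimSingScoringRealtimeInfo` OC: `SAMICore_MulDimSingScoringRealtimeInfo`. The struct fields are described in the remarks column; the SDK fills them in.	0: success; otherwise an error code is returned and the result is invalid. `SAMICoreMulDimSingScoringRealtimeInfo` is used to display pitch scoring in the UI. Fields: `double timeMilliseconds; /* timestamp of the current result in the scoring module */ double songScore; /* total score of the sentences sung so far */ int sentenceCount; /* number of sentences sung so far */ int sentenceIndex; /* lyric line index of the last completed sentence */ double sentenceScore; /* score of the last completed sentence */ double userPitch; /* note value actually sung by the user; valid when > 0 */ double refPitch; /* reference pitch in the midi at the current time; valid when > 0 */`
Get overall score	`getOverallScoreInfo`	C++/Java: `SAMICoreMulDimSingScoringOverallInfo` OC: `SAMICore_MulDimSingScoringOverallInfo`. The struct fields are described in the remarks column; the SDK fills them in.	0: success; otherwise the result is invalid. The result struct contains `note_score`, the pitch score.
Get overall loudness	`getLoudnessOverallFeatures`	`float& global_lufs`: overall loudness of the dry vocal. `float& global_peak`: overall peak of the dry vocal. The SDK assigns both parameters.	0: computed successfully; otherwise an error code is returned. `global_lufs` and `global_peak` can be used for loudness normalization in the editing scenario. The default value is `(0, 0)`.
Get latency detection result	`getTimeAlignResultMs`	`float delay_ms`: latency value	0: computed successfully; otherwise an error code is returned. `delay_ms` is the offset of the mic relative to the reference; a positive value means the mic is delayed.
Write analysis results	`writeRecordInfoToFile`	Path of the file the results are written to	Writes the latency detection, loudness detection, and other results to a file. The file must be passed to the editing page graph's init so the SDK can read the values.
Release resources	Java only: `release`		Releases native-layer resources. Make sure stop has been called before releasing. After release, no graph methods may be called.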
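
The `seek` row states a validity rule for its two parameters. Because playback resumes `count_down_ms` before `seek_to_ms`, the seek target can never be earlier than the countdown length. A minimal sketch of that check, written as a hypothetical app-side helper (`isValidSeekRequest` is not an SDK function), might look like this:

```cpp
#include <cassert>

// Mirrors the documented check for seek():
//   seek_to_ms >= 0 && seek_to_ms <= total duration && seek_to_ms >= count_down_ms
// isValidSeekRequest is a hypothetical helper name, not part of the SDK.
bool isValidSeekRequest(float seek_to_ms, float count_down_ms, float total_duration_ms) {
    assert(total_duration_ms >= 0.0f);  // e.g. the value returned by getTotalDurationMs()
    return seek_to_ms >= 0.0f &&
           seek_to_ms <= total_duration_ms &&
           seek_to_ms >= count_down_ms;
}
```

An app could run this before calling `seek` to show a UI hint instead of waiting for an error code from the SDK.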

C++ sample code

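#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

#include "sami_core_karaoke_record_graph.h"

// Loop flags owned by the app (placeholders); set them to false to end the worker threads.
static std::atomic<bool> recording_{true};
static std::atomic<bool> playing_{true};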

auto message_callback = [](KaraokeMessageId id, void* info) {
    // app report log 
};

int main(int argc, char* argv[]) {
    // init params
    SAMI::KaraokeRecordSettingParam setting_param;
    setting_param.accompany_path = "/path/to/accompany.wav";
    setting_param.original_path = "/path/to/original-sing.wav";
    setting_param.sample_rate = 44100;  // should be player samplerate
    setting_param.max_block_samples = 4096; // player callback buffersize
    setting_param.message_callback = message_callback;
 
    SAMI::KaraokeRecordGraph graph;
    int ret = graph.init(setting_param);
    if(ret != 0) {
        return -1;
    }
   
    // set record callback. Must 
    // graph.setMicSourceCallback(micCallback);
   
    // set recorded vocal file path. Must
    graph.setOutVocalFileParam("/path/to/vocal.wav");
   
    // turn on aec if needed. Succeed if ret == 0
    ret = graph.setAECParam("/path/to/aec.model");

    // turn on time align of music and vocal if needed. Succeed if ret == 0
    ret = graph.setTimeAlignParam("/path/to/time_align.model");
   
    // turn on vocal loudness detection if needed. Succeed if ret == 0
    ret = graph.openVocalLoudnessExtractor();
   
    // turn on pitch SingScore. Succeed if ret == 0
    graph.setSingScoreParam(1,
                         "/path/to/lyric.krc",
                         "/path/to/song.mid");
   
    // prepare, should be called once
    graph.prepare();
   
    // start the graph
    graph.play();
   
    // push mic data to sdk
    std::thread recordThread = std::thread([&]() {
        float** in_data = nullptr; // record data; point this at the app's capture buffers
        int record_channel = 1; // maybe 2
        bool interleaved = false; // maybe true
        int frame = 0; // samples per channel in the current buffer
        while (recording_) {
            // copy data from device
            get_buffer_from_devices(in_data, &record_channel, &interleaved, &frame); // should be implemented by the app

            graph.pushMicAudioData(in_data, record_channel, frame, interleaved);
        }
    });

    // mock play thread
    std::thread playThread = std::thread([&]() {
        float* data[2]; // one buffer per channel (non-interleaved)
        data[0] = new float[setting_param.max_block_samples];
        data[1] = new float[setting_param.max_block_samples];
        while (playing_) {
            int frames = graph.pullAudioData(data, setting_param.max_block_samples);
            // play
        }
        delete [] data[0];
        delete [] data[1];
    });
   
   // mock UI thread: get realTimeScore result and show the result
   std::thread scoreUIThread = std::thread([&](){
      while (playing_) {
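          // poll roughly once every 512 samples
          std::this_thread::sleep_for(std::chrono::microseconds(
              static_cast<long long>(512.0 / setting_param.sample_rate * 1e6)));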
          SAMICoreMulDimSingScoringRealtimeInfo info;
          int ret = graph.getRealTimeScoreInfo(info);
          // show the result
      }
   });
   
   // mock UI interaction thread: update monitor vocal volume, switch bgm mode, etc.
   std::thread UIInteractionThread = std::thread([&](){
       while(playing_) {
           // update monitor vocal volume
           graph.updateMonitorVocalVolume(10);
           
           // update monitor bgm volume
           graph.updateMonitorBGMVolume(10);
           
           // switch bgm mode: Accompany or Original
           graph.switchBGMMode(KaraokeBGMMode::Original);
           
           // update bgm pitch
           graph.updateBGMPitch(4);
       }
   });
   
   {
       // pause the graph. After paused, the pullAudioData will get all zeros
       graph.pause();
       
       // resume again
       graph.play();
    }
   
    recordThread.join();
    playThread.join();
    scoreUIThread.join();
    UIInteractionThread.join();
   
    // stop
    graph.stop();
   
    float loudness = 0;
    float peak = 0;
    ret = graph.getLoudnessOverallFeatures(loudness, peak);
    printf("vocal loudness: status = %d, lufs = %f, peak = %f\n", ret, loudness, peak);

    float delay_ms = 0;
    ret = graph.getTimeAlignResultMs(delay_ms);
    printf("time align result: %d, :%f\n", ret, delay_ms);

    SAMICoreMulDimSingScoringOverallInfo info;
    ret = graph.getOverallScoreInfo(info);
    printf("overall_score_info result: status = %d, note_score = %f \n", ret, info.note_score);
   
    // after stopped, write some result to json file, which will be used in EditPage 
    graph.writeRecordInfoToFile("path/to/record_info.json");
  
    return 0;
}
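
`pullAudioData` fills one buffer per channel (non-interleaved), while many device callbacks expect a single interleaved buffer. The sketch below shows one way to convert between the two layouts. `interleave` is a hypothetical app-side helper, not part of the SDK, and it assumes every channel buffer holds at least `num_samples` valid samples.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Converts planar audio (one buffer per channel) into one interleaved buffer:
// L0 R0 L1 R1 ... for two channels.
// interleave is a hypothetical helper name, not part of the SDK.
std::vector<float> interleave(const float* const* planar, int num_channels, int num_samples) {
    assert(planar != nullptr && num_channels > 0 && num_samples >= 0);
    std::vector<float> out(static_cast<std::size_t>(num_channels) * num_samples);
    for (int i = 0; i < num_samples; ++i) {
        for (int ch = 0; ch < num_channels; ++ch) {
            out[static_cast<std::size_t>(i) * num_channels + ch] = planar[ch][i];
        }
    }
    return out;
}
```

In the play thread above, this could be called as `interleave(data, 2, frames)` whenever `frames` is positive, before handing the result to the audio device.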

OC sample code

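#import "SAMICoreKaraokeRecord.h"
#import "SAMICore.h"

#include <atomic>
#include <chrono>
#include <string>
#include <thread>

// Loop flag owned by the app (placeholder); set it to false to end the worker threads.
static std::atomic<bool> playing_{true};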

int main() {   
    // init param
    SAMICore_KaraokeRecordSettingParam *param = [[SAMICore_KaraokeRecordSettingParam alloc] init];
    std::string accompany_path = "/path/to/accompany.wav";
    param.accompany_path = accompany_path.c_str();
    std::string original_path = "/path/to/original.wav";
    param.original_path = original_path.c_str();
    param.sample_rate = 44100;  // play samplerate
    param.max_block_samples = 4096; // player callback buffersize
    NSString *documentFilePath = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory,NSUserDomainMask,YES).firstObject;
    NSString *extra_config = [NSString stringWithFormat:@"{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"%@\" } }", documentFilePath];
    param.extra_config = [extra_config UTF8String];
    param.message_callback = ^(SAMICore_KaraokeMessageId id, NSDictionary* info) {
        // app report log 
    };

    // create graph object
    SAMICoreKaraokeRecord* graph = [SAMICoreKaraokeRecord alloc];
    int ret = [graph initRecordingGraphWithSettingParam:param];
    if(ret != 0) {
        return ret;
    }

    // turn on aec if needed
    std::string aec_model_path = "/path/to/aec.model";
    ret = [graph setAECParam:aec_model_path.c_str()];

    // turn on time_align if needed
    std::string time_align_model_path = "/path/to/time_align.model";
    ret = [graph setTimeAlignParam:time_align_model_path.c_str()];

    // turn on vocal volume detect if needed
    ret = [graph openVocalLoudnessExtractor];

    // turn on singscore if needed
    std::string lyric_path = "/path/to/song.krc";
    std::string midi_path = "/path/to/song.mid";
    ret = [graph setSingScoreParam:1 lyric_path:lyric_path.c_str() midi_path:midi_path.c_str()];
    
    // set vocal file saved path
    std::string out_vocal_path = "/path/to/vocal.wav";
    ret = [graph setOutVocalFileParam:out_vocal_path.c_str()];

    [graph prepare];
    [graph play];
    
    // mock UI thread
    std::thread UIThread = std::thread([&]() {
        while (playing_) {
            SAMICore_MulDimSingScoringRealtimeInfo* info = [[SAMICore_MulDimSingScoringRealtimeInfo alloc] init];
            int ret = [graph getRealTimeScoreInfo:info];   
        }
    });

    // mock interactive thread
    std::thread userInteractiveThread = std::thread([&]() {
        while (playing_) {
            std::this_thread::sleep_for(std::chrono::seconds(1));
            // update vocal volume
            [graph updateMonitorVocalVolume:(+5)];
            std::this_thread::sleep_for(std::chrono::seconds(1));
            // update bgm volume
            [graph updateMonitorBGMVolume:(+5)];
            std::this_thread::sleep_for(std::chrono::seconds(1));
            // switch bgm mode
            [graph switchBGMMode:SAMICore_KaraokeBGMMode_Original];
            
            std::this_thread::sleep_for(std::chrono::seconds(1));
            [graph switchBGMMode:SAMICore_KaraokeBGMMode_Accompany];
            
            // update bgm pitch
            std::this_thread::sleep_for(std::chrono::seconds(1));
            [graph updateBGMPitch:4];
        }
    });
   
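    // end the worker loops and wait for them before stopping the graph
    playing_ = false;
    UIThread.join();
    userInteractiveThread.join();

    // stop the graph
    [graph stop];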
   
    // get some global information
    SAMICore_MulDimSingScoringOverallInfo *overallInfo = [[SAMICore_MulDimSingScoringOverallInfo alloc] init];
    [graph getOverallSingScoreInfo:overallInfo];

    float global_loudness = 0;
    float global_peak = 0;
    ret = [graph getLoudnessOverallFeatures:&global_loudness global_peak:&global_peak];
    
    float delay_ms = 0;
    ret = [graph getTimeAlignResultMs:&delay_ms];
    
    // write some global information to json file
    [graph writeRecordInfoToFile:"/path/to/record_info.json"];

    return 0;
 }

Java sample code

import com.mammon.audiosdk.SAMICoreKaraokeRecord;
import com.mammon.audiosdk.structures.SAMICoreMulDimSingScoringOverallInfo;
import com.mammon.audiosdk.structures.SAMICoreMulDimSingScoringRealtimeInfo;

public class SAMIKaraokeRecordDemo {
    private final SAMICoreKaraokeRecord recordGraphObj = new SAMICoreKaraokeRecord();
    private Thread UIInteractionThread = null;

    // build record graph and prepare
    public void prepare() {
        // res
        String accompany_path = "/path/to/accompany.wav";
        String original_path = "/path/to/original.wav";
        String midi_path = "/path/to/karaoke.mid";
        String krc_path = "/path/to/karaoke.krc";
        String vocal_path = "/path/to/karaoke_vocal.wav";
        String aec_model = "/path/to/aec.model";
        String time_align_model = "/path/to/time_align.model";
        String record_result_path = "/path/to/record_info.json";
    
        // param
        SAMICoreKaraokeRecord.KaraokeRecordParamSetting setting = new SAMICoreKaraokeRecord.KaraokeRecordParamSetting();
        setting.accompany_path = accompany_path;
        setting.original_path = original_path;
        setting.sample_rate = 44100; // 16000/44100/48000
        setting.max_block_samples = 1024; // player callback buffersize
        setting.extra_config = "{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"/sdcard/Download\" }}";
        setting.message_callback = new SAMICoreKaraokeMessageCallBack() {
            @Override
            public void MessageTracker(SAMICoreKaraokeMessageId id, SAMICoreKaraokeInfo info) {
                // app report log 
            }
        };
    
        Log.i(TAG, "karaoke record param: " + setting.toString());
    
        // init graph
        int ret = recordGraphObj.init(setting);
        if (ret != 0) {
            Log.e(TAG, "startTest: record graph init failed");
            return;
        }
    
        ret = recordGraphObj.setAECParam(aec_model);
        if (ret == 0) {
            Log.i(TAG, "enable_aec and init succeed\n");
        } else {
            Log.e(TAG, "enable_aec but init failed\n");
        }
    
        ret = recordGraphObj.setTimeAlignParam(time_align_model);
        if (ret == 0) {
            Log.i(TAG, "enable_time_align and init succeed\n");
        } else {
            Log.e(TAG, "enable_time_align but init failed\n");
        }
    
        ret = recordGraphObj.setSingScoreParam(1, krc_path, midi_path);
        if (ret == 0) {
            Log.i(TAG, "enable_sing_score and init succeed\n");
        } else {
            Log.e(TAG, "enable_sing_score but init failed\n");
        }
    
        ret = recordGraphObj.openVocalLoudnessExtractor();
        if (ret == 0) {
            Log.i(TAG, "enable_vocal_loudness and init succeed\n");
        } else {
            Log.e(TAG, "enable_vocal_loudness but init failed\n");
        }
    
        recordGraphObj.setOutVocalFileParam(vocal_path);
    
        ret = recordGraphObj.prepare();
        if (ret == 0) {
            Log.i(TAG, "graph prepare succeed\n");
        } else {
            Log.e(TAG, "graph prepare failed\n");
        }
    }
    
    public void play() {
        // start the graph
        recordGraphObj.play();

        // ui interaction thread, mock function
        UIInteractionThread = new Thread() {
            @Override
            public void run() {
                // update monitor vocal volume
                recordGraphObj.updateMonitorVocalVolume(10);

                // update monitor bgm volume
                recordGraphObj.updateMonitorBGMVolume(10);

                // switch bgm mode: Accompany or Original
                recordGraphObj.switchBGMMode(SAMICoreKaraokeRecord.KaraokeBGMMode.Original);

                // update bgm pitch
                recordGraphObj.updateBGMPitch(4);
            }
        };

        UIInteractionThread.start();
    }
    
    public void stop() {
        try {
            UIInteractionThread.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        int ret = recordGraphObj.stop();
        if (ret == 0) {
            System.out.println("graph stop succeed");
        } else {
            System.out.println("graph stop failed");
        }

        // same file as record_result_path in prepare(), which is a local variable there
        ret = recordGraphObj.writeRecordInfoToFile("/path/to/record_info.json");
        if (ret == 0) {
            System.out.println("graph set info file succeed");
        } else {
            System.out.println("graph set info file failed");
        }

        float[] loudness = new float[1];
        float[] peak = new float[1];
        ret = recordGraphObj.getLoudnessOverallFeatures(loudness, peak);
        System.out.println("vocal loudness: status = " + ret + ", lufs = " + loudness[0] + " , peak = " + peak[0]);
        
        float[] delay_ms = new float[1];
        ret = recordGraphObj.getTimeAlignResultMs(delay_ms);
        System.out.println("time align result: " + ret + ":" + delay_ms[0]);

        SAMICoreMulDimSingScoringOverallInfo info = new SAMICoreMulDimSingScoringOverallInfo();
        ret = recordGraphObj.getOverallScoreInfo(info);
        System.out.println("overall_score_info result: status = " + ret + ", note_score = " + info.note_score);
    }
   
}

Editing Page

Note

Data requested from the editing page can be used for playback or saved to a file.

API Description

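C++: the header is sami_core_karaoke_edit_graph.h; use the class SAMI::KaraokeEditGraph
OC: the header is SAMICoreKaraokeEdit.h; API names and functions match the C++ ones
Java: the source file is SAMICoreKaraokeEdit.java; API names and functions match the C++ ones. Some parameters and return values differ and are marked in this document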

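Function	API name	Parameters	Return value and remarks
Initialize	`init`	C++: struct KaraokeEditSettingParam { std::string vocal_path; /* dry vocal path: the audio file saved on the recording page */ std::string bgm_path; /* accompaniment path */ std::string record_result_path; /* JSON file saved after the recording page stops, containing its analysis results */ int sample_rate; /* playback sample rate on the editing page */ int max_block_samples; /* maximum number of samples per channel the player requests per call */ std::string extra_config; /* extra settings, e.g. enabling the SDK's built-in recording/playback */ KaraokeMessageCallback message_callback; /* tracking callback that exposes internal tracking events */ }; typedef std::function<void(KaraokeMessageId id, void* info)> KaraokeMessageCallback; OC: `SAMICore_KaraokeEditSettingParam` Java: `SAMICoreKaraokeEdit.KaraokeEditSettingParam`	0: success; otherwise a specific error code is returned and the error is logged. Notes: 1. `max_block_samples` must not exceed 65536, otherwise an error is reported. 2. Accompaniment files must be wav or mp3. 3. `extra_config` must be valid JSON, e.g. `"{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"xxx/xxx/\" }}";` 4. The SDK's built-in recording/playback is supported on Android/iOS only.
Enable denoising	`setDenoiseModelPath`	`std::string`: path to the denoise model	0: success; otherwise a specific error code is returned. Only unet_denoise_44k_music_model_v1.0.model is currently supported. Note: once set successfully, denoising is turned on by default.
Denoise switch	`setUseDenoise`	`bool`: whether to use denoising	0: success; otherwise a specific error code is returned. Note: it only takes effect when called after denoising has been enabled successfully.
Enable loudness normalization for the accompaniment	`setBGMLoudnormInfo`	C++/Java: `SAMICoreLoudnormProperty` OC: `SAMICore_LoudnormProperty`	`struct SAMICoreLoudnormProperty { float source_lufs; /* source loudness, LUFS */ float source_peak; /* source peak */ float target_lufs; /* target loudness */ }`
Enable loudness normalization for the vocal	`setVocalLoudnormInfo`	C++/Java: `SAMICoreLoudnormProperty` OC: `SAMICore_LoudnormProperty`	If loudness detection was enabled on the recording page and `record_result_path` is passed to the editing page, vocal loudness normalization is enabled by default.
Set or switch the effect	`updateEffectFilePath`	`std::string`: path to the resource file	0: success; otherwise a specific error code is returned.
Set the vocal/accompaniment alignment	`setVocalOffsetMs`	`float`: time in milliseconds	Adjusts the offset between vocal and accompaniment (range -1000 to +1000). The latency detection result can be used as a reference value.
Get total duration	`getTotalDurationMs`		Total playable duration on the editing page, in milliseconds; the duration of the dry vocal file is used.
Get current position	`getCurrentPositionMs`		Current playback position in milliseconds.
Pull playback data on the playback thread	`pullAudioData`	`float**`: currently only two-channel non-interleaved storage is supported. `int num_samples`: length of the data to pull.	Returns the number of samples per channel actually obtained, or -1 on error. Note: `num_samples` must not exceed `max_block_samples`, otherwise -1 is returned immediately.
Prepare internal environment	`prepare`		0: success; otherwise a specific error code is returned. Call it only once.
Start	`play`		0: success; otherwise a specific error code is returned. Can be called after prepare or after pause.
Seek	`seek`	`float seekMs`: time in milliseconds	0: success; otherwise a specific error code is returned.
Pause	`pause`		0: success; otherwise a specific error code is returned. After pausing, pullAudioData returns only silence.
Stop	`stop`		0: success; otherwise a specific error code is returned. After stopping, data can no longer be pulled and parameters can no longer be set.
Update vocal volume	`updateVocalVolume`	`float value_db`: range [-70, +35]; -70 is mute
Update accompaniment volume	`updateBGMVolume`	`float value_db`: range [-70, +35]; -70 is mute
Export a segment of audio	`exportAudioDataToAudioFile`	`std::string`: export file path. `float`: start time of the segment in milliseconds; a value beyond the length of the dry vocal file is an error. `float`: end time of the segment in milliseconds; a value beyond the length of the dry vocal file defaults to that length. `progress_callback`, typed `std::function<void(float)>` (C++) / `SAMICoreKaraokeProgressCallback` (Java) / `SAMICore_KaraokeProgressCallback` (OC): export progress callback, which reports the current export progress for UI display.	0: success; otherwise a specific error code is returned. Android supports wav and mp3; iOS supports aac and wav. It must be called in the prepared state (prepare has been called on the graph but play has not) or the paused state (pause has been called on the graph), and must not be called while pullAudioData is being called.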
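
Both `init` tables require `extra_config` to be valid JSON, and hand-escaping the quotes is easy to get wrong. The sketch below builds the same `backend_config` object shown in the notes from plain values. `jsonEscape` and `buildExtraConfig` are hypothetical app-side helpers, not SDK functions, and only the three keys that appear in this document are emitted.

```cpp
#include <cassert>
#include <string>

// Escapes double quotes and backslashes so the value is safe inside a JSON string.
// Control characters are not handled in this sketch.
std::string jsonEscape(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Builds {"backend_config":{"need_backend":...,"loop":...,"dump_path":"..."}}.
// buildExtraConfig is a hypothetical helper name, not part of the SDK.
std::string buildExtraConfig(bool need_backend, bool loop, const std::string& dump_path) {
    assert(!dump_path.empty());
    return std::string("{\"backend_config\":{")
        + "\"need_backend\":" + (need_backend ? "true" : "false") + ","
        + "\"loop\":" + (loop ? "true" : "false") + ","
        + "\"dump_path\":\"" + jsonEscape(dump_path) + "\"}}";
}
```

The result can be assigned directly to `setting_param.extra_config` before calling `init`.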

C++ sample code

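#include <cassert>
#include <chrono>
#include <string>
#include <thread>
#include <vector>

#include "sami_core_karaoke_edit_graph.h"
#include "sami_core.h"

// Placeholder effect resources; replace with the app's own preset directory and effect files.
static const std::string preset_dir = "/path/to/presets/";
static const std::vector<std::string> effect_path = {"effect_a", "effect_b"};
static size_t effect_index = 0;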

auto message_callback = [](KaraokeMessageId id, void* info) {
    // app report log 
};

int main(int argc, char* argv[]) {
    // you must authenticate with your token first

    // params
    SAMI::KaraokeEditSettingParam setting_param;
    setting_param.vocal_path = "/path/to/vocal.wav";
    setting_param.bgm_path = "/path/to/bgm.wav";
    setting_param.record_result_path = "/path/to/record_info.json";
    setting_param.sample_rate = 44100;  // player samplerate
    setting_param.max_block_samples = 1024; // player callback buffersize
    setting_param.message_callback = message_callback;
   
    SAMI::KaraokeEditGraph graph;
    int ret = graph.init(setting_param);
    if(ret != 0) {
        return -1;
    }

    // turn on denoise if needed. Succeed if ret == 0
    ret = graph.setDenoiseModelPath("/path/to/denoise.model");
   
    // use or change effect 
    ret = graph.updateEffectFilePath(preset_dir + effect_path.front());
    assert(ret == 0);

    // turn on bgm Loudnorm if needed and the bgm loudness is known. Succeed if ret == 0
    SAMICoreLoudnormProperty bgm_loudnorm{-24, -8.09, -16.0};
    ret = graph.setBGMLoudnormInfo(bgm_loudnorm);

    graph.prepare();
    graph.play();
    
    float totalDuration = graph.getTotalDurationMs();
    
    // mock monitor get data with writing to file
    std::thread playThread = std::thread([&]() {
        float* data[2];
        data[0] = new float [setting_param.max_block_samples];
        data[1] = new float [setting_param.max_block_samples];
        while(graph.getCurrentPositionMs() < totalDuration) {
            int frames = graph.pullAudioData(data, setting_param.max_block_samples);
            // play
        }
        delete [] data[0];
        delete [] data[1];
    });

    // mock user interactive thread
    std::thread userInteractiveThread = std::thread([&]() {
        while (graph.getCurrentPositionMs() < totalDuration) {
            std::this_thread::sleep_for(std::chrono::seconds(5));
            // update vocal volume
            graph.updateVocalVolume(+10);
            
            // update bgm volume
            graph.updateBGMVolume(+10);
            
            // change effect
            graph.updateEffectFilePath(preset_dir + effect_path[(effect_index++) % effect_path.size()]);
            
            // update bgm and vocal offset
            graph.setVocalOffsetMs(-200);
        }
    });
    
    playThread.join();
    userInteractiveThread.join();
    
    graph.stop();
    
    return 0;
}

OC sample code

#import "SAMICoreKaraokeEdit.h"

{
    // 1. create edit graph
    SAMICore_KaraokeEditSettingParam* param = [[SAMICore_KaraokeEditSettingParam alloc] init];

    std::string bgm_path = "/path/to/accompany.wav";
    std::string vocal_path = "/path/to/vocal.wav";
    std::string record_result_path = "/path/to/record_info.json"; // file the recording page wrote its analysis results to

    param.bgm_path = bgm_path.c_str();
    param.vocal_path = vocal_path.c_str();
    param.record_result_path = record_result_path.c_str();
    param.sample_rate = 44100;
    param.max_block_samples = 1024; // player callback buffersize
    NSString *documentFilePath = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory,NSUserDomainMask,YES).firstObject;
    NSString *extra_config = [NSString stringWithFormat:@"{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"%@\" } }", documentFilePath];
    param.extra_config = [extra_config UTF8String];
    param.message_callback = ^(SAMICore_KaraokeMessageId id, NSDictionary* info) {
        // app report log 
    };

    SAMICoreKaraokeEdit* graph = [SAMICoreKaraokeEdit alloc]; 
    int ret = [graph initEditGraphWithSettingParam:param];
    if(ret != 0) {
        return ret;
    }
    
    // total duration ms
    float durationMS = [graph getTotalDurationMs];

    // 2. use denoise if needed. Succeed if ret == 0
    std::string denoise_model_path = "/path/to/denoise.model";
    ret = [graph setDenoiseModelPath: denoise_model_path.c_str()];

    // 3. use effect. Succeed if ret == 0
    std::string effect_path = "/path/to/effect";
    ret = [graph updateEffectFilePath:effect_path.c_str()];

    // 4. turn on bgm Loudnorm if needed and the bgm loudness is known. Succeed if ret == 0
    SAMICore_LoudnormProperty *property = [[SAMICore_LoudnormProperty alloc] init];
    property.source_lufs = -24;
    property.source_peak = -8;
    property.target_lufs = -16;
    ret = [graph setBGMLoudnormInfo:property];

    // begin run the graph
    [graph prepare];
    [graph play];

    // mock user interactive thread
    std::thread userInteractiveThread = std::thread([&]() {
        while(playing_) {
            std::this_thread::sleep_for(std::chrono::seconds(5));
            [graph updateVocalVolume: +10];  // adjust vocal volume
            [graph updateBGMVolume: +10];    // adjust accompaniment volume
            [graph setUseDenoise:false];   // turn off denoising
            [graph setVocalOffsetMs:200];  // set the vocal/accompaniment offset from the UI
        }
    });

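    // end the worker loop and wait for it
    playing_ = false;
    userInteractiveThread.join();
    // if the app runs a play thread that calls pullAudioData (as in the C++ sample), join it here too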
    
    // stop
    [graph stop];
}

Java sample code

import com.mammon.audiosdk.SAMICoreKaraokeEdit;

public class SAMIKaraokeEditDemo {
    private final SAMICoreKaraokeEdit editGraphObj = new SAMICoreKaraokeEdit();
    private Thread UIInteractionThread = null;
    
    public void prepare() {
        String vocal_path = "/path/to/vocal.wav";
        String bgm_path = "/path/to/accompany.wav";
        String record_result_path = "/path/to/record_info.json";
        String denoise_model_path = "/path/to/denoise.model";
        String effect_path = "/path/to/minions.dat";
    
        SAMICoreKaraokeEdit.KaraokeEditSettingParam setting = new SAMICoreKaraokeEdit.KaraokeEditSettingParam();
        setting.vocal_path = vocal_path;
        setting.bgm_path = bgm_path;
        setting.record_result_path = record_result_path;
        setting.sample_rate = samplerateInConfig; // placeholder: the player's sample rate, e.g. 44100
        int bufferSize = 1024;
        setting.max_block_samples = bufferSize; // player callback buffersize
        setting.extra_config = "{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"/sdcard/Download\" } }";
        setting.message_callback = new SAMICoreKaraokeMessageCallBack() {
            @Override
            public void MessageTracker(SAMICoreKaraokeMessageId id, SAMICoreKaraokeInfo info) {
                // app report log 
            }
        };
    
        Log.i(TAG, "karaoke edit param: " + setting.toString());
    
        // init graph
        int ret = editGraphObj.init(setting);
        if (ret != 0) {
            Log.e(TAG, "startTest: edit graph init failed");
            return;
        }
    
        ret = editGraphObj.setDenoiseModelPath(denoise_model_path);
        if (ret == 0) {
            Log.i(TAG, "enable_denoise and init succeed\n");
        } else {
            Log.e(TAG, "enable_denoise but init failed\n");
        }
    
        ret = editGraphObj.updateEffectFilePath(effect_path);
        if (ret == 0) {
            Log.i(TAG, "use effect and init succeed\n");
        } else {
            Log.e(TAG, "use effect but init failed\n");
        }
    
        ret = editGraphObj.prepare();
        if (ret == 0) {
            Log.i(TAG, "graph prepare succeed\n");
        } else {
            Log.e(TAG, "graph prepare failed\n");
        }
    }
    
    public void play() {
        editGraphObj.play();
        // ui interaction thread, mock function
        UIInteractionThread = new Thread() {
            @Override
            public void run() {
                // update vocal volume
                editGraphObj.updateVocalVolume(+10);

                // update bgm volume
                editGraphObj.updateBGMVolume(+10);

                // change effect (new_effect_path is a placeholder for another effect file)
                editGraphObj.updateEffectFilePath(new_effect_path);

                // update bgm and vocal offset
                editGraphObj.setVocalOffsetMs(-200);
            }
        };

        UIInteractionThread.start();
    }
    
    public void stop() {
        try {
            UIInteractionThread.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        };
        
        editGraphObj.pause();
        editGraphObj.exportAudioDataToAudioFile(file_path, 0, -1, new SAMICoreKaraokeEdit.SAMICoreKaraokeProgressCallback() {
            @Override
            public void progressCallback(float current_time) {
                Log.i(TAG, "current Time is " + current_time + "ms");
            }
        }); 
        
        editGraphObj.stop();
    }
}

Utility Functions

midi parsing

C++ version

#include "karaoke_utils.h"
// static method to parse midi file. Return a state value and midi content
std::tuple<int, MidiFileContent> KaraokeUtils::parseMidiFile(std::string midiFilePath);


//===== information of MidiFileContent ===========
struct MidiPitchInfo {
    int startMs{-1};
    int durationMs{-1};
    int pitch{0};
};
using MidiFileContent = std::vector<MidiPitchInfo>;
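
As an illustration of how the parsed notes could drive a pitch-line UI, the sketch below looks up the reference note at a given playback time and converts it to a frequency. It redeclares the structs above so that it stands alone, and it assumes `pitch` is a standard MIDI note number and that `startMs` and `durationMs` are absolute milliseconds in the song; neither assumption is confirmed by this document. `referencePitchAt` and `midiNoteToHz` are hypothetical helpers, not SDK functions.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct MidiPitchInfo {
    int startMs{-1};
    int durationMs{-1};
    int pitch{0};
};
using MidiFileContent = std::vector<MidiPitchInfo>;

// Returns the pitch of the note sounding at time_ms, or 0 when no note is active.
int referencePitchAt(const MidiFileContent& notes, int time_ms) {
    for (const auto& n : notes) {
        if (time_ms >= n.startMs && time_ms < n.startMs + n.durationMs) return n.pitch;
    }
    return 0;
}

// Standard MIDI note to frequency conversion (note 69 = A4 = 440 Hz).
double midiNoteToHz(int note) {
    assert(note >= 0 && note <= 127);
    return 440.0 * std::pow(2.0, (note - 69) / 12.0);
}
```

A linear scan is enough for a sketch; for long songs a binary search over notes sorted by `startMs` would avoid rescanning on every UI frame.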

Java usage example

import com.mammon.audiosdk.MammonIo;

MammonIo io = new MammonIo();
MammonMidiNote[] notes = io.readMidiNotesFromFile(midi_path, -1, true);

krc parsing

#include "karaoke_utils.h"

// static method to parse krc file. Return a state value and krc's content
std::tuple<int, KrcFileContent> parseLyricFile(const std::string& krcFilePath);


//===== information of KrcFileContent ===========

// one word info in krc files
struct KrcWordInfo {
    int startOffsetMs{-1};
    int durationMs{-1};
    std::string word;
};

// one sentence content which contains several words
struct KrcLineContent {
    int lineStartMs{-1};
    int lineDuration{-1};
    std::string lyricStr;
    std::vector<KrcWordInfo> lineWordsInfo;
};

using KrcFileContent = std::vector<KrcLineContent>;
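
A lyric view typically needs to know which line is being sung at the current playback position. The sketch below redeclares the structs above so that it stands alone and assumes `lineStartMs` is an absolute time in the song and `lineDuration` is in milliseconds; this document does not state those units explicitly. `findLineIndexAt` is a hypothetical helper, not an SDK function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct KrcWordInfo {
    int startOffsetMs{-1};
    int durationMs{-1};
    std::string word;
};

struct KrcLineContent {
    int lineStartMs{-1};
    int lineDuration{-1};
    std::string lyricStr;
    std::vector<KrcWordInfo> lineWordsInfo;
};

using KrcFileContent = std::vector<KrcLineContent>;

// Returns the index of the line covering time_ms, or -1 between lines or past the end.
int findLineIndexAt(const KrcFileContent& lines, int time_ms) {
    assert(time_ms >= 0);  // playback positions are non-negative
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const KrcLineContent& line = lines[i];
        if (time_ms >= line.lineStartMs && time_ms < line.lineStartMs + line.lineDuration) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

The returned index could then be compared against `sentenceIndex` from the real-time scoring info, if the two turn out to use the same line numbering.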

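Notes:

Volume adjustment

Both the recording page and the editing page support updating the vocal and accompaniment volume. The details are as follows.
The parameter is a gain in dB; 0 dB leaves the signal unchanged. The supported range is [-70, +35], and values outside it are clamped to the nearest bound.
The user-adjustable range does not need to be as wide as [-70, +35]; something like [-30, +6], or a slight variation of it, is enough. With too much gain the overall level becomes very high, and the limiter then compresses it to avoid clipping, which makes the noise floor and similar sounds more audible.
Volume slider: map the slider linearly onto the adjustable range so that adjustment feels smooth. Note that at slider position 0, -70 (mute) must be passed.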
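
The slider guidance above can be sketched as a small mapping function. It assumes a slider normalized to [0, 1] and uses the suggested [-30, +6] range as defaults; `sliderToGainDb` and the constant names are hypothetical, not part of the SDK.

```cpp
#include <algorithm>
#include <cassert>

constexpr float kMuteDb = -70.0f;    // documented mute value
constexpr float kSdkMaxDb = 35.0f;   // documented upper bound

// Maps a slider position in [0, 1] linearly onto [min_db, max_db].
// Position 0 is special-cased to the mute value, as the note above requires.
float sliderToGainDb(float slider, float min_db = -30.0f, float max_db = 6.0f) {
    assert(min_db < max_db);
    slider = std::min(1.0f, std::max(0.0f, slider));
    if (slider == 0.0f) return kMuteDb;
    const float db = min_db + slider * (max_db - min_db);
    return std::min(kSdkMaxDb, std::max(kMuteDb, db));
}
```

The result can be passed to `updateVocalVolume`, `updateBGMVolume`, or the monitor volume calls. Note that this mapping jumps from -70 dB at position 0 to just above -30 dB at the first step, a trade-off that follows directly from combining a linear range with a mute position.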

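JSON file saved by the recording page

After the recording page stops, calling record_graph.writeRecordInfoToFile(std::string) writes part of the recording page's state and results to a JSON file for the editing page to use. The set of fields is extensible; the results currently saved are: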

{
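    "bgm_latency_ms":"23.219955", // BGM latency on the recording page
    "bgm_pitch_shift_semitone_normalised":"0.000000", // normalized value of the BGM pitch shift applied on the recording page
    "startTimeMs": 0,       // audio start time; may be offset relative to the BGM, e.g. recording starts at the 10th second
    "endTimeMs": 60000,     // audio end time
    "enableAEC": 1,         // whether echo cancellation was created successfully: 1 on success, otherwise 0
    "enableTimeAlign": 1,   // whether latency detection was created successfully: 1 on success, otherwise 0
    "enableLoudnorm": 1,    // whether loudness detection was created successfully: 1 on success, otherwise 0
    "enableSingScore": 1,   // whether pitch scoring was created successfully: 1 on success, otherwise 0
    "loudnessResult":{
        "status": 0,        // whether loudness detection finished normally: 0 if normal, otherwise an error code
        "peak": 1.0,        // loudness detection result: peak amplitude of the audio
        "global_loudness": -20 // loudness detection result: global LUFS
    },
    "scoreResult": {
        "status": 0,        // scoring status code; 0 is normal
        "note_score": 45.0, // total pitch score
        "emotion_score": 0, // total emotion score, currently 0
        "rhythm_score":0    // total rhythm score, currently 0
    },
    "timeAlignResult": {
        "status": 0,        // latency detection status code; 0 is normal
        "delay_ms": 200     // latency detection result: offset between vocal and accompaniment, in ms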
    }
}

Error Codes

See sami_core_error_code.h for details.

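Error code name	Error code	Meaning
SAMI_KARAOKE_IMPL_ERROR	200001	The internal class object is null; usually caused by an authentication failure
SAMI_KARAOKE_GRAPH_STATE_ERROR	200002	Internal state error; usually caused by an incorrect call order
SAMI_KARAOKE_PARAM_ERROR	200003	Invalid parameter; check that the parameters are correct
SAMI_KARAOKE_CONTEXT_ERROR	200004	The internal context object is null; usually caused by a missing or failed prepare
SAMI_KARAOKE_SET_VOLUME_FAILED	200005	Setting the volume failed; check that the volume parameter is correct
SAMI_KARAOKE_RECORD_PRE_PROCESS_ERROR	200006	Recording page internal processor error; usually caused by a missing or failed prepare
SAMI_KARAOKE_RECORD_CREATE_AUDIO_FILE_FAILED	200007	The recording page failed to create the dry vocal file; check the file path and permissions
SAMI_KARAOKE_RECORD_AUDIO_FILE_ERROR	200008	The recording page dry vocal file is null; check that the dry vocal file was initialized
SAMI_KARAOKE_RECORD_GET_TIME_ALIGN_RESULT_FAILED	200009	The recording page failed to get the latency detection result; check that latency detection was initialized successfully
SAMI_KARAOKE_RECORD_GET_SING_SCORE_RESULT_FAILED	200010	The recording page failed to get the scoring result; check that scoring was initialized successfully
SAMI_KARAOKE_RECORD_GET_LOUDNESS_RESULT_FAILED	200011	The recording page failed to get the loudness detection result; check that loudness detection was initialized successfully
SAMI_KARAOKE_RECORD_SWITCH_AUDIO_MODE_FAILED	200012	The recording page failed to switch the BGM mode; check that the mode passed in is correct
SAMI_KARAOKE_EDIT_AUDIO_FILE_FORMAT_NOT_SUPPORT	200013	The export file format passed to the editing page is not supported; check that it is one the SDK supports
SAMI_KARAOKE_EDIT_PARSE_JSON_ERROR	200014	The editing page failed to parse the JSON file; check that the JSON file passed in is correct
SAMI_KARAOKE_EDIT_VOCAL_FILE_INVAILID	200015	The dry vocal file passed to the editing page is invalid; check that the file is correct
SAMI_KARAOKE_EDIT_DENOISE_PROCESSER_ERROR	200016	The editing page denoise processor is null; check that denoising was initialized successfully
SAMI_KARAOKE_EDIT_LOUDNORM_PROCESSER_ERROR	200017	The editing page loudness normalization processor is null; check that loudness normalization was initialized successfully
SAMI_KARAOKE_EDIT_EFFECT_PROCESSER_ERROR	200018	The editing page effect processor is null; check that the effect was initialized successfully
SAMI_KARAOKE_EDIT_BYTETUNER_PROCESSER_ERROR	200019	The editing page pitch correction processor is null; check that pitch correction was initialized successfully
SAMI_KARAOKE_EDIT_SET_OFFSET_FAILED	200020	The editing page failed to set the dry vocal offset; check that the offset passed in is correct
SAMI_KARAOKE_SET_PITCH_SHIFT_FAILED	200021	Setting the BGM pitch failed; check that the semitone value passed in is correct
SAMI_KARAOKE_PARSE_EXTRA_CONFIG_ERROR	200022	Parsing extra_config failed; check that the extra_config string is standard JSON
SAMI_KARAOKE_INIT_BACKEND_FAILED	200023	Initializing the recording/playback backend failed; check that the extra_config passed in is correct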
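
For logging, an app may want to turn these numeric codes into readable hints. The sketch below covers only a handful of the codes from the table, using the numeric values listed above rather than the constants from sami_core_error_code.h, whose exact declarations are not shown in this document. `describeKaraokeError` is a hypothetical helper, not an SDK function.

```cpp
#include <cassert>
#include <string>

// Maps a few documented karaoke error codes to short troubleshooting hints.
// Codes not listed here fall through to a generic message.
std::string describeKaraokeError(int code) {
    switch (code) {
        case 0:      return "success";
        case 200001: return "internal object is null, often an authentication failure";
        case 200002: return "graph state error, check the call order";
        case 200003: return "invalid parameter";
        case 200004: return "context is null, prepare was not called or failed";
        case 200007: return "failed to create the dry vocal file, check path and permissions";
        case 200022: return "extra_config is not valid JSON";
        case 200023: return "recording/playback backend failed to initialize, check extra_config";
        default:     return "unknown error code " + std::to_string(code);
    }
}
```

A typical use would be logging `describeKaraokeError(ret)` next to the raw value whenever an SDK call returns a non-zero `ret`.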
