
[C/OC/Java] Intelligent Audio Karaoke Solution

Last updated: 2023.11.13 14:43:34

First published: 2023.04.12 11:23:01

Karaoke Experience SDK Integration Guide

Recording page


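API reference

C++: the header file is sami_core_karaoke_record_graph.h; the class to call is SAMI::KaraokeRecordGraph.
OC: the header file is SAMICoreKaraokeRecord.h; API names and behavior match the C++ ones.
Java: the source file is SAMICoreKaraokeRecord.java; API names and behavior match the C++ ones. Some parameters and return values differ and are marked in this document. Differences in basic types, such as bool (boolean in Java) and std::string (String in Java), are not marked.

Feature
API name
Parameters
Return value and additional notes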

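Initialization

C++/Java: init
OC: initRecordingGraphWithSettingParam

C++:

struct KaraokeRecordSettingParam {
    std::string accompany_path; // path of the accompaniment file
    std::string original_path; // path of the original-vocal file
    int sample_rate; // sample rate for recording/playback: 44100/48000/16000
    int max_block_samples; // maximum number of frames the player requests at a time; must not exceed 65536
    std::string extra_config; // extra settings, e.g. enabling recording/playback
    KaraokeMessageCallback message_callback; // event-tracking callback; delivers internal tracking information
};
typedef std::function<void(KaraokeMessageId id, void* info)> KaraokeMessageCallback;

OC: SAMICore_KaraokeRecordSettingParam
Java: SAMICoreKaraokeRecord.KaraokeRecordParamSetting

0: created successfully; otherwise the failure is logged and an error code is returned.
Notes:

  1. By default the SDK outputs two-channel data to the ear monitor.

  2. max_block_samples must not exceed 65536; otherwise an error is reported.

  3. The only accompaniment and original-vocal file formats currently supported by the SDK are wav and mp3.

  4. The only sample rates currently supported are 44100, 48000 and 16000.

  5. extra_config must be passed as a JSON string, e.g. "{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"xxx/xxx/\" }}";

  6. The SDK's recording/playback is supported on Android and iOS only.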
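The constraints listed in the notes above can be checked before calling init, so that a bad configuration is caught with a readable message rather than an error code. The following is a minimal sketch, not part of the SDK: RecordSettingSketch is a hypothetical stand-in for the checked fields of KaraokeRecordSettingParam, the helper names are made up for illustration, and the lower bound on max_block_samples and the lower-case extension match are assumptions.

```cpp
#include <string>

// Hypothetical mirror of the fields checked below; the real struct is
// KaraokeRecordSettingParam from sami_core_karaoke_record_graph.h.
struct RecordSettingSketch {
    std::string accompany_path;
    std::string original_path;
    int sample_rate = 0;
    int max_block_samples = 0;
};

// Returns true when `path` ends with ".wav" or ".mp3" (lower-case only in this sketch).
inline bool hasSupportedExtension(const std::string& path) {
    auto endsWith = [&](const std::string& suffix) {
        return path.size() >= suffix.size() &&
               path.compare(path.size() - suffix.size(), suffix.size(), suffix) == 0;
    };
    return endsWith(".wav") || endsWith(".mp3");
}

// Returns an empty string when the documented constraints hold, otherwise a reason.
inline std::string checkRecordSetting(const RecordSettingSketch& s) {
    if (s.sample_rate != 16000 && s.sample_rate != 44100 && s.sample_rate != 48000)
        return "sample_rate must be 16000, 44100 or 48000";
    // The upper bound is documented; rejecting values <= 0 is an assumption.
    if (s.max_block_samples <= 0 || s.max_block_samples > 65536)
        return "max_block_samples must be in (0, 65536]";
    if (!hasSupportedExtension(s.accompany_path))
        return "accompany_path must be a wav or mp3 file";
    if (!hasSupportedExtension(s.original_path))
        return "original_path must be a wav or mp3 file";
    return "";
}
```

An app would typically run such a check on its own settings object and only then fill in the SDK struct and call init.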

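Pass in recorded audio data

pushMicAudioData

float** in_data: recorded audio data; two-channel data stored non-interleaved must be split into one buffer per channel
int num_channels: number of channels in the recorded data
int num_samples: number of samples per channel in the data passed in
bool interleaved: whether the data is interleaved; must be set accurately for two-channel data

0: success; otherwise a specific error code is returned.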
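Capture devices often deliver two-channel audio as interleaved frames. If the app prefers to hand the SDK one buffer per channel (interleaved = false), the frames have to be split first. The sketch below shows one way to do that; deinterleave and channelPointers are hypothetical helper names, not SDK functions.

```cpp
#include <cstddef>
#include <vector>

// Splits interleaved samples (L0 R0 L1 R1 ...) into one buffer per channel.
inline std::vector<std::vector<float>> deinterleave(const float* interleaved,
                                                    int num_channels,
                                                    int num_samples) {
    std::vector<std::vector<float>> planar(num_channels,
                                           std::vector<float>(num_samples));
    for (int i = 0; i < num_samples; ++i) {
        for (int ch = 0; ch < num_channels; ++ch) {
            planar[ch][i] = interleaved[static_cast<std::size_t>(i) * num_channels + ch];
        }
    }
    return planar;
}

// Builds the float** style view over the per-channel buffers. The pointers
// stay valid only while `planar` is alive and not resized.
inline std::vector<float*> channelPointers(std::vector<std::vector<float>>& planar) {
    std::vector<float*> ptrs;
    for (auto& channel : planar) ptrs.push_back(channel.data());
    return ptrs;
}
```

The result of channelPointers(planar).data() has the float** shape that a planar API expects, with num_samples counted per channel.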

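Enable acoustic echo cancellation (AEC)

setAECParam

std::string: path of the AEC model

0: created successfully; otherwise a specific error code is returned.
Use aec44k_v2.2_modify_time_1s.

Enable latency detection

setTimeAlignParam

std::string: path of the timeAlign model

0: created successfully; otherwise a specific error code is returned.
Use time_align_44k_v1.0.model.

Dry-vocal file output path

setOutVocalFileParam

std::string: path of the file to write

0: encoder created successfully; otherwise a specific error code is returned.
Note: if the dry vocal is not saved successfully, the editing-page features are unavailable.

Enable karaoke scoring

setSingScoreParam

int score_mode: scoring type; currently only 1 (pitch scoring) is supported
std::string lyric_path: path of the KRC lyrics file
std::string midi_path: path of the MIDI file

0: created successfully; otherwise a specific error code is returned.
KRC is a widely used lyrics file format.

Enable vocal loudness detection

openVocalLoudnessExtractor

0: created successfully; otherwise a specific error code is returned.

Prepare the internal environment

prepare

0: success; otherwise a specific error code is returned. prepare only needs to be called once.

Start

play

0: success; otherwise a specific error code is returned. Can be called after prepare or after pause.

Pause

pause

0: success; otherwise a specific error code is returned. After pausing, pullAudioData returns only silence.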

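Seek

seek

float seek_to_ms: absolute position in the accompaniment file to seek to, in milliseconds
float count_down_ms: duration of the countdown, in milliseconds

0: success; otherwise a specific error code is returned. Supported since 2023-04-17.
seek_to_ms is the accompaniment time at which the countdown ends; count_down_ms is how long the countdown lasts. During the countdown the accompaniment plays normally, but no recorded data is written to the dry-vocal file and no scoring is computed.
Parameter validity check: seek_to_ms >= 0 && seek_to_ms <= total accompaniment duration && seek_to_ms >= count_down_ms
Note: to keep the latency of the dry-vocal file unchanged, it is recommended not to stop recording/playback when calling the pause and seek APIs.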
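The validity rule for seek can be expressed as a small predicate that the UI evaluates before issuing the call. This is a sketch with hypothetical helper names. countdownStartMs rests on an inference from the description above (the countdown ends at seek_to_ms while the accompaniment plays), not on a documented SDK value.

```cpp
// Mirrors the documented check:
// seek_to_ms >= 0 && seek_to_ms <= total duration && seek_to_ms >= count_down_ms
inline bool isSeekRequestValid(float seek_to_ms, float count_down_ms,
                               float total_duration_ms) {
    return seek_to_ms >= 0.0f &&
           seek_to_ms <= total_duration_ms &&
           seek_to_ms >= count_down_ms;
}

// Accompaniment position at which the countdown would begin, assuming the
// countdown plays the accompaniment leading up to seek_to_ms.
inline float countdownStartMs(float seek_to_ms, float count_down_ms) {
    return seek_to_ms - count_down_ms;
}
```

For example, seeking to 30000 ms with a 3000 ms countdown passes the check for a 180000 ms song, while seeking to 2000 ms with the same countdown does not, because the countdown would have to start before the beginning of the accompaniment.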

Stop

stop

0: success; otherwise a specific error code is returned.

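Pull playback data in the playback thread

pullAudioData

float**: output buffers; two-channel data is stored non-interleaved
int num_samples: length of the data to pull (number of samples per channel)

Returns the number of samples per channel actually obtained, or -1 on error. Note: num_samples must not exceed max_block_samples; otherwise -1 is returned immediately.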
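Because a pull larger than max_block_samples fails with -1, a player whose callback may ask for more samples than that needs to split the request into several pulls. The sketch below computes the chunk sizes; splitIntoBlocks is a hypothetical helper, not an SDK function.

```cpp
#include <algorithm>
#include <vector>

// Splits a request of `total_samples` (per channel) into chunk sizes that
// never exceed `max_block_samples`. Returns an empty list for invalid input.
inline std::vector<int> splitIntoBlocks(int total_samples, int max_block_samples) {
    std::vector<int> blocks;
    if (max_block_samples <= 0) return blocks;
    int remaining = total_samples;
    while (remaining > 0) {
        const int n = std::min(remaining, max_block_samples);
        blocks.push_back(n);
        remaining -= n;
    }
    return blocks;
}
```

Each returned size can then be used as num_samples for one pull, advancing the output pointers by the number of samples the previous pull returned.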

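Get total duration

getTotalDurationMs

Duration of the accompaniment file, in milliseconds. (The original-vocal and accompaniment files normally have the same duration; the shorter of the two is used here.)

Get current position

getCurrentPositionMs

Current recording position, in milliseconds.

Update the vocal volume in the ear monitor

updateMonitorVocalVolume

float: adjustment in dB, [-70, +35]; -70 means mute

Gain applied to the vocal in the ear monitor. The default is 0, which leaves the volume unchanged.

Update the accompaniment volume in the ear monitor

updateMonitorBGMVolume

float: adjustment in dB, [-70, +35]

Gain applied to the accompaniment in the ear monitor.

Switch between original vocal and accompaniment

switchBGMMode

enum KaraokeBGMMode{Accompany, Original}

0: success; otherwise a specific error code is returned.

Adjust the BGM pitch

updateBGMPitch

int: number of semitones to raise or lower, [-12, +12]

0: success; otherwise a specific error code is returned.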

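Get real-time scoring data

getRealTimeScoreInfo

C++/Java: SAMICoreMulDimSingScoringRealtimeInfo
OC: SAMICore_MulDimSingScoringRealtimeInfo
The struct fields are described in the notes below. The SDK fills in the fields.

0: success; otherwise an error code is returned and the result is invalid. SAMICoreMulDimSingScoringRealtimeInfo is used to display pitch scoring in the UI. Its fields are:

double timeMilliseconds; timestamp of the current result in the scoring module
double songScore;  total score of the sentences sung so far
int sentenceCount; number of sentences sung so far

int sentenceIndex;  line number of the last completed lyric sentence
double sentenceScore;  score of the last completed sentence

double userPitch;  note value actually sung by the user; valid when > 0
double refPitch;   reference pitch in the MIDI at the current time; valid when > 0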
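When drawing the scoring UI, the validity rule (> 0) decides whether a pitch marker should be shown at all. The sketch below uses RealtimeInfoSketch, a hypothetical mirror of the documented fields, so that it compiles without the SDK headers. averageSentenceScore additionally assumes that songScore is the sum of the per-sentence scores, which the field description suggests but does not state outright.

```cpp
// Hypothetical mirror of the documented fields of
// SAMICoreMulDimSingScoringRealtimeInfo; use the SDK type in real code.
struct RealtimeInfoSketch {
    double timeMilliseconds = 0;
    double songScore = 0;
    int sentenceCount = 0;
    int sentenceIndex = 0;
    double sentenceScore = 0;
    double userPitch = 0;
    double refPitch = 0;
};

// A pitch value is only meaningful when it is greater than zero.
inline bool hasUserPitch(const RealtimeInfoSketch& info) { return info.userPitch > 0; }
inline bool hasRefPitch(const RealtimeInfoSketch& info) { return info.refPitch > 0; }

// Average score per completed sentence; 0 while no sentence has finished.
// Assumes songScore accumulates the per-sentence scores.
inline double averageSentenceScore(const RealtimeInfoSketch& info) {
    return info.sentenceCount > 0 ? info.songScore / info.sentenceCount : 0.0;
}
```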

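Get overall score information

getOverallScoreInfo

C++/Java: SAMICoreMulDimSingScoringOverallInfo
OC: SAMICore_MulDimSingScoringOverallInfo
The struct fields are described in the notes below. The SDK fills in the fields.

0: success; otherwise the result is invalid. The result struct contains note_score, the pitch-accuracy score.

Get overall loudness information

getLoudnessOverallFeatures

float& global_lufs // overall loudness of the dry vocal
float& global_peak; // overall peak value of the dry vocal
The SDK assigns values to these two parameters.

0: computed successfully; otherwise an error code is returned. global_lufs and global_peak can be used for loudness normalization in the editing scenario. The default values are (0, 0).

Get the latency detection result

getTimeAlignResultMs

float delay_ms: latency value

0: computed successfully; otherwise an error code is returned.
delay_ms: offset of the mic signal relative to the reference signal; a positive value means the mic signal is delayed.

Write analysis results

writeRecordInfoToFile

The argument is the path of the file to write the results to.

This function writes the latency detection, loudness detection and other results to a file. The file must be passed to the editing-page graph's init so that the SDK can read the values.

Release resources

Java only: release

Releases the native-layer resources. Make sure stop has been called before releasing. After release, no graph methods may be called.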

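C++ sample code

#include "sami_core_karaoke_record_graph.h"

#include <unistd.h>  // usleep

#include <atomic>
#include <cstdio>
#include <thread>

// Loop flags for the mock threads below; the app sets them to false to end the loops.
std::atomic<bool> recording_{true};
std::atomic<bool> playing_{true};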

auto message_callback = [](KaraokeMessageId id, void* info) {
    // app report log 
};

int main(int argc, char* argv[]) {
    // init params
    SAMI::KaraokeRecordSettingParam setting_param;
    setting_param.accompany_path = "/path/to/accompany.wav";
    setting_param.original_path = "/path/to/original-sing.wav";
    setting_param.sample_rate = 44100;  // should be player samplerate
    setting_param.max_block_samples = 4096; // player callback buffersize
    setting_param.message_callback = message_callback;
 
    SAMI::KaraokeRecordGraph graph;
    int ret = graph.init(setting_param);
    if(ret != 0) {
        return -1;
    }
   
    // set record callback. Must 
    // graph.setMicSourceCallback(micCallback);
   
    // set recorded vocal file path. Must
    graph.setOutVocalFileParam("/path/to/vocal.wav");
   
    // turn on aec if needed. Succeed if ret == 0
    ret = graph.setAECParam("/path/to/aec.model");

    // turn on time align of music and vocal if needed. Succeed if ret == 0
    ret = graph.setTimeAlignParam("/path/to/time_align.model");
   
    // turn on vocal loudness detection if needed. Succeed if ret == 0
    ret = graph.openVocalLoudnessExtractor();
   
    // turn on pitch SingScore. Succeed if ret == 0
    graph.setSingScoreParam(1,
                         "/path/to/lyric.krc",
                         "/path/to/song.mid");
   
    // prepare, should be called once
    graph.prepare();
   
    // start the graph
    graph.play();
   
    // push mic data to sdk
    std::thread recordThread = std::thread([&](){
        float** in_data = nullptr; // record data, provided by the device layer
        int record_channel = 1; // maybe 2
        bool interleaved = false; // maybe true
        int frame = 0;
        while (recording_){
            // copy data from device
            get_buffer_from_devices(in_data, &record_channel, &interleaved, &frame); // should be implemented by the app

            graph.pushMicAudioData(in_data, record_channel, frame, interleaved);
        }
    });

    // mock play thread
    std::thread playThread = std::thread([&](){
        float* data[2]; // two channels, non-interleaved
        data[0] = new float[setting_param.max_block_samples];
        data[1] = new float[setting_param.max_block_samples];
        while (playing_) {
            int frames = graph.pullAudioData(data, setting_param.max_block_samples);
            // play
        }
        delete [] data[0];
        delete [] data[1];
    });
   
   // mock UI thread: get realTimeScore result and show the result
   std::thread scoreUIThread = std::thread([&](){
      while (playing_) {
          // sleep for the duration of 512 samples; usleep takes microseconds
          usleep(static_cast<useconds_t>(512.0 / setting_param.sample_rate * 1000000.0));
          SAMICoreMulDimSingScoringRealtimeInfo info;
          int ret = graph.getRealTimeScoreInfo(info);
          // show the result
      }
   });
   
   // mock UI Interaction thread: update monitor vocal volume、update bgm mode etc.
   std::thread UIInteractionThread = std::thread([&](){
       while(playing_) {
           // update monitor vocal volume
           graph.updateMonitorVocalVolume(10);
           
           // update monitor bgm volume
           graph.updateMonitorBGMVolume(10);
           
           // switch bgm mode: Accompany or Original
           graph.switchBGMMode(KaraokeBGMMode::Original);
           
           // update bgm pitch
           graph.updateBGMPitch(4);
       }
   });
   
   {
       // pause the graph. After paused, the pullAudioData will get all zeros
       graph.pause();
       
       // resume again
       graph.play();
    }
   
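    // when the session ends, the app clears the loop flags so that the mock threads exit
    recordThread.join();
    playThread.join();
    scoreUIThread.join();
    UIInteractionThread.join();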
   
    // stop
    graph.stop();
   
    float loudness = 0;
    float peak = 0;
    ret = graph.getLoudnessOverallFeatures(loudness, peak);
    printf("vocal loudness: status = %d, lufs = %f, peak = %f\n", ret, loudness, peak);

    float delay_ms = 0;
    ret = graph.getTimeAlignResultMs(delay_ms);
    printf("time align result: %d, :%f\n", ret, delay_ms);

    SAMICoreMulDimSingScoringOverallInfo info;
    ret = graph.getOverallScoreInfo(info);
    printf("overall_score_info result: status = %d, note_score = %f \n", ret, info.note_score);
   
    // after stopped, write some result to json file, which will be used in EditPage 
    graph.writeRecordInfoToFile("path/to/record_info.json");
  
    return 0;
}

OC sample code

#include "SAMICoreKaraokeRecord.h"
#include "SAMICore.h"

#include <chrono>
#include <string>
#include <thread>

int main() {   
    // init param
    SAMICore_KaraokeRecordSettingParam *param = [[SAMICore_KaraokeRecordSettingParam alloc] init];
    std::string accompany_path = "/path/to/accompany.wav";
    param.accompany_path = accompany_path.c_str();
    std::string original_path = "/path/to/original.wav";
    param.original_path = original_path.c_str();
    param.sample_rate = 44100;  // play samplerate
    param.max_block_samples = 4096; // player callback buffersize
    NSString *documentFilePath = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory,NSUserDomainMask,YES).firstObject;
    NSString *extra_config = [NSString stringWithFormat:@"{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"%@\" } }", documentFilePath];
    param.extra_config = [extra_config UTF8String];
    param.message_callback = ^(SAMICore_KaraokeMessageId id, NSDictionary* info) {
        // app report log 
    };

    // create graph object
    SAMICoreKaraokeRecord* graph = [SAMICoreKaraokeRecord alloc];
    int ret = [graph initRecordingGraphWithSettingParam:param];
    if(ret != 0) {
        return ret;
    }

    // turn on aec if needed
    std::string aec_model_path = "/path/to/aec.model";
    ret = [graph setAECParam:aec_model_path.c_str()];

    // turn on time_align if needed
    std::string time_align_model_path = "/path/to/time_align.model";
    ret = [graph setTimeAlignParam:time_align_model_path.c_str()];

    // turn on vocal volume detect if needed
    ret = [graph openVocalLoudnessExtractor];

    // turn on singscore if needed
    std::string lyric_path = "/path/to/song.krc";
    std::string midi_path = "/path/to/song.mid";
    ret = [graph setSingScoreParam:1 lyric_path:lyric_path.c_str() midi_path:midi_path.c_str()];
    
    // set vocal file saved path
    std::string out_vocal_path = "/path/to/vocal.wav";
    ret = [graph setOutVocalFileParam:out_vocal_path.c_str()];

    [graph prepare];
    [graph play];
    
    // mock UI thread
    std::thread UIThread = std::thread([&]() {
        while (playing_) {
            SAMICore_MulDimSingScoringRealtimeInfo* info = [[SAMICore_MulDimSingScoringRealtimeInfo alloc] init];
            int ret = [graph getRealTimeScoreInfo:info];   
        }
    });

    // mock interactive thread
    std::thread userInteractiveThread = std::thread([&]() {
        while (playing_) {
            std::this_thread::sleep_for(std::chrono::seconds(1));
            // update vocal volume
            [graph updateMonitorVocalVolume:(+5)];
            std::this_thread::sleep_for(std::chrono::seconds(1));
            // update bgm volume
            [graph updateMonitorBGMVolume:(+5)];
            std::this_thread::sleep_for(std::chrono::seconds(1));
            // switch bgm mode
            [graph switchBGMMode:SAMICore_KaraokeBGMMode_Original];
            
            std::this_thread::sleep_for(std::chrono::seconds(1));
            [graph switchBGMMode:SAMICore_KaraokeBGMMode_Accompany];
            
            // update bgm pitch
            std::this_thread::sleep_for(std::chrono::seconds(1));
            [graph updateBGMPitch:4];
        }
    });
   
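    // wait for the mock threads to finish (the app sets playing_ to false)
    UIThread.join();
    userInteractiveThread.join();

    // stop the graph
    [graph stop];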
   
    // get some global information
    SAMICore_MulDimSingScoringOverallInfo *overallInfo = [[SAMICore_MulDimSingScoringOverallInfo alloc] init];
    [graph getOverallSingScoreInfo:overallInfo];

    float global_loudness = 0;
    float global_peak = 0;
    ret = [graph getLoudnessOverallFeatures:&global_loudness global_peak:&global_peak];
    
    float delay_ms = 0;
    ret = [graph getTimeAlignResultMs:&delay_ms];
    
    // write some global information to json file
    [graph writeRecordInfoToFile:"/path/to/record_info.json"];

    return 0;
 }

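Java sample code

import android.util.Log;

import com.mammon.audiosdk.SAMICoreKaraokeRecord;
import com.mammon.audiosdk.structures.SAMICoreMulDimSingScoringOverallInfo;
import com.mammon.audiosdk.structures.SAMICoreMulDimSingScoringRealtimeInfo;

public class SAMIKaraokeRecordDemo {
    private static final String TAG = "SAMIKaraokeRecordDemo";
    // result file path, shared by prepare() and stop()
    private final String record_result_path = "/path/to/record_info.json";
    private final SAMICoreKaraokeRecord recordGraphObj = new SAMICoreKaraokeRecord();
    private Thread UIInteractionThread = null;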

    // build record graph and prepare
    public void prepare() {
        // res
        String accompany_path = "/path/to/accompany.wav";
        String original_path = "/path/to/original.wav";
        String midi_path = "/path/to/karaoke.mid";
        String krc_path = "/path/to/karaoke.krc";
        String vocal_path = "/path/to/karaoke_vocal.wav";
        String aec_model = "/path/to/aec.model";
        String time_align_model = "/path/to/time_align.model";
        String record_result_path = "/path/to/record_info.json";
    
        // param
        SAMICoreKaraokeRecord.KaraokeRecordParamSetting setting = new SAMICoreKaraokeRecord.KaraokeRecordParamSetting();
        setting.accompany_path = accompany_path;
        setting.original_path = original_path;
        setting.sample_rate = 44100; // 16000/44100/48000
        setting.max_block_samples = 1024; // player callback buffersize
        setting.extra_config = "{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"/sdcard/Download\" }}";
        setting.message_callback = new SAMICoreKaraokeMessageCallBack() {
            @Override
            public void MessageTracker(SAMICoreKaraokeMessageId id, SAMICoreKaraokeInfo info) {
                // app report log 
            }
        };
    
        Log.i(TAG, "karaoke record param: " + setting.toString());
    
        // init graph
        int ret = recordGraphObj.init(setting);
        if (ret != 0) {
            Log.e(TAG, "startTest: record graph init failed");
            return;
        }
    
        ret = recordGraphObj.setAECParam(aec_model);
        if (ret == 0) {
            Log.i(TAG, "enable_aec and init succeed\n");
        } else {
            Log.e(TAG, "enable_aec but init failed\n");
        }
    
        ret = recordGraphObj.setTimeAlignParam(time_align_model);
        if (ret == 0) {
            Log.i(TAG, "enable_time_align and init succeed\n");
        } else {
            Log.e(TAG, "enable_time_align but init failed\n");
        }
    
        ret = recordGraphObj.setSingScoreParam(1, krc_path, midi_path);
        if (ret == 0) {
            Log.i(TAG, "enable_sing_score and init succeed\n");
        } else {
            Log.e(TAG, "enable_sing_score but init failed\n");
        }
    
        ret = recordGraphObj.openVocalLoudnessExtractor();
        if (ret == 0) {
            Log.i(TAG, "enable_vocal_loudness and init succeed\n");
        } else {
            Log.e(TAG, "enable_vocal_loudness but init failed\n");
        }
    
        recordGraphObj.setOutVocalFileParam(vocal_path);
    
        ret = recordGraphObj.prepare();
        if (ret == 0) {
            Log.i(TAG, "graph prepare succeed\n");
        } else {
            Log.e(TAG, "graph prepare failed\n");
        }
    }
    
    public void play() {
        // start the graph
        recordGraphObj.play();

        // ui interaction thread, mock function
        UIInteractionThread = new Thread() {
            @Override
            public void run() {
                // update monitor vocal volume
                recordGraphObj.updateMonitorVocalVolume(10);

                // update monitor bgm volume
                recordGraphObj.updateMonitorBGMVolume(10);

                // switch bgm mode: Accompany or Original
                recordGraphObj.switchBGMMode(SAMICoreKaraokeRecord.KaraokeBGMMode.Original);

                // update bgm pitch
                recordGraphObj.updateBGMPitch(4);
            }
        };

        UIInteractionThread.start();
    }
    
    public void stop() {
        try {
            UIInteractionThread.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        };

        int ret = recordGraphObj.stop();
        if (ret == 0) {
            System.out.println("graph stop succeed");
        } else {
            System.out.println("graph stop failed");
        }

        ret = recordGraphObj.writeRecordInfoToFile(record_result_path);
        if (ret == 0) {
            System.out.println("graph set info file succeed");
        } else {
            System.out.println("graph set info file failed");
        }

        float[] loudness = new float[1];
        float[] peak = new float[1];
        ret = recordGraphObj.getLoudnessOverallFeatures(loudness, peak);
        System.out.println("vocal loudness: status = " + ret + ", lufs = " + loudness[0] + " , peak = " + peak[0]);
        
        float[] delay_ms = new float[1];
        ret = recordGraphObj.getTimeAlignResultMs(delay_ms);
        System.out.println("time align result: " + ret + ":" + delay_ms[0]);

        SAMICoreMulDimSingScoringOverallInfo info = new SAMICoreMulDimSingScoringOverallInfo();
        ret = recordGraphObj.getOverallScoreInfo(info);
        System.out.println("overall_score_info result: status = " + ret + ", note_score = " + info.note_score);
    }
   
}

Editing page

Note

The data requested from the editing page can be used for playback or saved to a file.


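API reference

C++: the header file is sami_core_karaoke_edit_graph.h; the class to call is SAMI::KaraokeEditGraph.
OC: the header file is SAMICoreKaraokeEdit.h; API names and behavior match the C++ ones.
Java: SAMICoreKaraokeEdit.java; API names and behavior match the C++ ones. Some parameters and return values differ and are marked in this document.

Feature
API name
Parameters
Return value and additional notes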

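Initialization

init

C++:

struct KaraokeEditSettingParam {
    std::string vocal_path;  // dry-vocal path: the audio file saved on the recording page
    std::string bgm_path; // accompaniment path
    std::string record_result_path; // JSON file saved after stop on the recording page; it holds the recording page's analysis results
    int sample_rate;  // playback sample rate on the editing page
    int max_block_samples; // maximum number of samples per channel the player requests at a time
    std::string extra_config; // extra settings, e.g. enabling recording/playback
    KaraokeMessageCallback message_callback; // event-tracking callback; delivers internal tracking information
};
typedef std::function<void(KaraokeMessageId id, void* info)> KaraokeMessageCallback;

OC: SAMICore_KaraokeEditSettingParam
Java: SAMICoreKaraokeEdit.KaraokeEditSettingParam

0: success; otherwise a specific error code is returned and the error is logged.

Notes:

  1. max_block_samples must not exceed 65536; otherwise an error is reported.

  2. The only accompaniment file formats currently supported by the SDK are wav and mp3.

  3. extra_config must be passed as a JSON string, e.g. "{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"xxx/xxx/\" }}";

  4. The SDK's recording/playback is supported on Android and iOS only.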

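Enable denoising

setDenoiseModelPath

std::string: path of the denoise model

0: success; otherwise a specific error code is returned. Currently only unet_denoise_44k_music_model_v1.0.model is supported. Note: once the model is set successfully, denoising is switched on by default.

Denoising switch

setUseDenoise

bool: whether to use denoising

0: success; otherwise a specific error code is returned. Note: this only takes effect when called after denoising has been enabled successfully.

Enable loudness normalization for the accompaniment

setBGMLoudnormInfo

C++/Java: SAMICoreLoudnormProperty
OC: SAMICore_LoudnormProperty

struct SAMICoreLoudnormProperty {
    float source_lufs; // source loudness, in LUFS
    float source_peak; // source peak value
    float target_lufs; // target loudness
};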
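As background for the source_lufs and target_lufs fields, the arithmetic behind loudness normalization can be sketched as follows. This shows the general relationship between a loudness difference and a gain factor, not the SDK's internal processing, which may also apply limiting based on the peak value. Both function names are hypothetical.

```cpp
#include <cmath>

// Gain in dB that moves a signal from source_lufs to target_lufs.
inline float loudnormGainDb(float source_lufs, float target_lufs) {
    return target_lufs - source_lufs;
}

// Converts a gain in dB to a linear amplitude factor: 10^(dB / 20).
inline float dbToLinear(float gain_db) {
    return std::pow(10.0f, gain_db / 20.0f);
}
```

With a source of -24 LUFS and a target of -16 LUFS, the required gain is +8 dB, which corresponds to multiplying the samples by roughly 2.5.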

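Enable loudness normalization for the vocal

setVocalLoudnormInfo

C++/Java: SAMICoreLoudnormProperty
OC: SAMICore_LoudnormProperty

If loudness detection was enabled on the recording page and record_result_path is passed to the editing page, vocal loudness normalization is enabled by default.

Set or switch the sound effect

updateEffectFilePath

std::string: path of the resource file

0: success; otherwise a specific error code is returned.

Set the vocal/accompaniment alignment value

setVocalOffsetMs

float: time in milliseconds

Adjusts the offset between the vocal and the accompaniment (range -1000 to +1000). The latency detection result can be used as a reference value.

Get total duration

getTotalDurationMs

Total playable duration on the editing page, in milliseconds; this is the duration of the dry-vocal file.

Get current position

getCurrentPositionMs

Current playback position, in milliseconds.

Pull playback data in the playback thread

pullAudioData

float**: output buffers; currently only two-channel, non-interleaved storage is supported
int num_samples: length of the data to pull

Returns the number of samples per channel actually obtained, or -1 on error. Note: num_samples must not exceed max_block_samples; otherwise -1 is returned immediately.

Prepare the internal environment

prepare

0: success; otherwise a specific error code is returned. Prepares the internal environment; only needs to be called once.

Start

play

0: success; otherwise a specific error code is returned. Can be called after prepare or after pause.

Seek

seek

float seekMs: time in milliseconds

0: success; otherwise a specific error code is returned.

Pause

pause

0: success; otherwise a specific error code is returned. After pausing, pullAudioData returns only silence.

Stop

stop

0: success; otherwise a specific error code is returned. After stopping, data can no longer be pulled and parameters can no longer be set.

Update the vocal volume

updateVocalVolume

float value_db: range [-70, +35]; -70 means mute

Update the accompaniment volume

updateBGMVolume

float value_db: range [-70, +35]; -70 means mute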

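Export a selected audio segment

exportAudioDataToAudioFile

std::string: path of the exported file
float: start time of the segment, in milliseconds; a value beyond the length of the dry-vocal file is an error
float: end time of the segment, in milliseconds; a value beyond the length of the dry-vocal file defaults to the length of the dry-vocal file
C++/Java/OC: std::function<void(float)> / SAMICoreKaraokeProgressCallback / SAMICore_KaraokeProgressCallback progress_callback: export progress callback; it reports the current export progress so the UI can display it

0: success; otherwise a specific error code is returned. Android supports wav and mp3; iOS supports aac and wav. Must be called in the prepared state (prepare has been called on the graph but play has not) or in the paused state (pause has been called on the graph). It must not be called while pullAudioData is being called.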
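The two rules for the start and end times can be mirrored on the app side so the UI can reject or adjust a selection before starting an export. This is a sketch of the documented behavior only; ExportRange and normalizeExportRange are hypothetical names, and values the documentation does not describe (such as a negative end time) are passed through unchanged.

```cpp
#include <algorithm>

// Result of checking a requested export segment against the dry-vocal length.
struct ExportRange {
    bool valid;
    float start_ms;
    float end_ms;
};

inline ExportRange normalizeExportRange(float start_ms, float end_ms,
                                        float vocal_duration_ms) {
    // Documented: a start time beyond the dry-vocal length is an error.
    if (start_ms > vocal_duration_ms) return {false, 0.0f, 0.0f};
    // Documented: an end time beyond the dry-vocal length falls back to that length.
    return {true, start_ms, std::min(end_ms, vocal_duration_ms)};
}
```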

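C++ sample code

#include "sami_core_karaoke_edit_graph.h"
#include "sami_core.h"

#include <atomic>
#include <cassert>
#include <chrono>
#include <string>
#include <thread>
#include <vector>

// Placeholders used below; replace them with the app's own effect resources.
std::string preset_dir = "/path/to/presets/";
std::vector<std::string> effect_path = {"effect_a", "effect_b"};
size_t effect_index = 0;
std::atomic<bool> playing_{true}; // set to false by the app to end the mock interaction loop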

auto message_callback = [](KaraokeMessageId id, void* info) {
    // app report log 
};

int main(int argc, char* argv[]) {
    // you must authenticate with your token first

    // params
    SAMI::KaraokeEditSettingParam setting_param;
    setting_param.vocal_path = "/path/to/vocal.wav";
    setting_param.bgm_path = "/path/to/bgm.wav";
    setting_param.record_result_path = "/path/to/record_info.json";
    setting_param.sample_rate = 44100;  // player samplerate
    setting_param.max_block_samples = 1024; // player callback buffersize
    setting_param.message_callback = message_callback;
   
    SAMI::KaraokeEditGraph graph;
    int ret = graph.init(setting_param);
    if(ret != 0) {
        return -1;
    }

    // turn on denoise if needed. Succeed if ret == 0
    ret = graph.setDenoiseModelPath("/path/to/denoise.model");
   
    // use or change effect 
    ret = graph.updateEffectFilePath(preset_dir + effect_path.front());
    assert(ret == 0);

    // turn on bgm Loudnorm if needed and bgm loudness given. Succeed if ret == 0
    SAMICoreLoudnormProperty bgm_loudnorm{-24, -8.09, -16.0};
    ret = graph.setBGMLoudnormInfo(bgm_loudnorm);

    graph.prepare();
    graph.play();
    
    float totalDuration = graph.getTotalDurationMs();
    
    // mock monitor get data with writing to file
    std::thread playThread = std::thread([&]() {
        float* data[2];
        data[0] = new float [setting_param.max_block_samples];
        data[1] = new float [setting_param.max_block_samples];
        while(graph.getCurrentPositionMs() < totalDuration) {
            int frames = graph.pullAudioData(data, setting_param.max_block_samples);
            // play
        }
        delete [] data[0];
        delete [] data[1];
    });

    // mock user interactive thread
    std::thread userInteractiveThread = std::thread([&]() {
        while (playing_) {
            std::this_thread::sleep_for(std::chrono::seconds(5));
            // update vocal volume
            graph.updateVocalVolume(+10);
            
            // update bgm volume
            graph.updateBGMVolume(+10);
            
            // change effect
            graph.updateEffectFilePath(preset_dir + effect_path[(effect_index++) % effect_path.size()]);
            
            // update bgm and vocal offset
            graph.setVocalOffsetMs(-200);
        }
    });
    
    playThread.join();
    userInteractiveThread.join();
    
    graph.stop();
    
    return 0;
}

OC sample code

#import "SAMICoreKaraokeEdit.h"

{
    // 1. create edit graph
    SAMICore_KaraokeEditSettingParam* param = [[SAMICore_KaraokeEditSettingParam alloc] init];

    std::string bgm_path = "/path/to/accompany.wav";
    std::string vocal_path = "/path/to/vocal.wav";
    std::string record_result_path = "/path/to/record_info.json"; // file that the recording page's analysis results were written to

    param.bgm_path = bgm_path.c_str();
    param.vocal_path = vocal_path.c_str();
    param.record_result_path = record_result_path.c_str();
    param.sample_rate = 44100;
    param.max_block_samples = 1024; // player callback buffersize
    NSString *documentFilePath = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory,NSUserDomainMask,YES).firstObject;
    NSString *extra_config = [NSString stringWithFormat:@"{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"%@\" } }", documentFilePath];
    param.extra_config = [extra_config UTF8String];
    param.message_callback = ^(SAMICore_KaraokeMessageId id, NSDictionary* info) {
        // app report log 
    };

    SAMICoreKaraokeEdit* graph = [SAMICoreKaraokeEdit alloc]; 
    int ret = [graph initEditGraphWithSettingParam:param];
    if(ret != 0) {
        return ret;
    }
    
    // total duration ms
    float durationMS = [graph getTotalDurationMs];

    // 2. use denoise if needed. Succeed if ret == 0
    std::string denoise_model_path = "/path/to/denoise.model";
    ret = [graph setDenoiseModelPath: denoise_model_path.c_str()];

    // 3. use effect. Succeed if ret == 0
    std::string effect_path = "/path/to/effect";
    ret = [graph updateEffectFilePath:effect_path.c_str()];

    // 4. turn on bgm Loudnorm if needed and bgm loudness given. Succeed if ret == 0
    SAMICore_LoudnormProperty *property = [[SAMICore_LoudnormProperty alloc] init];
    property.source_lufs = -24;
    property.source_peak = -8;
    property.target_lufs = -16;
    ret = [graph setBGMLoudnormInfo:property];

    // begin run the graph
    [graph prepare];
    [graph play];

    // mock user interactive thread
    std::thread userInteractiveThread = std::thread([&]() {
        while(playing_) {
            std::this_thread::sleep_for(std::chrono::seconds(5));
            [graph updateVocalVolume: +10];  // adjust the vocal volume
            [graph updateBGMVolume: +10];    // adjust the accompaniment volume
            [graph setUseDenoise:false];   // turn denoising off
            [graph setVocalOffsetMs:200];  // set the vocal/accompaniment offset, driven by the UI
        }
    });

    playing_ = false;
    
    // playThread is the app's playback thread that calls pullAudioData (not shown in this sample)
    playThread.join();
    userInteractiveThread.join();
    
    // stop
    [graph stop];
}

Java sample code

import android.util.Log;

import com.mammon.audiosdk.SAMICoreKaraokeEdit;

public class SAMIKaraokeEditDemo {
    private static final String TAG = "SAMIKaraokeEditDemo";
    private final SAMICoreKaraokeEdit editGraphObj = new SAMICoreKaraokeEdit();
    private Thread UIInteractionThread = null;
    
    public void prepare() {
        String vocal_path = "/path/to/vocal.wav";
        String bgm_path = "/path/to/accompany.wav";
        String record_result_path = "/path/to/record_info.json";
        String denoise_model_path = "/path/to/denoise.model";
        String effect_path = "/path/to/minions.dat";
    
        SAMICoreKaraokeEdit.KaraokeEditSettingParam setting = new SAMICoreKaraokeEdit.KaraokeEditSettingParam();
        setting.vocal_path = vocal_path;
        setting.bgm_path = bgm_path;
        setting.record_result_path = record_result_path;
        setting.sample_rate = 44100; // player samplerate
        int bufferSize = 1024;
        setting.max_block_samples = bufferSize; // player callback buffersize
        setting.extra_config = "{ \"backend_config\":{ \"need_backend\":true, \"loop\":false, \"dump_path\": \"/sdcard/Download\" } }";
        setting.message_callback = new SAMICoreKaraokeMessageCallBack() {
            @Override
            public void MessageTracker(SAMICoreKaraokeMessageId id, SAMICoreKaraokeInfo info) {
                // app report log 
            }
        };
    
        Log.i(TAG, "karaoke edit param: " + setting.toString());
    
        // init graph
        int ret = editGraphObj.init(setting);
        if (ret != 0) {
            Log.e(TAG, "startTest: edit graph init failed");
            return;
        }
    
        ret = editGraphObj.setDenoiseModelPath(denoise_model_path);
        if (ret == 0) {
            Log.i(TAG, "enable_denoise and init succeed\n");
        } else {
            Log.e(TAG, "enable_denoise but init failed\n");
        }
    
        ret = editGraphObj.updateEffectFilePath(effect_path);
        if (ret == 0) {
            Log.i(TAG, "use effect and init succeed\n");
        } else {
            Log.e(TAG, "use effect but init failed\n");
        }
    
        ret = editGraphObj.prepare();
        if (ret == 0) {
            Log.i(TAG, "graph prepare succeed\n");
        } else {
            Log.e(TAG, "graph prepare failed\n");
        }
    }
    
    public void play() {
        editGraphObj.play();
        // placeholder path of the effect to switch to
        final String new_effect_path = "/path/to/another_effect.dat";
        // ui interaction thread, mock function
        UIInteractionThread = new Thread() {
            @Override
            public void run() {
                // update vocal volume
                editGraphObj.updateVocalVolume(+10);

                // update bgm volume
                editGraphObj.updateBGMVolume(+10);

                // change effect
                editGraphObj.updateEffectFilePath(new_effect_path);

                // update bgm and vocal offset
                editGraphObj.setVocalOffsetMs(-200);
            }
        };

        UIInteractionThread.start();
    }
    
    public void stop() {
        try {
            UIInteractionThread.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        };
        
        editGraphObj.pause();
        String file_path = "/path/to/export.wav"; // placeholder export path
        editGraphObj.exportAudioDataToAudioFile(file_path, 0, -1, new SAMICoreKaraokeEdit.SAMICoreKaraokeProgressCallback() {
            @Override
            public void progressCallback(float current_time) {
                Log.i(TAG, "current Time is " + current_time + "ms");
            }
        }); 
        
        editGraphObj.stop();
    }
}

Utility functions

MIDI parsing

C++ version

#include "karaoke_utils.h"
// static method to parse midi file. Return a state value and midi content
std::tuple<int, MidiFileContent> KaraokeUtils::parseMidiFile(std::string midiFilePath);


//===== information of MidiFileContent ===========
struct MidiPitchInfo {
    int startMs{-1};
    int durationMs{-1};
    int pitch{0};
};
using MidiFileContent = std::vector<MidiPitchInfo>;
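Once the MIDI file is parsed, a common task is to look up the reference note at a given playback time, for example to draw the pitch line. The sketch below repeats the two type definitions from above so that it compiles on its own. pitchAt and midiNoteToHz are hypothetical helpers, and treating the pitch field as a standard MIDI note number (69 = A4 = 440 Hz) is an assumption.

```cpp
#include <cmath>
#include <vector>

// Copied from the definitions above so the sketch is self-contained.
struct MidiPitchInfo {
    int startMs{-1};
    int durationMs{-1};
    int pitch{0};
};
using MidiFileContent = std::vector<MidiPitchInfo>;

// Returns the pitch of the note sounding at time_ms, or 0 when no note covers it.
inline int pitchAt(const MidiFileContent& notes, int time_ms) {
    for (const auto& note : notes) {
        if (time_ms >= note.startMs && time_ms < note.startMs + note.durationMs) {
            return note.pitch;
        }
    }
    return 0;
}

// Standard MIDI note number to frequency in Hz (equal temperament, A4 = 440 Hz).
inline double midiNoteToHz(int note) {
    return 440.0 * std::pow(2.0, (note - 69) / 12.0);
}
```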

Java version: usage example

import com.mammon.audiosdk.MammonIo;

MammonIo io = new MammonIo();
MammonMidiNote[] notes = io.readMidiNotesFromFile(midi_path, -1, true);

KRC parsing

#include "karaoke_utils.h"

// static method to parse krc file. Return a state value and krc's content
std::tuple<int, KrcFileContent> parseLyricFile(const std::string& krcFilePath);


//===== information of KrcFileContent ===========

// one word info in krc files
struct KrcWordInfo {
    int startOffsetMs{-1};
    int durationMs{-1};
    std::string word;
};

// one sentence content which contains several words
struct KrcLineContent {
    int lineStartMs{-1};
    int lineDuration{-1};
    std::string lyricStr;
    std::vector<KrcWordInfo> lineWordsInfo;
};

using KrcFileContent = std::vector<KrcLineContent>;
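A lyrics view typically needs to know which line is active at the current playback position. The sketch below repeats the type definitions from above so that it compiles on its own. lineIndexAt is a hypothetical helper, and it assumes lineStartMs is measured from the start of the song.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Copied from the definitions above so the sketch is self-contained.
struct KrcWordInfo {
    int startOffsetMs{-1};
    int durationMs{-1};
    std::string word;
};
struct KrcLineContent {
    int lineStartMs{-1};
    int lineDuration{-1};
    std::string lyricStr;
    std::vector<KrcWordInfo> lineWordsInfo;
};
using KrcFileContent = std::vector<KrcLineContent>;

// Returns the index of the line being sung at time_ms, or -1 between lines.
inline int lineIndexAt(const KrcFileContent& lines, int time_ms) {
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const int start = lines[i].lineStartMs;
        const int end = start + lines[i].lineDuration;
        if (time_ms >= start && time_ms < end) return static_cast<int>(i);
    }
    return -1;
}
```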

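Notes:

  1. Volume adjustment

Both the recording page and the editing page support updating the vocal and accompaniment volume. The details are as follows.
The argument is a gain in dB; 0 dB leaves the signal unchanged. The supported range is [-70, +35]; values outside this range are clamped to the nearest boundary.
The user-adjustable range does not need to be as wide as [-70, +35]; [-30, +6], or something close to it, is enough. With too much gain the overall level becomes very high, and the limiter that prevents clipping then pulls it back down, which makes the noise floor and similar sounds stand out.
Volume slider: map the slider linearly onto the adjustable range so that adjustment feels smooth. Note that at slider position 0 the value -70 (mute) must be passed.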
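The slider recommendation can be sketched as a small mapping function. The [-30, +6] range and the mute value of -70 at position 0 come from the text above; the function name, the normalized [0, 1] slider position and the clamping of positions above 1 are assumptions made for illustration.

```cpp
// Maps a slider position in [0, 1] to a gain in dB.
// Position 0 returns -70 dB (mute); any other position is mapped linearly
// onto [min_db, max_db].
inline float sliderToGainDb(float position,
                            float min_db = -30.0f,
                            float max_db = 6.0f) {
    if (position <= 0.0f) return -70.0f;   // mute at the bottom of the slider
    if (position > 1.0f) position = 1.0f;  // clamp out-of-range input
    return min_db + position * (max_db - min_db);
}
```

The returned value is what an app would pass to the volume-update calls described earlier, for example when the user drags the vocal slider.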


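  2. JSON file saved by the recording page

After stop on the recording page, call record_graph.writeRecordInfoToFile(std::string) to write part of the recording page's state and results to a JSON file for the editing page to use. The set of fields may be extended; the file currently contains the following: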

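{
    "bgm_latency_ms":"23.219955", // BGM latency on the recording page
    "bgm_pitch_shift_semitone_normalised":"0.000000", // normalized value of the BGM pitch shift applied on the recording page
    "startTimeMs": 0,       // audio start time; may be offset relative to the BGM, e.g. when recording starts at the 10th second
    "endTimeMs": 60000,     // audio end time
    "enableAEC": 1,         // whether echo cancellation was created successfully: 1 if so, otherwise 0
    "enableTimeAlign": 1,   // whether latency detection was created successfully: 1 if so, otherwise 0
    "enableLoudnorm": 1,    // whether loudness detection was created successfully: 1 if so, otherwise 0
    "enableSingScore": 1,   // whether pitch scoring was created successfully: 1 if so, otherwise 0
    "loudnessResult":{
        "status": 0,        // whether the loudness detection result is normal: 0 if so, otherwise an error code
        "peak": 1.0,        // loudness detection result: peak amplitude in the audio
        "global_loudness": -20 // loudness detection result: global LUFS
    },
    "scoreResult": {
        "status": 0,        // scoring status code; 0 means normal
        "note_score": 45.0, // total pitch score
        "emotion_score": 0, // total emotion score; currently 0
        "rhythm_score":0    // total rhythm score; currently 0
    },
    "timeAlignResult": {
        "status": 0,        // latency detection status code; 0 means normal
        "delay_ms": 200     // latency detection result: offset between vocal and accompaniment, in ms
    }
}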

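Error code details

See sami_core_error_code.h for details.

Error name | Error code | Meaning
SAMI_KARAOKE_IMPL_ERROR | 200001 | Internal implementation object is null; usually caused by an authentication failure
SAMI_KARAOKE_GRAPH_STATE_ERROR | 200002 | Internal state error; usually caused by calling the APIs in the wrong order
SAMI_KARAOKE_PARAM_ERROR | 200003 | Invalid argument; check that the arguments are correct
SAMI_KARAOKE_CONTEXT_ERROR | 200004 | Internal context object is null; usually because prepare was not called or failed
SAMI_KARAOKE_SET_VOLUME_FAILED | 200005 | Failed to set the volume; check that the volume argument is correct
SAMI_KARAOKE_RECORD_PRE_PROCESS_ERROR | 200006 | Recording-page internal processor error; usually because prepare was not called or failed
SAMI_KARAOKE_RECORD_CREATE_AUDIO_FILE_FAILED | 200007 | Recording page failed to create the dry-vocal file; check the file path and permissions
SAMI_KARAOKE_RECORD_AUDIO_FILE_ERROR | 200008 | Recording-page dry-vocal file is null; check that the dry-vocal file was initialized
SAMI_KARAOKE_RECORD_GET_TIME_ALIGN_RESULT_FAILED | 200009 | Recording page failed to get the latency detection result; check that latency detection was initialized successfully
SAMI_KARAOKE_RECORD_GET_SING_SCORE_RESULT_FAILED | 200010 | Recording page failed to get the scoring result; check that scoring was initialized successfully
SAMI_KARAOKE_RECORD_GET_LOUDNESS_RESULT_FAILED | 200011 | Recording page failed to get the loudness detection result; check that loudness detection was initialized successfully
SAMI_KARAOKE_RECORD_SWITCH_AUDIO_MODE_FAILED | 200012 | Recording page failed to switch the BGM mode; check that the mode passed in is correct
SAMI_KARAOKE_EDIT_AUDIO_FILE_FORMAT_NOT_SUPPORT | 200013 | Export file format passed to the editing page is not supported; check that the format is one the SDK supports
SAMI_KARAOKE_EDIT_PARSE_JSON_ERROR | 200014 | Editing page failed to parse the JSON file; check that the JSON file passed in is correct
SAMI_KARAOKE_EDIT_VOCAL_FILE_INVAILID | 200015 | Dry-vocal file passed to the editing page is invalid; check that the dry-vocal file is correct
SAMI_KARAOKE_EDIT_DENOISE_PROCESSER_ERROR | 200016 | Editing-page denoise processor is null; check that denoising was initialized successfully
SAMI_KARAOKE_EDIT_LOUDNORM_PROCESSER_ERROR | 200017 | Editing-page loudness normalization processor is null; check that loudness normalization was initialized successfully
SAMI_KARAOKE_EDIT_EFFECT_PROCESSER_ERROR | 200018 | Editing-page effect processor is null; check that the effect was initialized successfully
SAMI_KARAOKE_EDIT_BYTETUNER_PROCESSER_ERROR | 200019 | Editing-page pitch-correction processor is null; check that pitch correction was initialized successfully
SAMI_KARAOKE_EDIT_SET_OFFSET_FAILED | 200020 | Editing page failed to set the dry-vocal offset; check that the offset passed in is correct
SAMI_KARAOKE_SET_PITCH_SHIFT_FAILED | 200021 | Failed to set the BGM pitch; check that the semitone value passed in is correct
SAMI_KARAOKE_PARSE_EXTRA_CONFIG_ERROR | 200022 | Failed to parse extra_config; check that the extra_config string is valid JSON
SAMI_KARAOKE_INIT_BACKEND_FAILED | 200023 | Failed to initialize the recording/playback backend; check that extra_config is correct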