Lake AI Service

Audio format standardization

Operator introduction

Description

Audio standardization module that converts audio to a specified format: sampling rate, channel count, and loudness.

Key features

  • Supports resampling
  • Supports channel unification (for example, converting to mono)
  • Supports loudness normalization to a target dBFS level, with the applied gain limited to a configurable range
  • By default, input and output audio are byte arrays
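
To make the resampling and channel unification features above concrete, here is a minimal pure-Python sketch. It is not the operator's implementation: the function names `to_mono` and `resample_linear` are hypothetical, the downmix assumes a plain channel average, and the resampler uses naive linear interpolation without the anti-aliasing filter a production resampler would apply.

```python
def to_mono(frames):
    """Downmix multi-channel frames to mono by averaging each frame's channels.

    `frames` is a list of tuples, one tuple of channel samples per frame.
    """
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(samples, src_sr, dst_sr):
    """Resample a mono signal from src_sr to dst_sr with linear interpolation.

    Illustration only: no low-pass filtering is applied before downsampling.
    """
    if src_sr == dst_sr or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_sr / src_sr))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_sr / dst_sr      # fractional position in the source signal
        lo = min(int(pos), last)       # left neighbour
        hi = min(lo + 1, last)         # right neighbour, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

In this sketch, setting the equivalent of `target_channels=1` corresponds to calling `to_mono`, and `target_sr=16000` corresponds to calling `resample_linear(samples, original_rate, 16000)`.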

Usage scenarios

  • Unified processing of audio data from multiple sources
  • Preprocessing for downstream ASR/TTS models
  • Audio standardization step in multimodal content processing

Daft invocation

Operator parameters

Input

| Input column name | Description |
| --- | --- |
| audio_col | Byte array of input audio |

Output

The processed audio; None is returned if processing fails.

Parameters

Parameters without a default value are required.

| Parameter name | Type | Default value | Description |
| --- | --- | --- | --- |
| target_sr | int or None | None | Target sampling rate (Hz), for example 16000; if None, the original sampling rate is retained |
| target_channels | int or None | None | Target number of channels, for example 1 for mono; if None, the original number of channels is retained |
| target_dbfs | float or None | None | Target loudness (dBFS); if None, loudness normalization is not performed |
| target_gain_range | list | [-3, 3] | Allowed gain range for normalization, for example [-3, 3] |
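
The interaction between `target_dbfs` and `target_gain_range` can be sketched as follows. This is an illustration under stated assumptions, not the operator's actual algorithm: it assumes loudness is measured as RMS level in dBFS over samples scaled to [-1.0, 1.0], and that the gain range is expressed in dB. The helper names `rms_dbfs`, `clamped_gain_db`, and `apply_gain` are hypothetical.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level in dBFS of float samples in [-1.0, 1.0].

    Silence (or an empty signal) is reported as negative infinity.
    """
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def clamped_gain_db(current_dbfs, target_dbfs, gain_range=(-3.0, 3.0)):
    """Gain (dB) needed to reach the target, limited to the allowed range."""
    low, high = gain_range
    return max(low, min(high, target_dbfs - current_dbfs))


def apply_gain(samples, gain_db):
    """Scale samples by a gain given in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]
```

Under these assumptions, audio measured at -26 dBFS with a target of -20 dBFS would need +6 dB, but a range of [-3, 3] limits the applied gain to +3 dB, so the result lands at -23 dBFS rather than at the target. A narrow range therefore keeps very quiet or very loud inputs from being altered drastically.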

Examples

The following code demonstrates how to use the Daft operator AudioStandardization to standardize audio data to a specified format, including sampling rate, channels, and loudness.

# Copyright (c) Beijing Volcano Engine Technology Ltd.

from __future__ import annotations

import os

import daft
from daft import col
from daft.las.functions.audio.audio_standardization import AudioStandardization
from daft.las.functions.udf import las_udf

if __name__ == "__main__":
    TOS_TEST_DIR_URL = os.getenv("TOS_TEST_DIR_URL", "las-cn-beijing-public-online.tos-cn-beijing.volces.com")

    if os.getenv("DAFT_RUNNER", "native") == "ray":
        import logging

        import ray

        def configure_logging():
            logging.basicConfig(
                level=logging.INFO,
                format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
                datefmt="%Y-%m-%d %H:%M:%S",
            )
            logging.getLogger("tracing.span").setLevel(logging.WARNING)
            logging.getLogger("daft_io.stats").setLevel(logging.WARNING)
            logging.getLogger("DaftStatisticsManager").setLevel(logging.WARNING)
            logging.getLogger("DaftFlotillaScheduler").setLevel(logging.WARNING)
            logging.getLogger("DaftFlotillaDispatcher").setLevel(logging.WARNING)

        ray.init(dashboard_host="0.0.0.0", runtime_env={"worker_process_setup_hook": configure_logging})
        daft.set_runner_ray()

    # Wait up to 600 seconds for actor-based UDFs to become ready, and
    # do not reserve a minimum CPU share per task.
    daft.set_execution_config(actor_udf_ready_timeout=600, min_cpu_per_task=0)

    # Example input data
    samples = {"audio_path": [f"https://{TOS_TEST_DIR_URL}/public/archive/audio_standardization/.aac"]}

    # Construct Daft DataFrame
    df = daft.from_pydict(samples)

    # Apply AudioStandardization operator
    df = df.with_column(
        "standardized_audio",
        las_udf(
            AudioStandardization,
            construct_args={
                "target_sr": 16000,
                "target_channels": 1,
                "target_dbfs": -20.0,
                "target_gain_range": [-3.0, 3.0],
            },
            num_cpus=1,
            batch_size=1,
            concurrency=2,
        )(col("audio_path")),
    )

    df.show()
    # ╭────────────────────────────────┬────────────────────────────────╮
    # │ audio_path                     ┆ standardized_audio             │
    # │ ---                            ┆ ---                            │
    # │ String                         ┆ Binary                         │
    # ╞════════════════════════════════╪════════════════════════════════╡
    # │ https://las-public-data-qa.to… ┆ b"RIFF\xd0\x9a\x08\x00WAVEfmt… │
    # ╰────────────────────────────────┴────────────────────────────────╯
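
The sample output above starts with `RIFF…WAVEfmt`, which suggests the standardized audio is returned in a WAV container. Assuming that holds and the data is PCM-encoded, the following sketch checks the result against the requested format using only the Python standard library. The helpers `describe_wav` and `make_silent_wav` are hypothetical and not part of the operator; `make_silent_wav` exists only to produce demo bytes.

```python
import io
import wave


def describe_wav(audio_bytes):
    """Read basic format information from WAV bytes.

    Returns None when the input is None, mirroring a failed row.
    Raises wave.Error if the bytes are not a WAV file the `wave` module can read.
    """
    if audio_bytes is None:
        return None
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def make_silent_wav(sample_rate=16000, channels=1, seconds=1):
    """Build an in-memory 16-bit PCM WAV of silence, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * channels * sample_rate * seconds)
    return buf.getvalue()


info = describe_wav(make_silent_wav())
print(info)
```

To apply the same check to real results, one option is to collect the column with `df.to_pydict()["standardized_audio"]` and pass each value to `describe_wav`; rows where the operator failed are None and are passed through as None.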
Last updated: 2026.05.12 19:06:31