You need to enable JavaScript to run this app.
导航

聚合函数

最近更新时间2022.02.28 12:33:06

首次发布时间2022.02.28 12:33:06

请注意:
下文中的一些示例引用自 ClickHouse 社区文档 并经过一定修改确保可以在 ByteHouse 中正常使用。

any

Selects the first encountered value.
The query can be executed in any order and even in a different order each time, so the result of this function is indeterminate.
To get a determinate result, you can use the ‘min’ or ‘max’ function instead of ‘any’.

In some cases, you can rely on the order of execution. This applies to cases when SELECT comes from a subquery that uses ORDER BY.

When a SELECT query has the GROUP BY clause or at least one aggregate function, ByteHouse (in contrast to MySQL) requires that all expressions in the SELECT , HAVING , and ORDER BY clauses be calculated from keys or from aggregate functions. In other words, each column selected from the table must be used either in keys or inside aggregate functions. To get behavior like in MySQL, you can put the other columns in the any aggregate function.

Syntax

any(column)

Arguments

  • column – The column name.

Returned value

  • first value encontered. Type is same as input column.

Example

SELECT any(number) FROM numbers(10);
┌─any(number)─┐
│ 0           │
└─────────────┘

anyHeavy

Selects a frequently occurring value using the heavy hitters algorithm. If there is a value that occurs more than in half the cases in each of the query’s execution threads, this value is returned. Normally, the result is nondeterministic.

Syntax

anyHeavy(column)

Arguments

  • column – The column name.

Returned value

  • most frequent value. Type is same as input column.

Example

CREATE TABLE IF NOT EXISTS test.functionAnyHeavy(id Int) ENGINE=CnchMergeTree() ORDER BY id;
INSERT INTO test.functionAnyHeavy values (1),(1),(1),(2),(3);
SELECT anyHeavy(id) FROM test.functionAnyHeavy;
┌─anyHeavy(id)─┐
│ 1            │
└──────────────┘

anyLast

Selects the last value encountered.
The result is just as indeterminate.

Syntax

anyLast(column)

Arguments

  • column – The column name.

Returned value

  • last value encountered. Type is same as input column.

Example

SELECT anyLast(number) FROM numbers(10);
┌─anyLast(number)─┐
│ 9               │
└─────────────────┘

argMax

Calculates the arg value for a maximum val value. If there are several different values of arg for maximum values of val , returns the first of these values encountered.

Syntax

argMax(arg, val)

Arguments

  • arg — Argument.
  • val — Value.

Returned value

  • arg value that corresponds to maximum val value. Type is same as arg type.

Example

CREATE TABLE IF NOT EXISTS test.functionArgMax (user String, salary Int) Engine=CnchMergeTree() ORDER BY user;
INSERT INTO test.functionArgMax VALUES ('director',5000),('manager',3000),('worker',1000);
SELECT argMax(user, salary) FROM test.functionArgMax;
┌─argMax(user, salary)─┐
│ director             │
└──────────────────────┘

argMin

Calculates the arg value for a minimum val value. If there are several different values of arg for minimum values of val , returns the first of these values encountered.

Syntax

argMin(arg, val)

Arguments

  • arg — Argument.
  • val — Value.

Returned value

  • arg value that corresponds to minimum val value. Type is same as arg type.

Example

CREATE TABLE IF NOT EXISTS test.functionArgMin (user String, salary Int) Engine=CnchMergeTree() ORDER BY user;
INSERT INTO test.functionArgMin VALUES ('director',5000),('manager',3000),('worker',1000);
SELECT argMin(user, salary) FROM test.functionArgMin;
┌─argMin(user, salary)─┐
│ worker               │
└──────────────────────┘

avg

Calculates the arithmetic mean.

Syntax

avg(x)

Arguments

  • x — input values, must be Integer, Float or Decimal.

Returned value

  • The arithmetic mean, always as Float64.
  • NaN if the input parameter x is empty.

Example

CREATE TABLE IF NOT EXISTS test.functionAvg (x Int8) ENGINE=CnchMergeTree() ORDER BY x;
INSERT INTO test.functionAvg SELECT * FROM numbers(6);
SELECT avg(x) FROM test.functionAvg;
┌─avg(x)──┐
│ 2.5e+00 │
└─────────┘
CREATE TABLE IF NOT EXISTS test.functionAvg (x Int8) ENGINE=CnchMergeTree() ORDER BY x;
SELECT avg(x) FROM test.functionAvg;
┌─avg(x)─┐
│ NaN    │
└────────┘

corr

Calculates the Pearson correlation coefficient: Σ((x - x̅)(y - y̅)) / sqrt(Σ((x - x̅)^2) * Σ((y - y̅)^2)).
This function uses a numerically unstable algorithm. If you need numerical stability in calculations, use the corrStable function. It works slower but provides a lower computational error.

Syntax

corr(x, y)

Arguments

  • x — input values, must be Integer, Float.
  • y — input values, must be Integer, Float.

Returned value

  • Pearson correlation coefficient, always as Float64.

Example

CREATE TABLE IF NOT EXISTS test.functionCorr (x Int8, y Int8) ENGINE=CnchMergeTree() ORDER BY x;
INSERT INTO test.functionCorr VALUES (1,5),(2,4),(3,3),(4,2),(5,1);
SELECT corr(x,y) FROM test.functionCorr;
┌─corr(x, y)─┐
│ -1e+00     │
└────────────┘

covarPop

Calculates the value of Σ((x - x̅)(y - y̅)) / n.
This function uses a numerically unstable algorithm. If you need numerical stability in calculations, use the covarPopStable function. It works slower but provides a lower computational error.

Syntax

covarPop(x, y)

Arguments

  • x — input values, must be Integer, Float.
  • y — input values, must be Integer, Float.

Returned value

  • Σ((x - x̅)(y - y̅)) / n, always as Float64.

Example

CREATE TABLE IF NOT EXISTS test.functionCovarPop (x Int8, y Int8) ENGINE=CnchMergeTree() ORDER BY x;
INSERT INTO test.functionCovarPop VALUES (1,5),(2,4),(3,3),(4,2),(5,1);
SELECT covarPop(x,y) FROM test.functionCovarPop;
┌─covarPop(x, y)─┐
│ -2e+00         │
└────────────────┘

covarSamp

Calculates the value of Σ((x - x̅)(y - y̅)) / (n - 1).
This function uses a numerically unstable algorithm. If you need numerical stability in calculations, use the covarSampStable function. It works slower but provides a lower computational error.

Syntax

covarSamp(x, y)

Arguments

  • x — input values, must be Integer, Float.
  • y — input values, must be Integer, Float.

Returned value

  • Σ((x - x̅)(y - y̅)) / (n - 1), always as Float64.
  • When n <= 1, returns NaN.

Example

CREATE TABLE IF NOT EXISTS test.functionCovarSamp (x Int8, y Int8) ENGINE=CnchMergeTree() ORDER BY x;
INSERT INTO test.functionCovarSamp VALUES (1,5),(2,4),(3,3),(4,2),(5,1);
SELECT covarSamp(x,y) FROM test.functionCovarSamp;
┌─covarSamp(x, y)─┐
│ -2.5e+00        │
└─────────────────┘
CREATE TABLE IF NOT EXISTS test.functionCovarSamp (x Int8, y Int8) ENGINE=CnchMergeTree() ORDER BY x;
INSERT INTO test.functionCovarSamp VALUES (1,5);
SELECT covarSamp(x,y) FROM test.functionCovarSamp;
┌─covarSamp(x, y)─┐
│ NaN             │
└─────────────────┘

groupArray

Creates an array of argument values.
Values can be added to the array in any (indeterminate) order.

The second version (with the max_size parameter) limits the size of the resulting array to max_size elements. For example, groupArray(1)(x) is equivalent to [any (x)] .

In some cases, you can still rely on the order of execution. This applies to cases when SELECT comes from a subquery that uses ORDER BY .

Syntax

groupArray(x)
groupArray(max_size)(x)

Arguments

  • x — input values.

Returned value

  • array of values. Type is Array.

Example

select groupArray(number) from numbers(10);
┌─groupArray(number)─────────────┐
│ [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] │
└────────────────────────────────┘
select groupArray(5)(number) from numbers(10);
┌─groupArray(5)(number)─┐
│ [0, 1, 2, 3, 4]       │
└───────────────────────┘

groupBitAnd

Applies bitwise AND for series of numbers.

Syntax

groupBitAnd(expr)

Arguments

  • expr – An expression that results in UInt* type.

Return value

  • Value of the UInt* type.

Example

CREATE TABLE IF NOT EXISTS test.groupBitAnd (num UInt8) ENGINE=CnchMergeTree() ORDER BY num;
INSERT INTO test.groupBitAnd VALUES (44),(28),(13),(85);
SELECT groupBitAnd(num) FROM test.groupBitAnd;
┌─groupBitAnd(num)─┐
│ 4                │
└──────────────────┘

Bit representation of 44,28,13 and 85 are 00101100,00011100,00001101 and 01010101. The bit and of the four numbers is 00000100 which is equal to 4.

groupBitOr

Applies bitwise OR for series of numbers.

Syntax

groupBitOr(expr)

Arguments

  • expr – An expression that results in UInt* type.

Returned value

  • Value of the UInt* type.

Example

CREATE TABLE IF NOT EXISTS test.functionGroupBitOr (num UInt8) ENGINE=CnchMergeTree ORDER BY num;
INSERT INTO test.functionGroupBitOr VALUES (44),(28),(13),(85);
SELECT groupBitOr(num) FROM test.functionGroupBitOr;
┌─groupBitOr(num)─┐
│ 125             │
└─────────────────┘

Bit representation of 44,28,13 and 85 are 00101100,00011100,00001101 and 01010101. The bit or of the four numbers is 01111101 which is equal to 125.

groupBitXor

Applies bitwise XOR for series of numbers.

Syntax

groupBitXor(expr)

Arguments

  • expr – An expression that results in UInt* type.

Return value

  • Value of the UInt* type.

Example

CREATE TABLE IF NOT EXISTS test.functionGroupBitXor (num UInt8) ENGINE=CnchMergeTree ORDER BY num;
INSERT INTO test.functionGroupBitXor VALUES (44),(28),(13),(85);
SELECT groupBitXor(num) FROM test.functionGroupBitXor;

┌─groupBitXor(num)─┐
│ 104              │
└──────────────────┘

Bit representation of 44,28,13 and 85 are 00101100,00011100,00001101 and 01010101. The bit xor of the four numbers is 01101000 which is equal to 104.

groupBitmap

Bitmap or Aggregate calculations from a unsigned integer column, return cardinality of type UInt64, if add suffix -State, then return bitmap object.

Syntax

groupBitmap(expr)

Arguments

  • expr – An expression that results in UInt* type.

Return value

  • Value of the UInt64 type.

Example

SELECT groupBitmapState(number) FROM numbers(4);
┌─groupBitmap(number)─┐
│ 4                   │
└─────────────────────┘

groupBitmapAnd

Calculations the AND of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return bitmap object.

Syntax

groupBitmapAnd(expr)

Arguments

  • expr – An expression that results in AggregateFunction(groupBitmap, UInt*) type.

Return value

  • Value of the UInt64 type.

Example

DROP TABLE IF EXISTS bitmap_column_expr_test2;

CREATE TABLE bitmap_column_expr_test2(tag_id String,z AggregateFunction(groupBitmap, UInt32)) ENGINE = CnchMergeTree ORDER BY tag_id;

INSERT INTO bitmap_column_expr_test2 VALUES ('tag1', bitmapBuild(cast([1,2,3,4,5,6,7,8,9,10] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag2', bitmapBuild(cast([6,7,8,9,10,11,12,13,14,15] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag3', bitmapBuild(cast([2,4,6,8,10,12] as Array(UInt32))));

SELECT groupBitmapAnd(z) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─groupBitmapAnd(z)─┐
│               3   │
└───────────────────┘

SELECT arraySort(bitmapToArray(groupBitmapAndState(z))) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─arraySort(bitmapToArray(groupBitmapAndState(z)))─┐
│ [6,8,10]                                         │
└──────────────────────────────────────────────────┘

groupBitmapOr

Calculations the OR of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return bitmap object. This is equivalent to groupBitmapMerge .

Syntax

groupBitmapOr(expr)

Arguments

  • expr – An expression that results in AggregateFunction(groupBitmap, UInt*) type.

Returned value

  • Value of the UInt64 type.

Example

DROP TABLE IF EXISTS bitmap_column_expr_test2;
CREATE TABLE bitmap_column_expr_test2(tag_id String,z AggregateFunction(groupBitmap, UInt32)) ENGINE = CnchMergeTree ORDER BY tag_id;

INSERT INTO bitmap_column_expr_test2 VALUES ('tag1', bitmapBuild(cast([1,2,3,4,5,6,7,8,9,10] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag2', bitmapBuild(cast([6,7,8,9,10,11,12,13,14,15] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag3', bitmapBuild(cast([2,4,6,8,10,12] as Array(UInt32))));

SELECT groupBitmapOr(z) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');

┌─groupBitmapOr(z)─┐
│             15   │
└──────────────────┘

SELECT arraySort(bitmapToArray(groupBitmapOrState(z))) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─arraySort(bitmapToArray(groupBitmapOrState(z)))─┐
│ [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]           │
└─────────────────────────────────────────────────┘

groupBitmapXor

Calculations the XOR of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return bitmap object.

Syntax

groupBitmapOr(expr)

Arguments

  • expr – An expression that results in AggregateFunction(groupBitmap, UInt*) type.

Returned value

  • Value of the UInt64 type.

Example

DROP TABLE IF EXISTS bitmap_column_expr_test2;
CREATE TABLE bitmap_column_expr_test2(tag_id String, z AggregateFunction(groupBitmap, UInt32)) ENGINE = CnchMergeTree ORDER BY tag_id;


INSERT INTO bitmap_column_expr_test2 VALUES ('tag1', bitmapBuild(cast([1,2,3,4,5,6,7,8,9,10] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag2', bitmapBuild(cast([6,7,8,9,10,11,12,13,14,15] as Array(UInt32))));
INSERT INTO bitmap_column_expr_test2 VALUES ('tag3', bitmapBuild(cast([2,4,6,8,10,12] as Array(UInt32))));

SELECT groupBitmapXor(z) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');
┌─groupBitmapXor(z)─┐
│              10   │
└───────────────────┘

SELECT arraySort(bitmapToArray(groupBitmapXorState(z))) FROM bitmap_column_expr_test2 WHERE like(tag_id, 'tag%');

┌─arraySort(bitmapToArray(groupBitmapXorState(z)))─┐
│ [1,3,5,6,8,10,11,13,14,15]                       │
└──────────────────────────────────────────────────┘

groupUniqArray

Creates an array from different argument values. Memory consumption is the same as for the uniqExact function.

The second version (with the max_size parameter) limits the size of the resulting array to max_size elements.

Syntax

groupUniqArray(x)
groupUniqArray(max_size)(x)

Arguments

  • x — input values.

Returned value

  • array of values. Type is Array.

Example

select groupUniqArray(number) from numbers(10);
┌─groupUniqArray(number)─────────┐
│ [0, 8, 3, 9, 7, 2, 1, 6, 4, 5] │
└────────────────────────────────┘
select groupUniqArray(5)(number) from numbers(10); 
┌─groupArray(5)(number)─┐
│ [0, 1, 2, 3, 4]       │
└───────────────────────┘

histogram

Calculates an adaptive histogram. It does not guarantee precise results.

The functions uses A Streaming Parallel Decision Tree Algorithm . The borders of histogram bins are adjusted as new data enters a function. In common case, the widths of bins are not equal.

Syntax

histogram(number_of_bins)(values)

Arguments

  • number_of_bins — Upper limit for the number of bins in the histogram. The function automatically calculates the number of bins. It tries to reach the specified number of bins, but if it fails, it uses fewer bins.
  • values — Expression resulting in input values.

Returned values

  • Array of Tuples of the following format:
[(lower_1, upper_1, height_1), ... (lower_N, upper_N, height_N)]

- `lower` — Lower bound of the bin.
- `upper` — Upper bound of the bin.
- `height` — Calculated height of the bin.

Example

SELECT histogram(5)(number + 1) FROM numbers(20);
┌─histogram(5)(plus(number, 1))───────────────────────────────────────────┐
│ [(1,4.5,4),(4.5,8.5,4),(8.5,12.75,4.125),(12.75,17,4.625),(17,20,3.25)] │
└─────────────────────────────────────────────────────────────────────────┘

You can visualize a histogram with the bar function, for example:

WITH histogram(5)(rand() % 100) AS hist
SELECT
    arrayJoin(hist).3 AS height,
    bar(height, 0, 6, 5) AS bar
FROM numbers(20);

┌─height─┬─bar───┐

│  2.125 │ █▋    │

│   3.25 │ ██▌   │

│  5.625 │ ████▏ │

│  5.625 │ ████▏ │

│  3.375 │ ██▌   │

└────────┴───────┘

In this case, you should remember that you do not know the histogram bin borders.

max

Aggregate function that calculates the maximum across a group of values.

Syntax

max(values)

Arguments

  • values — input values.

Returned values

  • maximum value. Type is same as values

Example

SELECT max(number) FROM numbers(20);
┌─max(number)─┐
│ 19          │
└─────────────┘

If you need non-aggregate function to choose a maximum of two values, see greatest :

SELECT greatest(a, b) FROM table;

median

The median* functions are the aliases for the corresponding quantile* functions. They calculate median of a numeric data sample.

Functions:

  • median — Alias for quantile.
  • medianDeterministic — Alias for quantileDeterministic.
  • medianExact — Alias for quantileExact.
  • medianExactWeighted — Alias for quantileExactWeighted.
  • medianTiming — Alias for quantileTiming.
  • medianTimingWeighted — Alias for quantileTimingWeighted.
  • medianTDigest — Alias for quantileTDigest.
  • medianTDigestWeighted — Alias for quantileTDigestWeighted.
  • medianBFloat16 — Alias for quantileBFloat16.

medianDeterministic

Alias for quantileDeterministic.

medianExact

Alias for quantileExact.

medianExactWeighted

Alias for quantileExactWeighted.

medianTDigest

Alias for quantileTDigest.

medianTDigestWeighted

Alias for quantileTDigestWeighted.

medianTiming

Alias for quantileTiming.

medianTimingWeighted

Alias for quantileTimingWeighted.

quantile

Computes an approximate quantile of a numeric data sequence.

This function applies reservoir sampling with a reservoir size up to 8192 and a random number generator for sampling. The result is non-deterministic. To get an exact quantile, use the quantileExact function.

When using multiple quantile* functions with different levels in a query, the internal states are not combined (that is, the query works less efficiently than it could). In this case, use the quantiles function.

Syntax

quantile(level)(expr)

Alias: median .

Arguments

  • level — Level of quantile. Optional parameter. Constant floating-point number from 0 to 1. We recommend using a level value in the range of [0.01, 0.99] . Default value: 0.5. At level=0.5 the function calculates median .
  • expr — Expression over the column values resulting in numeric, Date, or DateTime.

Returned value

  • Approximate quantile of the specified level.
    Type:
  • Float64 for numeric data type input.
  • Date if input values have the Date type.
  • DateTime if input values have the DateTime type.

Example

SELECT quantile(number) FROM numbers(10);
┌─quantile(number)─┐
│ 4.5e+00          │
└──────────────────┘

quantileDeterministic

Computes an approximate quantile of a numeric data sequence.

This function applies reservoir sampling with a reservoir size up to 8192 and deterministic algorithm of sampling. The result is deterministic. To get an exact quantile, use the quantileExact function.

When using multiple quantile* functions with different levels in a query, the internal states are not combined (that is, the query works less efficiently than it could). In this case, use the quantiles function.

Syntax

quantileDeterministic(level)(expr, determinator)

Alias: medianDeterministic .

Arguments

  • level — Level of quantile. Optional parameter. Constant floating-point number from 0 to 1. We recommend using a level value in the range of [0.01, 0.99] . Default value: 0.5. At level=0.5 the function calculates median .
  • expr — Expression over the column values resulting in numeric data types, Date or DateTime.
  • determinator — Number whose hash is used instead of a random number generator in the reservoir sampling algorithm to make the result of sampling deterministic. As a determinator you can use any deterministic positive number, for example, a user id or an event id. If the same determinator value occures too often, the function works incorrectly.

Returned value

  • Approximate quantile of the specified level.
    Type:
  • Float64 for numeric data type input.
  • Date if input values have the Date type.
  • DateTime if input values have the DateTime type.

Example

SELECT quantileDeterministic(number, 1) FROM numbers(10);
┌─quantileDeterministic(number, 1)─┐
│ 4.5e+00                          │
└──────────────────────────────────┘

quantileExact

Exactly computes the quantile of a numeric data sequence.

To get exact value, all the passed values are combined into an array, which is then partially sorted. Therefore, the function consumes O(n) memory, where n is a number of values that were passed. However, for a small number of values, the function is very effective.

When using multiple quantile* functions with different levels in a query, the internal states are not combined (that is, the query works less efficiently than it could). In this case, use the quantiles function.

Syntax

quantileExact(level)(expr)

Alias: medianExact .

Arguments

  • level — Level of quantile. Optional parameter. Constant floating-point number from 0 to 1. We recommend using a level value in the range of [0.01, 0.99] . Default value: 0.5. At level=0.5 the function calculates median .
  • expr — Expression over the column values resulting in numeric data types, Date or DateTime.

Returned value

  • Quantile of the specified level.
    Type:
  • Float64 for numeric data type input.
  • Date if input values have the Date type.
  • DateTime if input values have the DateTime type.

Example

SELECT quantileExact(number) FROM numbers(10);
┌─quantileExact(number)─┐
│                     5 │
└───────────────────────┘

quantileExactWeighted

Exactly computes the quantile of a numeric data sequence, taking into account the weight of each element.

To get exact value, all the passed values are combined into an array, which is then partially sorted. Each value is counted with its weight, as if it is present weight times. A hash table is used in the algorithm. Because of this, if the passed values are frequently repeated, the function consumes less RAM than quantileExact. You can use this function instead of quantileExact and specify the weight 1.

When using multiple quantile* functions with different levels in a query, the internal states are not combined (that is, the query works less efficiently than it could). In this case, use the quantiles function.

Syntax

quantileExactWeighted(level)(expr, weight)

Alias: medianExactWeighted .

Arguments

  • level — Level of quantile. Optional parameter. Constant floating-point number from 0 to 1. We recommend using a level value in the range of [0.01, 0.99] . Default value: 0.5. At level=0.5 the function calculates median .
  • expr — Expression over the column values resulting in numeric, Date or DateTime.
  • weight — Column with weights of sequence members. Weight is a number of value occurrences.

Returned value

  • Quantile of the specified level.
    Type:
  • Float64 for numeric data type input.
  • Date if input values have the Date type.
  • DateTime if input values have the DateTime type.

Example

CREATE TABLE IF NOT EXISTS test.functionQuantileExactWeighted (n UInt8, val UInt8) ENGINE=CnchMergeTree ORDER BY n;
INSERT INTO test.functionQuantileExactWeighted VALUES (0,3),(1,2),(2,1),(5,4);
SELECT quantileExactWeighted(n, val) FROM test.functionQuantileExactWeighted;
┌─quantileExactWeighted(n, val)─┐
│                             1 │
└───────────────────────────────┘

quantileTDigest

Computes an approximate quantile of a numeric data sequence using the t-digest algorithm.

Memory consumption is log(n) , where n is a number of values. The result depends on the order of running the query, and is nondeterministic.

The performance of the function is lower than performance of quantile or quantileTiming. In terms of the ratio of State size to precision, this function is much better than quantile .

When using multiple quantile* functions with different levels in a query, the internal states are not combined (that is, the query works less efficiently than it could). In this case, use the quantiles function.

Syntax

quantileTDigest(level)(expr)

Alias: medianTDigest .

Arguments

  • level — Level of quantile. Optional parameter. Constant floating-point number from 0 to 1. We recommend using a level value in the range of [0.01, 0.99] . Default value: 0.5. At level=0.5 the function calculates median .
  • expr — Expression over the column values resulting in numeric data types, Date or DateTime.

Returned value

  • Approximate quantile of the specified level.
    Type:
  • Float64 for numeric data type input.
  • Date if input values have the Date type.
  • DateTime if input values have the DateTime type.

Example

SELECT quantileTDigest(number) FROM numbers(10)
┌─quantileTDigest(number)─┐
│ 4.5e+00                 │
└─────────────────────────┘

quantileTDigestWeighted

Computes an approximate quantile of a numeric data sequence using the t-digest algorithm. The function takes into account the weight of each sequence member. The maximum error is 1%. Memory consumption is log(n) , where n is a number of values.

The performance of the function is lower than the performance of quantile or quantileTiming. In terms of the ratio of State size to precision, this function is much better than quantile.

The result depends on the order of running the query and is nondeterministic.

When using multiple quantile* functions with different levels in a query, the internal states are not combined (that is, the query works less efficiently than it could). In this case, use the quantiles function.