Another month goes by, which means it's time for another release!
ClickHouse version 24.11 contains 9 new features, 15 performance optimizations, and 68 bug fixes.
In this release, parallel hash join becomes the default join strategy, WITH FILL
gets a STALENESS
modifier, you can pre-warm the marks cache, and vector search gets faster with the BFloat16
data type.
New Contributors
As always, we send a special welcome to all the new contributors in 24.11! ClickHouse's popularity is, in large part, due to the efforts of the community that contributes to it. Seeing that community grow is always humbling.
Below are the names of the new contributors:
0xMihalich, Max Vostrikov, Payam Qorbanpour, Plasmaion, Roman Antonov, Romeo58rus, Zoe Steinkamp, kellytoole, ortyomka, qhsong, udiz, yun, Ørjan Fors, Андрей
Hint: if you're curious how we generate this list… here.
You can also view the slides from the presentation.
Parallel hash join is the default join strategy
Contributed by Nikita Taranov
The parallel hash join algorithm is now the default join strategy, replacing hash join.
The parallel hash join algorithm is a variation of a hash join that splits the input data to build several hash tables concurrently in order to speed up the join at the expense of higher memory overhead. You can see a diagram of the algorithm's query pipeline below:
You can learn more about parallel hash join in the ClickHouse Joins Under the Hood - Hash Join, Parallel Hash Join, Grace Hash Join blog post.
As well as becoming the default, the algorithm also received a performance optimization: blocks scattered between threads for parallel processing are now shared zero-copy instead of having their columns copied each time.
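Because parallel hash join trades extra memory for speed, it can still be useful to fall back to the classic algorithm for individual memory-constrained queries. Here's a minimal sketch using the join_algorithm setting (the orders and customers tables are hypothetical):

-- parallel_hash is now the default; revert to the classic hash join
-- for a single query if the extra memory overhead is a concern
SELECT o.id, c.name
FROM orders AS o
INNER JOIN customers AS c ON o.customer_id = c.id
SETTINGS join_algorithm = 'hash';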
STALENESS Modifier For ORDER BY WITH FILL
Contributed by Mikhail Artemenko
This release introduces the STALENESS clause to WITH FILL. Let's look at how to use it with help from the MidJourney dataset. Assuming we've downloaded the Parquet files, we can populate a table using the following queries:
CREATE TABLE images (
    id String,
    timestamp DateTime64,
    height Int64,
    width Int64,
    size Int64
)
ENGINE = MergeTree
ORDER BY (size, height, width);

INSERT INTO images WITH data AS (
    SELECT
        assumeNotNull(timestamp) AS ts,
        assumeNotNull(id) AS id,
        assumeNotNull(height) AS height,
        assumeNotNull(width) AS width,
        assumeNotNull(size) AS size,
        parseDateTime64BestEffort(ts) AS ts2
    FROM file('data/0000{00..55}.parquet')
)
SELECT id, ts2 AS timestamp, height, width, size
FROM data;
Let's say we want to count the number of images generated during one second on the 24th of March 2023. We'll define start and end dates using parameters:
SET param_start = '2023-03-24 00:24:02',
    param_end = '2023-03-24 00:24:03';
We can then write this query to compute the count per 100 milliseconds using the WITH FILL
clause to populate empty buckets with a zero value:
SELECT
    toStartOfInterval(timestamp, toIntervalMillisecond(100)) AS bucket,
    count() AS count, 'original' AS original
FROM images
WHERE (timestamp >= {start:String}) AND (timestamp <= {end:String})
GROUP BY ALL
ORDER BY bucket ASC
WITH FILL
FROM toDateTime64({start:String}, 3)
TO toDateTime64({end:String}, 3) STEP toIntervalMillisecond(100);
┌──────────────────bucket─┬─count─┬─original─┐
│ 2023-03-24 00:24:02.000 │     0 │          │
│ 2023-03-24 00:24:02.100 │     0 │          │
│ 2023-03-24 00:24:02.200 │     0 │          │
│ 2023-03-24 00:24:02.300 │     3 │ original │
│ 2023-03-24 00:24:02.400 │     0 │          │
│ 2023-03-24 00:24:02.500 │     0 │          │
│ 2023-03-24 00:24:02.600 │     1 │ original │
│ 2023-03-24 00:24:02.700 │     1 │ original │
│ 2023-03-24 00:24:02.800 │     2 │ original │
│ 2023-03-24 00:24:02.900 │     0 │          │
└─────────────────────────┴───────┴──────────┘
Now let's bring in the STALENESS clause. From the documentation:

When STALENESS const_numeric_expr is defined, the query will generate rows until the difference from the previous row in the original data exceeds const_numeric_expr.
You can't use STALENESS at the same time as the WITH FILL...FROM clause, so we'll need to remove that, which leaves us with this query:
SELECT
    toStartOfInterval(timestamp, toIntervalMillisecond(100)) AS bucket,
    count() AS count, 'original' AS original
FROM images
WHERE (timestamp >= {start:String}) AND (timestamp <= {end:String})
GROUP BY ALL
ORDER BY bucket ASC
WITH FILL
TO toDateTime64({end:String}, 3) STEP toIntervalMillisecond(100);
Removing the WITH FILL...FROM clause means that our result set will start from the first actual value rather than pre-filling with 0s back to the specified timestamp.
┌──────────────────bucket─┬─count─┬─original─┐
│ 2023-03-24 00:24:02.300 │     3 │ original │
│ 2023-03-24 00:24:02.400 │     0 │          │
│ 2023-03-24 00:24:02.500 │     0 │          │
│ 2023-03-24 00:24:02.600 │     1 │ original │
│ 2023-03-24 00:24:02.700 │     1 │ original │
│ 2023-03-24 00:24:02.800 │     2 │ original │
│ 2023-03-24 00:24:02.900 │     0 │          │
└─────────────────────────┴───────┴──────────┘
If we now add a STALENESS
value of 200 milliseconds, it will only fill in empty rows until the difference from the previous row exceeds 200 milliseconds:
SELECT
    toStartOfInterval(timestamp, toIntervalMillisecond(100)) AS bucket,
    count() AS count, 'original' AS original
FROM images
WHERE (timestamp >= {start:String}) AND (timestamp <= {end:String})
GROUP BY ALL
ORDER BY bucket ASC
WITH FILL
TO toDateTime64({end:String}, 3) STEP toIntervalMillisecond(100)
STALENESS toIntervalMillisecond(200);
┌──────────────────bucket─┬─count─┬─original─┐
│ 2023-03-24 00:24:02.300 │     3 │ original │
│ 2023-03-24 00:24:02.400 │     0 │          │
│ 2023-03-24 00:24:02.600 │     1 │ original │
│ 2023-03-24 00:24:02.700 │     1 │ original │
│ 2023-03-24 00:24:02.800 │     2 │ original │
│ 2023-03-24 00:24:02.900 │     0 │          │
└─────────────────────────┴───────┴──────────┘
We lose the following row from the result set:
│ 2023-03-24 00:24:02.500 │     0 │          │
Exceptions in the HTTP interface
Contributed by Sema Checherinda
The HTTP interface can now reliably detect errors, even after the result has been streamed to the client. In previous versions, if we ran the following query against the ClickHouse Server:
curl http://localhost:8123/?output_format_parallel_formatting=0 -d "SELECT throwIf(number > 100000) FROM system.numbers FORMAT Values"
We'd see a stream of values with this error message appended at the end:
Code: 395. DB::Exception: Value passed to 'throwIf' function is non-zero: while executing 'FUNCTION throwIf(greater(number, 100000) :: 2) -> throwIf(greater(number, 100000)) UInt8 : 1'. (FUNCTION_THROW_IF_VALUE_IS_NON_ZERO) (version 24.3.1.465 (official build))
The exit code is 0, which wrongly suggests that the query ran successfully. From 24.11, we'll instead see the following output:
Code: 395. DB::Exception: Value passed to 'throwIf' function is non-zero: while executing 'FUNCTION throwIf(greater(__table1.number, 100000_UInt32) :: 0) -> throwIf(greater(__table1.number, 100000_UInt32)) UInt8 : 1'. (FUNCTION_THROW_IF_VALUE_IS_NON_ZERO) (version 24.11.1.2557 (official build))
curl: (18) transfer closed with outstanding read data remaining
And we get a non-zero exit code of 18.
Prewarming the mark cache
Contributed by Anton Popov
Marks map primary keys to offsets in every column's file and form part of a table's index. There is one mark file per table column. Marks can take up considerable memory and are selectively loaded into the mark cache.
From 24.11, you can pre-warm this cache using the mark_cache_prewarm_ratio setting, which is set to 95% by default. The server eagerly loads marks into the in-memory cache on every insert, merge, or fetch of data parts until the cache is almost full.
A new system command, SYSTEM PREWARM MARK CACHE <table>, immediately loads all of a table's marks into the cache.
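As a rough sketch of how you might use it, the following prewarms the marks for the images table from earlier and then checks the cache's size (the MarkCacheBytes and MarkCacheFiles metrics come from system.asynchronous_metrics):

-- Eagerly load all of the table's marks into the mark cache,
-- for example right after a server restart
SYSTEM PREWARM MARK CACHE images;

-- Inspect how much memory the mark cache now uses
SELECT metric, value
FROM system.asynchronous_metrics
WHERE metric IN ('MarkCacheBytes', 'MarkCacheFiles');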
BFloat16 data type
Contributed by Alexey Milovidov
The BFloat16 data type was developed at Google Brain for machine learning workloads and is commonly used to represent vector embeddings. As the name suggests, it consists of 16 bits: a sign bit, an 8-bit exponent, and 7 bits for the mantissa/fraction.
It has the same exponent range as Float32 (single precision float), with fewer bits for the mantissa (7 bits instead of 23).
This data type is now available in ClickHouse and will help with AI and vector searches. You'll need to configure the following setting to use the new type:
SET allow_experimental_bfloat16_type=1;
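To get a feel for the reduced precision, here's a minimal sketch that casts the same literal to both types, using the ::BFloat16 cast syntax that also appears in the benchmark below (the exact rounded value may vary by version):

-- With only a 7-bit mantissa, BFloat16 rounds values to roughly two
-- to three significant decimal digits; the exponent range matches Float32
SELECT 3.14159265::Float32 AS f32,
       3.14159265::BFloat16 AS bf16;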
We ran the nearest neighbor search with a full scan over 28 million 384-dimensional vectors on a single machine, AWS c7a.metal-48xl, and saw the following results:
clickhouse-benchmark --query "WITH
[-0.02925783360957671,-0.03488947771094666,...,0.032484047621093616]::Array(BFloat16)
AS center SELECT d FROM (SELECT cosineDistance(vector, center) AS d
FROM hackernews_llama_memory ORDER BY d LIMIT 10
) SETTINGS allow_experimental_bfloat16_type = 1"
BFloat16: 0.061 sec, 301 GB/sec.
Float32: 0.146 sec, 276 GB/sec.