CapnProto
Input | Output | Alias |
---|---|---|
✔ | ✔ | |
Description
The CapnProto format is a binary message format similar to Protocol Buffers and Thrift, and unlike JSON or MessagePack.
CapnProto messages are strictly typed and not self-describing, meaning they need an external schema description. The schema is applied on the fly and cached for each query.
See also Format Schema.
Data Types Matching
The table below shows supported data types and how they match ClickHouse data types in INSERT and SELECT queries.
CapnProto data type (INSERT) | ClickHouse data type | CapnProto data type (SELECT) |
---|---|---|
UINT8, BOOL | UInt8 | UINT8 |
INT8 | Int8 | INT8 |
UINT16 | UInt16, Date | UINT16 |
INT16 | Int16 | INT16 |
UINT32 | UInt32, DateTime | UINT32 |
INT32 | Int32, Decimal32 | INT32 |
UINT64 | UInt64 | UINT64 |
INT64 | Int64, DateTime64, Decimal64 | INT64 |
FLOAT32 | Float32 | FLOAT32 |
FLOAT64 | Float64 | FLOAT64 |
TEXT, DATA | String, FixedString | TEXT, DATA |
union(T, Void), union(Void, T) | Nullable(T) | union(T, Void), union(Void, T) |
ENUM | Enum(8/16) | ENUM |
LIST | Array | LIST |
STRUCT | Tuple | STRUCT |
UINT32 | IPv4 | UINT32 |
DATA | IPv6 | DATA |
DATA | Int128/UInt128/Int256/UInt256 | DATA |
DATA | Decimal128/Decimal256 | DATA |
STRUCT(entries LIST(STRUCT(key Key, value Value))) | Map | STRUCT(entries LIST(STRUCT(key Key, value Value))) |
- Integer types can be converted into each other during input/output.
- To work with Enum in the CapnProto format, use the format_capn_proto_enum_comparising_mode setting.
- Arrays can be nested and can have a value of the Nullable type as an argument. Tuple and Map types can also be nested.
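The Nullable, Map, and Enum rows of the table can be written out as a concrete schema. The sketch below writes a hypothetical file, types_example.capnp, from the shell. The file name, the file ID, and every struct and field name (NullableUInt64, StringToUInt32Map, Color, id, counts, color) are assumptions made for illustration; only the shapes follow the table: a union of a value and Void for Nullable, and a struct with an entries list of key/value structs for Map.

```shell
#!/usr/bin/env bash
# Write a hypothetical schema illustrating the Nullable, Map, and Enum mappings.
# All struct names, field names, and the file ID are illustrative assumptions.
cat > types_example.capnp <<'EOF'
@0xb6ecde1cd54a101d;

# Nullable(UInt64): a union of the value and Void.
struct NullableUInt64 {
  union {
    value @0 :UInt64;
    null @1 :Void;
  }
}

# Map(String, UInt32): a struct holding a list of key/value entries.
struct StringToUInt32Map {
  struct Entry {
    key @0 :Text;
    value @1 :UInt32;
  }
  entries @0 :List(Entry);
}

# Enum8('red' = 0, 'green' = 1)
enum Color {
  red @0;
  green @1;
}

struct Message {
  id @0 :NullableUInt64;
  counts @1 :StringToUInt32Map;
  color @2 :Color;
}
EOF
```

A table matching this sketch might declare its columns as id Nullable(UInt64), counts Map(String, UInt32), color Enum8('red' = 0, 'green' = 1). How Color is compared with the ClickHouse Enum is governed by the format_capn_proto_enum_comparising_mode setting.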
Example Usage
Inserting and Selecting Data
You can insert CapnProto data from a file into a ClickHouse table with the following command:
$ cat capnproto_messages.bin | clickhouse-client --query "INSERT INTO test.hits SETTINGS format_schema = 'schema:Message' FORMAT CapnProto"
where schema.capnp looks like this:
struct Message {
  SearchPhrase @0 :Text;
  c @1 :UInt64;
}
You can select data from a ClickHouse table and save it to a file in the CapnProto format using the following command:
$ clickhouse-client --query "SELECT * FROM test.hits FORMAT CapnProto SETTINGS format_schema = 'schema:Message'" > capnproto_messages.bin
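The export and import commands can be wrapped into a small round-trip helper. The following is a minimal sketch, not part of ClickHouse: the function names export_capnp and import_capnp and the variable FORMAT_SCHEMA are hypothetical, and it assumes clickhouse-client is on PATH and that schema.capnp is located where ClickHouse looks for format schemas.

```shell
#!/usr/bin/env bash
# Hypothetical helpers wrapping the SELECT and INSERT commands for CapnProto.
# Assumptions: clickhouse-client is on PATH, and schema.capnp (defining
# struct Message) is located where ClickHouse resolves format schemas.

# format_schema has the form "<schema file>:<root struct name>".
FORMAT_SCHEMA='schema:Message'

# export_capnp TABLE FILE: dump TABLE into FILE in CapnProto format.
export_capnp() {
  local table=$1 file=$2
  clickhouse-client --query \
    "SELECT * FROM ${table} FORMAT CapnProto SETTINGS format_schema = '${FORMAT_SCHEMA}'" \
    > "${file}"
}

# import_capnp FILE TABLE: load CapnProto messages from FILE into TABLE.
import_capnp() {
  local file=$1 table=$2
  clickhouse-client --query \
    "INSERT INTO ${table} SETTINGS format_schema = '${FORMAT_SCHEMA}' FORMAT CapnProto" \
    < "${file}"
}
```

With these defined, a round trip could be run as export_capnp test.hits hits.bin && import_capnp hits.bin test.hits_copy, where test.hits_copy is a made-up table with the same structure as test.hits.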
Using autogenerated schema
If you don't have an external CapnProto schema for your data, you can still output/input data in CapnProto format using an autogenerated schema.
For example:
SELECT * FROM test.hits
FORMAT CapnProto
SETTINGS format_capn_proto_use_autogenerated_schema=1
In this case, ClickHouse autogenerates a CapnProto schema from the table structure using the structureToCapnProtoSchema function and uses this schema to serialize data in the CapnProto format.
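To see what kind of schema is generated, structureToCapnProtoSchema can be called directly. The sketch below assumes clickhouse-local is installed and uses a made-up structure string; the variable names are hypothetical, and the exact schema text printed depends on the ClickHouse version. If clickhouse-local is not found, the script only prints the query it would run.

```shell
#!/usr/bin/env bash
# Preview the CapnProto schema ClickHouse would generate for a structure.
# STRUCTURE is a made-up example; clickhouse-local is assumed to be installed.
STRUCTURE='column1 String, column2 UInt32, column3 Array(String)'

# RawBLOB prints the returned schema string without escaping.
QUERY="SELECT structureToCapnProtoSchema('${STRUCTURE}') FORMAT RawBLOB"

if command -v clickhouse-local >/dev/null 2>&1; then
  clickhouse-local --query "${QUERY}"
else
  echo "clickhouse-local not found; the query to run is:" >&2
  echo "${QUERY}"
fi
```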
You can also read a CapnProto file with an autogenerated schema (in this case, the file must have been created using the same schema):
$ cat hits.bin | clickhouse-client --query "INSERT INTO test.hits SETTINGS format_capn_proto_use_autogenerated_schema=1 FORMAT CapnProto"
Format Settings
The setting format_capn_proto_use_autogenerated_schema is enabled by default and is applicable if format_schema is not set.
You can also save the autogenerated schema to a file during input/output using the output_format_schema setting.
For example:
SELECT * FROM test.hits
FORMAT CapnProto
SETTINGS
format_capn_proto_use_autogenerated_schema=1,
output_format_schema='path/to/schema/schema.capnp'
In this case, the autogenerated CapnProto schema will be saved in the file path/to/schema/schema.capnp.