
Reference

Supported data sources

| Name | Type | Status | Description |
|------|------|--------|-------------|
| Apache Kafka | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from Apache Kafka into ClickHouse Cloud. |
| Confluent Cloud | Streaming | Stable | Unlock the combined power of Confluent and ClickHouse Cloud through our direct integration. |
| Redpanda | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from Redpanda into ClickHouse Cloud. |
| AWS MSK | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from AWS MSK into ClickHouse Cloud. |
| Azure Event Hubs | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from Azure Event Hubs into ClickHouse Cloud. |
| WarpStream | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from WarpStream into ClickHouse Cloud. |

Supported data formats

The supported formats are:

Supported data types

Standard

The following standard ClickHouse data types are currently supported in ClickPipes:

  • Base numeric types - [U]Int8/16/32/64, Float32/64, and BFloat16
  • Large integer types - [U]Int128/256
  • Decimal Types
  • Boolean
  • String
  • FixedString
  • Date, Date32
  • DateTime, DateTime64 (UTC timezones only)
  • Enum8/Enum16
  • UUID
  • IPv4
  • IPv6
  • Time, Time64
  • JSON
  • all ClickHouse LowCardinality types
  • Map with keys and values using any of the above types (including Nullables)
  • Tuple and Array with elements using any of the above types (including Nullables, one level depth only)
  • SimpleAggregateFunction types (for AggregatingMergeTree or SummingMergeTree destinations)

Variant type support

ClickPipes supports the Variant type in the following circumstances:

  • Avro Unions. If your Avro schema contains a union with multiple non-null types, ClickPipes will infer the appropriate variant type. Variant types are not otherwise supported for Avro data.
  • JSON fields. You can manually specify a Variant type (such as Variant(String, Int64, DateTime)) for any JSON field in the source data stream. Complex subtypes (arrays/maps/tuples) are not supported. In addition, because of the way ClickPipes determines the correct variant subtype to use, only one integer or datetime type can be used in the Variant definition - for example, Variant(Int64, UInt32) is not supported.
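
The Variant restrictions for JSON fields can be sketched as a small pre-flight check. This is only an illustration of the rules stated above, not ClickPipes code: the function names `split_top_level` and `check_variant` are hypothetical, and the sketch assumes that "datetime type" means `DateTime` and `DateTime64`.

```python
import re

# Matches the standard ClickHouse integer type names, e.g. Int64 or UInt32.
INTEGER = re.compile(r"^U?Int(8|16|32|64|128|256)$")


def split_top_level(args):
    """Split a comma-separated type list, ignoring commas inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in args:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def check_variant(definition):
    """Return a list of problems; an empty list means no rule above is violated."""
    if not (definition.startswith("Variant(") and definition.endswith(")")):
        return ["not a Variant definition"]
    subtypes = split_top_level(definition[len("Variant("):-1])
    problems = []
    if any(t.startswith(("Array(", "Map(", "Tuple(")) for t in subtypes):
        problems.append("complex subtypes are not supported")
    if sum(1 for t in subtypes if INTEGER.match(t)) > 1:
        problems.append("only one integer type is allowed")
    # Assumption: "datetime type" covers DateTime and DateTime64 only.
    if sum(1 for t in subtypes if t.startswith("DateTime")) > 1:
        problems.append("only one datetime type is allowed")
    return problems


print(check_variant("Variant(String, Int64, DateTime)"))
print(check_variant("Variant(Int64, UInt32)"))
```

The first call corresponds to the supported example from the text, the second to the unsupported one.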

JSON type support

ClickPipes supports the JSON type in the following circumstances:

  • Avro Record and Protobuf Message fields can always be assigned to a JSON column.
  • Avro String and Bytes fields can be assigned to a JSON column if the Avro field actually contains JSON String objects.
  • Protobuf string and bytes Kinds can be assigned to a JSON column if the Protobuf field actually contains JSON String objects.
  • JSON fields that are always a JSON object can be assigned to a JSON destination column.

Note that you will have to manually change the destination column to the desired JSON type, including any fixed or skipped paths.

Avro

Supported Avro Data Types

ClickPipes supports all Avro Primitive and Complex types, and all Avro Logical types except local-timestamp-millis and local-timestamp-micros. Avro record types are converted to Tuple, array types to Array, and map types to Map (string keys only). In general, the conversions listed here are available. We recommend using exact type matching for Avro numeric types, as ClickPipes does not check for overflow or precision loss on type conversion. Alternatively, all Avro types can be inserted into a String column, in which case they are represented as a valid JSON string.

Nullable types and Avro unions

Nullable types in Avro are defined by using a Union schema of (T, null) or (null, T) where T is the base Avro type. During schema inference, such unions will be mapped to a ClickHouse "Nullable" column. Note that ClickHouse does not support Nullable(Array), Nullable(Map), or Nullable(Tuple) types. Avro null unions for these types will be mapped to non-nullable versions (Avro Record types are mapped to a ClickHouse named Tuple). Avro "nulls" for these types will be inserted as:

  • An empty Array for a null Avro array
  • An empty Map for a null Avro Map
  • A named Tuple with all default/zero values for a null Avro Record
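
As a rough illustration of the substitution rules above, the following sketch computes the value that would stand in for an Avro null, given a schema expressed as plain Python data. It is not ClickPipes code: `default_value` and the `ZERO_VALUES` table are hypothetical, and the zero values chosen for primitive types are an assumption based on the usual ClickHouse type defaults.

```python
# Assumed zero values for non-nullable Avro primitives (illustrative only).
ZERO_VALUES = {
    "string": "", "bytes": b"", "int": 0, "long": 0,
    "float": 0.0, "double": 0.0, "boolean": False,
}


def is_nullable_union(schema):
    """True for a two-branch union of the form (T, null) or (null, T)."""
    return isinstance(schema, list) and len(schema) == 2 and "null" in schema


def kind_of(schema):
    """Return the Avro type name for a primitive name or a complex schema dict."""
    return schema["type"] if isinstance(schema, dict) else schema


def default_value(schema):
    """Value stored when the Avro field is null (or absent inside a null record)."""
    if is_nullable_union(schema):
        inner = next(branch for branch in schema if branch != "null")
        if kind_of(inner) in ("array", "map", "record"):
            # No Nullable(Array/Map/Tuple): fall back to the empty/zero value.
            return default_value(inner)
        return None  # Nullable(T) column stores NULL
    kind = kind_of(schema)
    if kind == "array":
        return []
    if kind == "map":
        return {}
    if kind == "record":
        return {f["name"]: default_value(f["type"]) for f in schema["fields"]}
    return ZERO_VALUES[kind]


address = {
    "type": "record",
    "name": "Address",
    "fields": [
        {"name": "city", "type": "string"},
        {"name": "zip", "type": ["null", "int"]},
    ],
}
print(default_value(["null", {"type": "array", "items": "long"}]))
print(default_value(["null", address]))
```

A practical consequence is that a null array and an empty array become indistinguishable in the destination table, so a separate flag field is needed if that distinction matters.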

Protobuf

Supported Protobuf Data Types

ClickPipes supports all Protobuf version 2 and 3 types (except the long-deprecated proto 2 group type). Basic conversions are identical to those used for the ClickHouse Protobuf format listed here. We recommend exact type matching for Protobuf numeric types, as type conversion can result in overflows or precision loss. Protobuf maps, arrays, and Nullable variations of basic types are also supported. ClickPipes also recognizes a limited set of Google "well known types": Timestamp, Duration, and "wrapper" messages. Timestamps can be accurately mapped to DateTime or DateTime64 types, Durations to Time or Time64 types, and wrapper messages to the underlying type. All Protobuf types can also be mapped to a ClickHouse String column, in which case they are represented as a JSON string.

Protobuf One-Ofs

During schema inference, Protobuf "One Of" special fields are normally mapped to a named Tuple, where only one of the fields has a "non-default" value. Some "One Ofs" may instead be automatically mapped to a named Variant field that carries the name of the "One Of" and holds a value of one of the valid types of the constituent fields. Alternatively, each "One Of" constituent field can be manually mapped to a ClickHouse column, where only one of the constituent fields will ever be populated during processing.

Message Lists (Envelopes)

If the top level Protobuf schema defined for the ClickPipe contains a single repeated field that is itself a protobuf Message, schema inference and column mapping will be based on the "contained" Message field. The Kafka message will be processed as a list of such messages, and a single Kafka message will generate multiple ClickHouse rows.
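
The envelope behavior can be pictured with plain Python dictionaries standing in for decoded Protobuf messages. This is a conceptual sketch only: the `OrderBatch` envelope, its `orders` field, and the `rows_from_envelope` helper are hypothetical names, and real decoding would go through a Protobuf library rather than dicts.

```python
# Hypothetical schema, shown for orientation only:
#
#   message Order      { int64 id = 1; string sku = 2; }
#   message OrderBatch { repeated Order orders = 1; }   // the "envelope"


def rows_from_envelope(envelope, repeated_field):
    """Expand one decoded envelope into one row per contained message."""
    return [dict(item) for item in envelope[repeated_field]]


# One Kafka message carrying an OrderBatch with two Orders.
kafka_message_value = {
    "orders": [
        {"id": 1, "sku": "A-100"},
        {"id": 2, "sku": "B-200"},
    ]
}

rows = rows_from_envelope(kafka_message_value, "orders")
print(len(rows))
for row in rows:
    print(row)
```

The columns of each row come from the contained message (`id`, `sku` here), not from the envelope, which matches the description of schema inference above.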

Kafka virtual columns

The following virtual columns are supported for Kafka-compatible streaming data sources. When creating a new destination table, virtual columns can be added by using the Add Column button.

| Name | Description | Recommended Data Type |
|------|-------------|-----------------------|
| _key | Kafka Message Key | String |
| _timestamp | Kafka Timestamp (Millisecond precision) | DateTime64(3) |
| _partition | Kafka Partition | Int32 |
| _offset | Kafka Offset | Int64 |
| _topic | Kafka Topic | String |
| _header_keys | Parallel array of keys in the record Headers | Array(String) |
| _header_values | Parallel array of values in the record Headers | Array(String) |
| _raw_message | Full Kafka Message | String |

Note that the _raw_message column is only recommended for JSON data. For use cases where only the JSON string is required (such as using ClickHouse JSONExtract* functions to populate a downstream materialized view), it may improve ClickPipes performance to delete all the "non-virtual" columns.
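
The two header columns are "parallel" in the sense that the element at position i of _header_keys belongs with the element at position i of _header_values. A minimal sketch of how a client might recombine them after reading a row is shown below; the helper names and the sample values are hypothetical, and the row is assumed to have already been fetched into Python lists and strings.

```python
import json


def headers_as_pairs(header_keys, header_values):
    """Recombine the parallel arrays into (key, value) pairs, preserving order."""
    if len(header_keys) != len(header_values):
        raise ValueError("header arrays must have the same length")
    return list(zip(header_keys, header_values))


def field_from_raw_message(raw_message, field):
    """Parse a JSON _raw_message string and return one top-level field."""
    return json.loads(raw_message)[field]


# Sample values as they might appear in one destination row.
header_keys = ["trace-id", "source", "source"]
header_values = ["abc123", "web", "mobile"]
raw_message = '{"user_id": 42, "event": "click"}'

print(headers_as_pairs(header_keys, header_values))
print(field_from_raw_message(raw_message, "event"))
```

Pairs are used instead of a dictionary because Kafka allows the same header key to appear more than once in a record, and a dictionary would silently keep only the last value.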