Reading data from S3
Using the ClickHouse s3 table function, users can query S3 data in place without persisting it in ClickHouse. The following example reads 10 rows from the NYC Taxi dataset.
SELECT
    trip_id,
    total_amount,
    pickup_longitude,
    pickup_latitude,
    dropoff_longitude,
    dropoff_latitude,
    pickup_datetime,
    dropoff_datetime,
    trip_distance
FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/nyc-taxi/trips_*.gz', 'TabSeparatedWithNames')
LIMIT 10
SETTINGS input_format_try_infer_datetimes = 0;
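The s3 function also infers column names and types from the files. If you want to inspect that inferred schema before querying, DESCRIBE can be run directly against the table function; a minimal sketch using the same public NYC Taxi files as above:
DESCRIBE TABLE s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/nyc-taxi/trips_*.gz', 'TabSeparatedWithNames');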
Inserting data from S3
To transfer data from S3 into ClickHouse, users can combine the s3 table function with an INSERT statement. Let's first create an empty hackernews table:
CREATE TABLE hackernews
ENGINE = MergeTree
ORDER BY tuple()
EMPTY AS SELECT * FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/hackernews/hacknernews.csv.gz', 'CSVWithNames');
This creates an empty table using the schema inferred from the data; the columns and types that inference produced can be checked as shown below.
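As a quick sanity check (an optional step, not part of the original walkthrough), SHOW CREATE TABLE prints the full table definition, including the inferred columns:
SHOW CREATE TABLE hackernews;
We can then insert the first 1 million rows from the remote dataset: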
INSERT INTO hackernews SELECT *
FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/hackernews/hacknernews.csv.gz', 'CSVWithNames')
LIMIT 1000000;
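To confirm the load, a simple count should return 1,000,000, assuming the source file contains at least that many rows (a hedged verification query, not part of the original example):
SELECT count() FROM hackernews;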