Welcome to the April ClickHouse newsletter, where we round up what’s been happening in real-time data warehouses over the last month.
This month, we have the 24.3 release, a rate limiter built with ClickHouse, a MySQL-to-ClickHouse migration story, meetup videos, and more!
Inside this issue
- Featured community member
- Upcoming events
- 24.3 release
- Storing Continuous Profiling Data in ClickHouse
- Migrating to ClickHouse: Releem's Journey
- How we Built a 19 PiB Logging Platform with ClickHouse and Saved Millions
- Building a Rate Limiter with ClickHouse
- Video corner
- ClickHouse Cloud Updates
- Post of the month
Featured community member
This month's featured community member is Shivji Kumar Jha, a Staff Engineer for Data Platforms at Nutanix.
Shiv leads a five-member team, managing and supporting Nutanix's data platform, which acts as a service for messaging, streaming, event sourcing, analytics, and time series databases. Shiv actively engages with the communities of the technologies used at Nutanix, including ClickHouse.
We recently hosted a ClickHouse meetup in Nutanix’s office in Bangalore, India. Shiv was invaluable in making this event happen, helping organize it, and acting as an MC for the evening. He recorded all the talks and uploaded them to YouTube afterward. Shiv also participated in a follow-up Q&A session on 15th April to address unanswered questions from the meetup.
Thanks for all your work, Shiv, and we’ll see you at the next meetup!
Upcoming events
- Copenhagen Meetup - April 23rd
- FREE ClickHouse Training - April 24th & 25th
- AWS Summit London - April 24th
- v24.4 ClickHouse Community Call - April 30th
- Bengaluru Meetup - May 4th
- AWS Summit Berlin - May 15th
- Stockholm Meetup - May 22nd
- Dubai Meetup - May 28th
24.3 release
The big feature in the 24.3 release is that the analyzer is now enabled by default. The analyzer is a new query analysis and optimization infrastructure that’s been in the works for a couple of years. It lets you use multiple ARRAY JOIN clauses in a query, treats tuple elements like columns, handles queries with nested CTEs and sub-queries, and more.
Storing Continuous Profiling Data in ClickHouse
Coroot is an open-source tool for observability that turns observability data into actionable insights. Nikolay Sivko wrote a blog post in which he describes how they built their own storage system for profiling data based on ClickHouse. After defining continuous profiling, Nikolay takes us through the data model and gives examples of queries that check on the performance of a service.
Migrating to ClickHouse: Releem's Journey
Releem is a MySQL performance tuning tool that automatically detects performance degradation and optimizes configuration files. To do this, they collect metrics from hundreds of database servers across various operating systems and cloud solutions.
They used to store these metrics in MySQL, which started to struggle once it reached almost 5 billion records. Enter ClickHouse, which helped shrink the database size by a factor of 20, cut aggregation query times from 45 minutes to 2 minutes, and reduced the page load time of the Releem dashboard by 25%.
How we Built a 19 PiB Logging Platform with ClickHouse and Saved Millions
Rory Crispin, SRE at ClickHouse, shared his experience building a platform for the logging data generated by ClickHouse Cloud. Rory takes us through key design decisions, including whether to use Kafka and whether to use structured or unstructured logging. He also explains why the team decided to use OpenTelemetry to collect metrics and compares the cost of the in-house solution with that of an off-the-shelf product like Datadog.
Building a Rate Limiter with ClickHouse
If you were going to build a rate limiter, the obvious choice for storing the data would be Redis. But Brad Lhotsky, Systems and Security Administrator at Craigslist, was curious whether ClickHouse would be fit for purpose and used it to build a proof of concept. Brad shared the slides of a talk explaining how he imported data from Kafka, built a bridge from the ACL API to ClickHouse, and tested high availability, all in just one week.
Video corner
- At the New York City meetup, Adam Azzam presented how Prefect uses ClickHouse to enable real-time event-driven orchestration.
- Mark Needham walked us through some of the most common aggregate function combinators and showed how and why we might use them.
- At KubeCon Europe 2024, Manish Gill discussed the challenges of auto-scaling databases in Kubernetes, using ClickHouse Cloud as a case study.
ClickHouse Cloud Updates
- Over the last 9 months, we’ve been rebuilding the UI for ClickHouse Cloud, and last week we started rolling it out to everybody.
- Today, ClickPipes introduces beta support for continuous data ingestion from S3 and GCS. Let us know if you’re interested in giving this a try by replying to this email!
- Tokyo (ap-northeast-1) has been added as a new region for AWS. Sign up now.
Post of the month
Our favorite post this month was by Divyendu Singh about real-time monitoring.