Articles by tag: clickhouse
ClickHouse skipping indexes: bloom, set, minmax
How ClickHouse skipping indexes speed up queries on columns outside ORDER BY. Breakdown of minmax, set, bloom_filter, ngrambf_v1, tokenbf_v1 with examples from gaming and EXPLAIN.
ClickHouse Materialized Views: triggers on INSERT
How materialized views work in ClickHouse: incremental aggregation, minute→hour→day chains, the Null+Kafka pattern, POPULATE and its dangers. Examples with SummingMergeTree and AggregatingMergeTree.
Special ClickHouse Engines: When MergeTree Is Not Needed
An overview of the ClickHouse Memory, Buffer, Null, Log, URL, S3, and PostgreSQL engines. Examples: a coefficient cache, buffering inserts from Kafka, and live data from external databases.
Dictionaries in ClickHouse: fast lookup without JOIN
How to use ClickHouse dictionaries to replace JOIN with microsecond in-memory lookups. Layouts (flat, hashed, range), data sources, dictGet, and examples from gambling.
TTL in ClickHouse: data lifecycle management
How TTL in ClickHouse automatically deletes, moves to HDD/S3, aggregates, and anonymizes data. Examples for GDPR, tiered storage, and grouping old records.
ORDER BY and PRIMARY KEY in ClickHouse: index selection
How to choose ORDER BY correctly in ClickHouse: the sparse index, column cardinality, equality vs range filters, and verification via EXPLAIN. Rules and examples from gambling.
Partitioning in ClickHouse: Strategies and Operations
How partitioning in ClickHouse accelerates DROP and data management. Choosing partition size, system.parts, DETACH/ATTACH, FREEZE, MOVE to SSD/HDD, and a script for deleting old data.
CollapsingMergeTree in ClickHouse: update without UPDATE
How CollapsingMergeTree and VersionedCollapsingMergeTree replace UPDATE in ClickHouse: the sign column, collapsing pairs, SUM(amount*sign), and the ordering problem with its solution via versions.
SummingMergeTree and AggregatingMergeTree in ClickHouse
Incremental aggregation in ClickHouse: how SummingMergeTree and AggregatingMergeTree can speed up dashboards 100-fold. Examples, materialized views, pitfalls, and a comparison.
ReplacingMergeTree in ClickHouse: Complete Guide
Learn how ReplacingMergeTree removes duplicates, works with versions and FINAL. Examples, pitfalls, and comparison with CollapsingMergeTree for advanced deduplication.
ClickHouse: why columnar DBMSs speed up analytics 100-fold
ClickHouse vs PostgreSQL and MySQL, explained with real benchmarks and a betting schema. Architecture, query examples, use cases. A 5-minute read.
ClickHouse Materialized Views: Nuances and Solutions
How materialized views work in ClickHouse: critical differences from classical DBMSs, UPDATE/DELETE limitations, and best practices. Learn how to avoid design errors.
ClickHouse: deduplication and data loss in MV
An analysis of data loss in ClickHouse caused by block deduplication in materialized views. The settings insert_deduplicate=0 and deduplicate_blocks_in_dependent_materialized_views=1. Configure lossless storage: read the details.
Kafka Engine in ClickHouse: atomicity without data loss
Setting up the Kafka Engine in ClickHouse for reliable inserts from streams. A demonstration of offset commits and of avoiding data loss on failures. A guide for mid-level and senior developers.
ClickHouse with Airflow instead of PostgreSQL for Big Data
Learn why Airflow + ClickHouse is displacing PostgreSQL in analytics. Performance comparison, examples for data engineers. Switch to columnar DB for ETL pipelines.
CTE in ClickHouse: macro instead of optimization
Why WITH in ClickHouse executes multiple times, and how to replace it with temporary tables. Code examples, EXPLAIN, and comparisons for developers. Speed up queries without pitfalls.
Diagnosing 80% CPU usage in ClickHouse
Tools for finding problematic queries in ClickHouse: system.processes, query_log, EXPLAIN. Diagnostic steps, SQL examples, and a checklist. Optimize load without downtime.