Articles by tag: deduplication
ReplacingMergeTree in ClickHouse: Complete Guide
Learn how ReplacingMergeTree removes duplicates and how it works with versions and FINAL. Includes examples, pitfalls, and a comparison with CollapsingMergeTree for advanced deduplication.
Backup failure due to GIF duplicates in Discourse
How one popular GIF caused a backup failure in Discourse due to ext4 limitations. Problem analysis and solution for developers.
ClickHouse: deduplication and losses in MV
Analysis of data loss in ClickHouse caused by block deduplication in materialized views. Covers the settings insert_deduplicate=0 and deduplicate_blocks_in_dependent_materialized_views=1. Read the details to configure lossless storage.
maxpack: deduplication of versioned data
maxpack provides up to 50x compression on versioned projects such as Node.js and CPython. Includes benchmarks and comparisons with tar+zstd and 7z. Aimed at IT specialists who want to test it on their own datasets.