clickhouse-io
Master ClickHouse Analytics and Query Optimization
Build high-performance analytical systems with the ClickHouse column-oriented database. Learn proven patterns for query optimization, materialized views, and real-time data pipelines.
Download the skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Toggle on and start using
Test it
Using "clickhouse-io". Create a table for market analytics with date, market_id, volume, and trades
Expected outcome:
Creates a MergeTree table with monthly partitioning, proper ordering by date and market_id, and appropriate data types (Date, String, UInt64, UInt32) for optimal compression and query performance.
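A schema along these lines would match that outcome (the table and column names come from the prompt above; the exact types shown are one reasonable choice, not the skill's guaranteed output):

```sql
-- Hypothetical output for the prompt above
CREATE TABLE market_analytics
(
    date      Date,      -- 2-byte date, compresses extremely well
    market_id String,
    volume    UInt64,
    trades    UInt32
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(date)    -- monthly partitions
ORDER BY (date, market_id);    -- primary key matches common filters
```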
Using "clickhouse-io". Optimize a query filtering by volume then date on a large table
Expected outcome:
Reorders WHERE clause to filter by indexed columns first (date, market_id), suggests using quantile() for percentile calculations, and recommends adding appropriate projections for common filter patterns.
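As an illustration of that rewrite (table and column names are carried over from the example above, and are assumptions), the key change is making sure the primary-key columns appear in the WHERE clause so ClickHouse can skip granules via its sparse index:

```sql
-- Before: only a non-indexed predicate, so most granules are read
SELECT quantile(0.95)(volume)
FROM market_analytics
WHERE volume > 1000000;

-- After: primary-key predicates (date, market_id) let the sparse
-- index prune data parts before the volume filter is applied
SELECT quantile(0.95)(volume)
FROM market_analytics
WHERE date >= '2024-01-01'
  AND market_id = 'BTC-USD'
  AND volume > 1000000;
```

Note that `quantile()` is an approximate aggregate; use `quantileExact()` only when the extra memory and CPU cost is acceptable.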
Using "clickhouse-io". Set up real-time aggregation for hourly metrics
Expected outcome:
Creates an AggregatingMergeTree target table with AggregateFunction columns, defines a materialized view with sumState/countState/uniqState functions, and provides the query pattern using sumMerge/countMerge/uniqMerge.
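A minimal sketch of that pattern (the `events` source table and its columns are assumed, not defined in this page):

```sql
-- Target table: stores aggregate *states*, not raw values
CREATE TABLE hourly_metrics
(
    hour         DateTime,
    market_id    String,
    total_volume AggregateFunction(sum, UInt64),
    trade_count  AggregateFunction(count),
    unique_users AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree
ORDER BY (hour, market_id);

-- Materialized view: converts each inserted batch into partial states
CREATE MATERIALIZED VIEW hourly_metrics_mv TO hourly_metrics AS
SELECT
    toStartOfHour(event_time) AS hour,
    market_id,
    sumState(volume)          AS total_volume,
    countState()              AS trade_count,
    uniqState(user_id)        AS unique_users
FROM events
GROUP BY hour, market_id;

-- Query pattern: -Merge combinators finalize the partial states
SELECT
    hour,
    sumMerge(total_volume)  AS volume,
    countMerge(trade_count) AS trades,
    uniqMerge(unique_users) AS users
FROM hourly_metrics
GROUP BY hour
ORDER BY hour;
```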
Security Audit
Safe. This skill contains documentation and code examples for ClickHouse database usage. The static analyzer flagged 86 patterns, all false positives: backticks in Markdown denote SQL code blocks (not shell execution), environment-variable references are configuration examples, and system-table queries are legitimate ClickHouse monitoring features. No executable code or security risks are present.
What You Can Build
Data Engineer Building Analytics Platform
Design scalable table schemas and implement efficient data ingestion pipelines for high-volume event tracking and user analytics.
Backend Developer Optimizing Queries
Learn ClickHouse-specific query patterns to reduce latency on large datasets and implement proper indexing strategies.
Analyst Creating Real-time Dashboards
Use materialized views and pre-aggregation patterns to power sub-second dashboard queries on billions of rows.
Try These Prompts
Create a ClickHouse table schema for storing user activity events with columns for user_id, event_type, timestamp, and properties. Use the appropriate engine for deduplication and partition by month.
Review this ClickHouse query that's running slowly on 100M+ rows. Suggest optimizations for the WHERE clause, indexes, and aggregation functions: [paste query]
Create a materialized view that pre-aggregates daily active users and total events per hour from an events table. Include the target table schema and the MV definition.
Design an ETL pipeline to sync data from PostgreSQL to ClickHouse hourly. Include extraction, transformation logic, and batch insert patterns with error handling.
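For the first prompt above, the deduplicating engine would typically be ReplacingMergeTree. A hypothetical schema (names and the version column are illustrative assumptions):

```sql
-- ReplacingMergeTree keeps one row per ORDER BY key; the row with
-- the highest `version` wins during background merges. Note that
-- deduplication is eventual, not guaranteed at query time.
CREATE TABLE user_events
(
    user_id    UInt64,
    event_type LowCardinality(String),
    timestamp  DateTime,
    properties String,    -- JSON payload stored as a string
    version    UInt64     -- newest version wins on merge
)
ENGINE = ReplacingMergeTree(version)
PARTITION BY toYYYYMM(timestamp)
ORDER BY (user_id, event_type, timestamp);
```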
Best Practices
- Partition tables by time (month or day) but avoid excessive partitions that impact performance
- Order primary key columns by the most frequently filtered columns, placing lower-cardinality columns first - ascending cardinality improves both compression and sparse-index efficiency
- Use batch inserts instead of individual row inserts for efficient data ingestion
- Leverage materialized views for pre-aggregated metrics to achieve sub-second query latency
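The batching point above can be illustrated with a single multi-row INSERT (table and values are hypothetical):

```sql
-- One batched INSERT creates one data part on disk; thousands of
-- single-row INSERTs create thousands of parts and overwhelm merges.
-- Aim for batches of tens of thousands of rows or more.
INSERT INTO market_analytics (date, market_id, volume, trades) VALUES
    ('2024-01-01', 'BTC-USD', 1200000, 340),
    ('2024-01-01', 'ETH-USD',  800000, 210),
    ('2024-01-01', 'SOL-USD',  150000,  95);
```

When the client cannot batch, ClickHouse's `async_insert` setting can buffer small inserts server-side instead.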
Avoid
- Using SELECT * instead of specifying required columns - increases I/O and memory usage
- Performing small frequent inserts instead of batching - causes excessive part creation
- Relying on FINAL modifier in queries - forces expensive data merging at query time
- Creating too many JOINs in analytical queries - denormalize data for better performance
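As one alternative to the FINAL modifier mentioned above, assuming a hypothetical ReplacingMergeTree table `user_state(user_id, status, updated_at)`, the latest row per key can be selected with `argMax`, which avoids the on-the-fly merge that FINAL forces:

```sql
-- Instead of: SELECT * FROM user_state FINAL WHERE user_id = 42;
SELECT
    user_id,
    argMax(status, updated_at) AS status   -- value from the newest row
FROM user_state
WHERE user_id = 42
GROUP BY user_id;
```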