
cc-skill-clickhouse-io


Master ClickHouse Analytics and Query Optimization

Struggling with slow analytical queries on large datasets? Learn ClickHouse-specific patterns for column-oriented storage, query optimization, and real-time aggregations to achieve high-performance analytics.

Supports: Claude Code (CC)
Quality score: 70 (Adequate)
1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle on and start using

Test it

Using "cc-skill-clickhouse-io". Create a table for hourly market statistics

Expected outcome:

```sql
CREATE TABLE market_stats_hourly
(
    hour         DateTime,
    market_id    String,
    total_volume AggregateFunction(sum, UInt64),
    total_trades AggregateFunction(count, UInt32),
    unique_users AggregateFunction(uniq, String)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (hour, market_id)
```

Using "cc-skill-clickhouse-io". Query daily active users for the last 30 days

Expected outcome:

```sql
SELECT
    toDate(timestamp) AS date,
    uniq(user_id)     AS daily_active_users
FROM events
WHERE timestamp >= today() - INTERVAL 30 DAY
GROUP BY date
ORDER BY date
```

Using "cc-skill-clickhouse-io". Calculate trade size percentiles (median, p95, p99)

Expected outcome:

```sql
SELECT
    quantile(0.50)(trade_size) AS median,
    quantile(0.95)(trade_size) AS p95,
    quantile(0.99)(trade_size) AS p99
FROM trades
WHERE created_at >= now() - INTERVAL 1 HOUR
```

Security Audit

Safe
v1 • 2/25/2026

This skill is documentation-only, containing SQL queries and TypeScript code examples for ClickHouse database operations. All 87 static-analyzer findings are false positives: backticks are markdown code fences, not shell execution; SQL aggregation functions (uniq, sum, countMerge) are not cryptographic algorithms; and system-table queries are legitimate ClickHouse monitoring features. No executable code or security risks were detected.

  • Files scanned: 1
  • Lines analyzed: 437
  • Findings: 0
  • Total audits: 1

No security issues found
Audited by: claude

Quality Score

  • Architecture: 38
  • Maintainability: 90
  • Content: 87
  • Community: 23
  • Security: 100
  • Spec Compliance: 100

What You Can Build

Data Engineer Building Analytics Platform

Design and optimize ClickHouse tables for high-volume event tracking and user analytics with materialized views for real-time dashboards.

Analyst Performing Cohort Analysis

Execute retention analysis, funnel conversion tracking, and time series queries on large datasets using ClickHouse-specific aggregation functions.

Developer Integrating Real-time Metrics

Implement streaming data ingestion and automated ETL pipelines to sync transactional data from PostgreSQL to ClickHouse for analytical workloads.

Try These Prompts

Beginner: Create a Basic Analytics Table
Help me create a ClickHouse table for storing daily market analytics data with columns for date, market_id, volume, and trade count. Use the appropriate engine and partitioning strategy for time-based queries.
Intermediate: Optimize a Slow Query
My ClickHouse query filtering by market_name and volume is slow on a table with 100M rows. The table is ordered by (date, market_id). Suggest optimizations and explain how to restructure the table or query for better performance.
Advanced: Design a Real-time Dashboard Backend
Design a ClickHouse schema with materialized views to power a real-time trading dashboard showing hourly volume, trade count, and unique traders per market. Include the base table, materialized view definition, and sample queries for the dashboard.
Expert: Implement User Retention Analysis
Create a ClickHouse query to calculate user retention cohorts by signup month, showing active users at day 0, day 1, day 7, and day 30 after signup. Use an events table with user_id and timestamp columns.

Best Practices

  • Partition tables by time (month or day) using DATE or DateTime columns to optimize time-range queries
  • Build the sorting key from the most frequently filtered columns, ordered from lowest to highest cardinality, to maximize primary-index usage
  • Use batch inserts instead of individual row inserts to improve ingestion performance significantly
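As a sketch, the practices above combine into a table definition like the following (table and column names are illustrative, not from the skill itself):

```sql
CREATE TABLE trade_events
(
    created_at DateTime,
    market_id  String,
    user_id    String,
    trade_size UInt64
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(created_at)   -- monthly partitions prune time-range scans
ORDER BY (market_id, created_at);   -- frequent filter column first in the sorting key
```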

Avoid

  • Using SELECT * instead of specifying columns - increases I/O and memory usage unnecessarily
  • Performing small frequent inserts instead of batching - causes excessive merge operations and degrades performance
  • Relying on FINAL clause in queries - forces data merging before query execution, significantly slowing down reads
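As one illustrative alternative to FINAL (the user_profiles table here is hypothetical), the latest row per key can be computed at query time with argMax instead of forcing a merge:

```sql
-- Instead of: SELECT * FROM user_profiles FINAL
SELECT
    user_id,
    argMax(email, updated_at) AS email,       -- value from the most recent row per key
    max(updated_at)           AS updated_at
FROM user_profiles
GROUP BY user_id;
```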

Frequently Asked Questions

What is the difference between MergeTree and ReplacingMergeTree engines?
MergeTree is the general-purpose engine for most use cases. ReplacingMergeTree automatically deduplicates rows with the same primary key during merges, useful when ingesting data from multiple sources that may produce duplicates.
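A minimal sketch of the difference, assuming a hypothetical user_profiles table with an update-timestamp version column:

```sql
-- Rows sharing the same sorting key are deduplicated during background
-- merges; the optional version column keeps the row with the highest value.
CREATE TABLE user_profiles
(
    user_id    String,
    email      String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;
```

Note that deduplication happens asynchronously, so queries may still see duplicates until a merge runs.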
How do materialized views work in ClickHouse?
Materialized views automatically process INSERT operations on source tables and populate target tables with transformed or aggregated data. They enable real-time aggregations without manual ETL jobs.
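A minimal sketch of this pattern, assuming an illustrative events base table:

```sql
-- Base table receiving raw inserts
CREATE TABLE events
(
    timestamp DateTime,
    market_id String,
    volume    UInt64
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (timestamp, market_id);

-- The materialized view intercepts each INSERT into events
-- and writes hourly aggregates into its own target table
CREATE MATERIALIZED VIEW events_hourly_mv
ENGINE = SummingMergeTree
PARTITION BY toYYYYMM(hour)
ORDER BY (hour, market_id)
AS SELECT
    toStartOfHour(timestamp) AS hour,
    market_id,
    sum(volume) AS total_volume,
    count()     AS trades
FROM events
GROUP BY hour, market_id;
```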
What is the optimal batch size for inserting data?
Aim for batches of 10,000 to 100,000 rows or 10-100 MB per insert. Avoid inserting more frequently than once per second per table to prevent excessive part creation.
When should I use AggregatingMergeTree?
Use AggregatingMergeTree when you need to store pre-computed aggregations that can be merged later. It requires AggregateFunction data types and state/merge functions but provides fast querying of aggregated metrics.
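For example, reading from an AggregatingMergeTree table such as the market_stats_hourly example above uses the -Merge combinators to finalize the stored aggregate states:

```sql
SELECT
    hour,
    market_id,
    sumMerge(total_volume)   AS volume,
    countMerge(total_trades) AS trades,
    uniqMerge(unique_users)  AS users
FROM market_stats_hourly
GROUP BY hour, market_id
ORDER BY hour;
```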
How can I monitor slow queries in ClickHouse?
Query the system.query_log table filtering by query_duration_ms and type='QueryFinish'. This shows execution time, rows read, bytes read, and memory usage for completed queries.
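A sketch of such a monitoring query (the 1-second threshold is an arbitrary example):

```sql
SELECT
    query_duration_ms,
    read_rows,
    read_bytes,
    memory_usage,
    query
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query_duration_ms > 1000
  AND event_time >= now() - INTERVAL 1 HOUR
ORDER BY query_duration_ms DESC
LIMIT 10;
```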
Does ClickHouse support transactions?
ClickHouse does not support traditional ACID transactions. It is optimized for analytical workloads with append-heavy operations. Use atomic INSERT operations and design schemas to handle eventual consistency.

Developer Details

File structure

📄 SKILL.md