A/B tests often fail due to poor design, premature stopping, and invalid metrics. This skill enforces rigorous methodology with mandatory gates for hypothesis locking, metric definition, and sample size calculation before any test runs.
Download the skill ZIP
Upload it in Claude
Go to Settings → Capabilities → Skills → Upload skill
Turn it on and start using it
Test it
Using "ab-test-setup". Help me set up an A/B test for our checkout page
Expected result:
- Step 1: Hypothesis Lock - Present your final hypothesis including: target audience, primary metric, expected direction of effect, and Minimum Detectable Effect (MDE). Ask: Is this the final hypothesis we are committing to?
- Step 2: Assumptions Check - List assumptions about traffic stability, user independence, metric reliability, and randomization quality.
- Step 3: Test Type Selection - Choose A/B (default), A/B/n, Multivariate, or Split URL based on your change complexity.
- Step 4: Metrics Definition - Define your primary metric (mandatory), secondary metrics for context, and guardrail metrics that must not degrade.
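The fields collected in Steps 1 to 4 can be sketched as an immutable record, so that the "lock" is enforced by the data structure itself. This is a minimal illustration, not the skill's own implementation: the class name `ExperimentPlan` and all field names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical record of a locked test plan; names are illustrative only.
ALLOWED_TEST_TYPES = ("A/B", "A/B/n", "Multivariate", "Split URL")


@dataclass(frozen=True)  # frozen=True: fields cannot be reassigned after creation
class ExperimentPlan:
    audience: str
    primary_metric: str              # exactly one primary metric (mandatory)
    expected_direction: str          # "increase" or "decrease"
    mde_relative: float              # e.g. 0.10 for a 10% relative lift
    test_type: str = "A/B"           # A/B is the default test type
    secondary_metrics: tuple = ()    # context only
    guardrail_metrics: tuple = ()    # must not degrade

    def __post_init__(self):
        if not self.audience or not self.primary_metric:
            raise ValueError("audience and primary_metric are required")
        if self.expected_direction not in ("increase", "decrease"):
            raise ValueError("expected_direction must be 'increase' or 'decrease'")
        if self.mde_relative <= 0:
            raise ValueError("mde_relative must be positive")
        if self.test_type not in ALLOWED_TEST_TYPES:
            raise ValueError(f"test_type must be one of {ALLOWED_TEST_TYPES}")
```

Once created, any attempt to reassign a field (for example `plan.primary_metric = "other"`) raises an error, which mirrors the idea of committing to a final hypothesis before the test runs.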
Using "ab-test-setup". Is my hypothesis valid?
Expected result:
- Valid hypothesis checklist:
- ✓ Observation or evidence - Do you have data supporting this?
- ✓ Single, specific change - Is the change clearly defined?
- ✓ Directional expectation - Do you expect increase or decrease?
- ✓ Defined audience - Who is being tested?
- ✓ Measurable success criteria - What defines success?
Security audit
Safe. All 12 static findings are false positives. The scanner detected benign A/B testing terminology (hypothesis, design, metrics, valid, peeking) and misinterpreted it as cryptographic or network security issues. This skill is a legitimate methodology guide for setting up statistically rigorous A/B tests. No actual security risks were identified.
What you can build
Product Manager Validates Test Design
A product manager uses the skill to structure a new feature test, ensuring the hypothesis is specific and the metrics are defined before engineering begins.
Data Scientist Ensures Statistical Rigor
A data scientist applies the methodology to review a proposed experiment, checking sample size calculations and guardrail metrics.
Growth Engineer Plans Conversion Test
A growth engineer uses the skill to structure a landing page optimization test, locking hypothesis and calculating required traffic before launch.
Try these prompts
Help me set up an A/B test. I have a user problem: [describe problem]. I want to test: [describe proposed change]. Guide me through the mandatory setup steps.
Review my hypothesis for an A/B test: [paste hypothesis]. Does it meet the quality checklist? What is missing or needs improvement?
Help me calculate sample size. My current conversion rate is [X]%. I want to detect a [Y]% relative lift. Confidence level 95% (significance level 5%), power 80%. What sample size do I need?
Run an execution readiness check for my A/B test. I have: hypothesis [paste], primary metric [name], sample size [number], duration [days]. What gates am I missing?
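For reference, the sample size prompt above can be approximated by hand. The sketch below assumes a two-sided, two-proportion z-test with equal-sized variants, which is one common choice and not necessarily the method the skill uses. The function names `sample_size_per_variant` and `days_needed` are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate users needed per variant (two-sided two-proportion z-test)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)  # rate implied by the relative lift
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must be distinct and lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def days_needed(n_per_variant, daily_visitors, variants=2):
    """Days of traffic required, assuming all visitors enter the test."""
    return math.ceil(n_per_variant * variants / daily_visitors)


if __name__ == "__main__":
    n = sample_size_per_variant(baseline_rate=0.05, relative_lift=0.10)
    print(n, days_needed(n, daily_visitors=5000))
```

Under these assumptions, a 5% baseline with a 10% relative lift needs roughly 31,000 users per variant. Halving the detectable lift roughly quadruples that figure, which is why the MDE is worth fixing before the traffic estimate.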
Best practices
- Lock your hypothesis and primary metric BEFORE any implementation work begins
- Calculate sample size upfront and ensure you have enough traffic for the test duration
- Use guardrail metrics to prevent harmful wins that damage user experience
Avoid
- Starting a test without a frozen hypothesis - this leads to moving goalposts
- Peeking at results early and stopping tests based on initial significance
- Defining multiple primary metrics - this increases false positive risk
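The last two points can be made concrete with a small sketch. It assumes independent metrics for the family-wise error calculation and simulates A/A tests (no true difference) that are checked with a pooled two-proportion z-test at several interim looks. The function names and default parameters are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist


def family_wise_error_rate(num_metrics, alpha=0.05):
    """Chance of at least one false positive across independent metrics."""
    return 1 - (1 - alpha) ** num_metrics


def simulate_peeking(num_tests=1000, looks=5, users_per_look=200,
                     rate=0.10, alpha=0.05, seed=0):
    """Simulate A/A tests; return (false positive rate when stopping at the
    first significant look, false positive rate when testing only at the end)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits_any_look = 0
    hits_final_look = 0
    for _ in range(num_tests):
        conv_a = conv_b = n = 0
        any_significant = False
        significant = False
        for _ in range(looks):
            # Both arms share the same true rate, so any "win" is a false positive.
            conv_a += sum(rng.random() < rate for _ in range(users_per_look))
            conv_b += sum(rng.random() < rate for _ in range(users_per_look))
            n += users_per_look
            pooled = (conv_a + conv_b) / (2 * n)
            se = math.sqrt(2 * pooled * (1 - pooled) / n)
            significant = se > 0 and abs(conv_a - conv_b) / n / se > z_crit
            any_significant = any_significant or significant
        hits_any_look += any_significant
        hits_final_look += significant
    return hits_any_look / num_tests, hits_final_look / num_tests


if __name__ == "__main__":
    print(family_wise_error_rate(3))
    print(simulate_peeking())
```

With three independent primary metrics at a 5% significance level, the chance of at least one false positive is about 14%. In the simulation, the stop-at-first-significance rate can never be lower than the end-only rate, because the final look is one of the looks being checked.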