embedding-strategies
Optimize Embedding Models for Semantic Search
Choosing the right embedding model and chunking strategy is critical for retrieval quality. This skill provides templates and best practices for implementing high-quality vector search pipelines.
Installation
- Download the skill ZIP
- In Claude, go to Settings → Capabilities → Skills → Upload skill
- Toggle the skill on and start using it
Test it
Using "embedding-strategies". Recommend an embedding model for a legal document search system. I need high accuracy and can use API services.
Expected outcome:
- Recommended: text-embedding-3-large (3072 dimensions) or voyage-2 (1024 dimensions)
- text-embedding-3-large: Best accuracy, handles 8191 tokens, ideal for long legal clauses
- voyage-2: 1024 dimensions, 4000-token limit; Voyage also offers domain-specialized variants (e.g. voyage-law-2 for legal text)
- Consider chunking legal documents by section headers to preserve clause context
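The header-based chunking the skill recommends can be sketched in plain Python. This is a minimal illustration, not the skill's implementation; the header regex and section format are assumptions you would adapt to your documents:

```python
import re

def chunk_by_headers(text: str) -> list[dict]:
    """Split a document on section headers, keeping each header with its body.

    Assumes headers look like "1. Definitions" or "Section 2 ..." at the
    start of a line; adjust the pattern for your document format.
    """
    pattern = re.compile(r"^(?:\d+\.|Section\s+\d+)", re.MULTILINE)
    starts = [m.start() for m in pattern.finditer(text)] or [0]
    chunks = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(text)
        body = text[start:end].strip()
        if body:
            chunks.append({"section": body.splitlines()[0], "text": body})
    return chunks
```

Because each chunk starts at a header, a retrieved clause always carries its section title, which preserves the clause context the skill mentions.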
Using "embedding-strategies". How should I chunk my technical documentation for a RAG system?
Expected outcome:
- Strategy: Use semantic chunking by headers combined with recursive character splitting
- Recommended chunk size: 512 tokens with 50 token overlap
- Preserve code examples as complete chunks
- Add context metadata linking chunks to original sections
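The 512-token / 50-token-overlap strategy above amounts to a sliding window over the token stream. Here is a sketch that uses whitespace tokens for illustration; in practice you would swap in a real tokenizer (e.g. tiktoken) so sizes match your embedding model's token counts:

```python
def chunk_with_overlap(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Fixed-size windows that share `overlap` tokens with the previous chunk.

    Whitespace tokenization is a simplification; use the embedding
    model's own tokenizer for accurate token budgets.
    """
    tokens = text.split()
    if not tokens:
        return []
    step = chunk_size - overlap  # advance by size minus overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

The overlap means a sentence cut at a chunk boundary still appears whole in the neighboring chunk, which is what preserves context across boundaries.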
Security Audit
Low risk. All static findings are false positives: C2 keyword alerts were triggered by hex hash strings, weak-crypto alerts by hash substrings, and external-command alerts by ASCII flow diagrams that use arrows; the hardcoded-URL alerts point to legitimate documentation links. No malicious code, command execution, or data exfiltration patterns were found.
What You Can Build
Build RAG Systems
Implement retrieval-augmented generation by selecting appropriate embedding models and chunking strategies for your document corpus.
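The retrieval half of a RAG system reduces to a nearest-neighbor search over chunk embeddings. A minimal sketch, assuming the vectors are already L2-normalized so a dot product equals cosine similarity:

```python
import numpy as np

def top_k(query_vec: np.ndarray, chunk_vecs: np.ndarray, k: int = 3) -> list[int]:
    """Return indices of the k chunks most similar to the query.

    `chunk_vecs` has shape (num_chunks, dim); both inputs are assumed
    L2-normalized, so the matrix-vector product is cosine similarity.
    """
    scores = chunk_vecs @ query_vec
    return np.argsort(scores)[::-1][:k].tolist()
```

The retrieved chunks are then placed in the generation prompt; at scale you would replace the brute-force scan with a vector database's ANN index.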
Optimize Semantic Search
Improve search relevance by choosing embedding models matched to your content type and implementing proper chunking and preprocessing.
Create Embedding Pipelines
Build scalable pipelines that process documents, chunk content, generate embeddings, and prepare records for vector databases.
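The document-to-vector-database flow described above can be skeletonized as follows. The `chunker` and `embed_batch` callables are stand-ins for your chunking strategy and embedding provider, not a specific API:

```python
from typing import Callable, Iterable

def run_pipeline(
    docs: Iterable[dict],
    chunker: Callable[[str], list[str]],
    embed_batch: Callable[[list[str]], list[list[float]]],
    batch_size: int = 64,
) -> list[dict]:
    """Chunk documents, embed chunks in batches, and emit vector-DB records."""
    records, pending = [], []
    # Stage 1: chunk every document, tagging each chunk with its source.
    for doc in docs:
        for i, chunk in enumerate(chunker(doc["text"])):
            pending.append({"id": f"{doc['id']}-{i}", "text": chunk,
                            "metadata": {"source": doc["id"]}})
    # Stage 2: embed in batches to limit API calls and memory use.
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        vectors = embed_batch([r["text"] for r in batch])
        for record, vec in zip(batch, vectors):
            record["embedding"] = vec
            records.append(record)
    return records
```

Each record carries an id, the chunk text, metadata, and the embedding, which is the shape most vector databases expect for upserts.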
Try These Prompts
I need to choose an embedding model for my [use case: code search / multilingual documents / legal contracts]. My priorities are [priority: accuracy / cost / speed]. I have [constraints: limit on dimensions / need open source / need API access]. Recommend 3 models with rationale.
Help me implement chunking for my [data type: technical documentation / conversational data / code]. I need to handle [requirement: preserve context / maintain semantic boundaries / limit chunk size]. Provide Python code for [strategy: token-based / sentence-based / recursive character] chunking.
Create a Python pipeline that [input: processes documents from source / generates embeddings / stores in vector database]. Include [feature: batching / progress tracking / metadata handling]. Use [model: OpenAI embeddings / sentence-transformers].
My embedding-based retrieval has [problem: low recall / inconsistent results / poor precision]. My setup uses [model details]. Analyze potential causes and suggest improvements for [metric: precision at k / recall / ndcg].
Best Practices
- Match embedding model to content type: code, prose, or multilingual
- Normalize embeddings for reliable cosine similarity comparisons
- Use token overlap when chunking to preserve context across boundaries
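The normalization practice above is a one-liner worth spelling out: after L2-normalization, cosine similarity is just a dot product, so every vector comparison in the index behaves consistently. A small sketch:

```python
import numpy as np

def normalize(vectors: np.ndarray) -> np.ndarray:
    """L2-normalize each row so dot products equal cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    # Clip to avoid division by zero for degenerate all-zero vectors.
    return vectors / np.clip(norms, 1e-12, None)
```

Some APIs (e.g. OpenAI's embedding endpoints) already return unit-length vectors, but normalizing defensively costs little and keeps mixed sources comparable.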
Avoid
- Mixing different embedding models in the same index
- Ignoring token limits and truncating content mid-thought
- Skipping preprocessing, allowing noise to degrade embedding quality
Frequently Asked Questions
What embedding model should I start with?
How do I choose chunk size?
Can I use local embedding models?
How do I evaluate my embedding quality?
Should I normalize embeddings?
What preprocessing should I apply?
Developer Details
Author
wshobson
License
MIT
Repository
https://github.com/wshobson/agents/tree/main/plugins/llm-application-dev/skills/embedding-strategies
Ref
main
File structure
📄 SKILL.md